-
Notifications
You must be signed in to change notification settings - Fork 0
Fix: Dashboard timeout handling and critical architecture improvements #11
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
The dashboard polls regularly, so long timeouts are unnecessary and cause DBConnection errors when SQLite is busy. This change: - Sets dashboard queries to 2s timeout instead of default 15s - Catches and silently ignores timeouts in dashboard updates - Updates VideoQueries functions to accept optional timeout parameter - Inlines count queries in dashboard to apply timeout directly Since the dashboard polls every few seconds, skipping an update when the database is busy is acceptable - the next poll will retry.
Wrap all dashboard queries with rescue/catch to handle timeouts gracefully: - safe_query/1 returns 0 on timeout for count queries - safe_query_list/1 returns [] on timeout for list queries - Prevents error logging when SQLite is busy during sync operations Dashboard will show default values during timeouts and update on next poll.
…protection The encoder was running multiple concurrent processes because Broadway was doing the encoding work directly instead of delegating to the Encode GenServer. This caused duplicate encodings. Changes: - Simplified Encoder.Broadway to match CRF searcher pattern - Delegate all encoding work to AbAv1.encode/1 (Encode GenServer) - Remove direct port handling from Broadway (now in GenServer) - Encode GenServer's available?/0 check ensures only one encode at a time - Remove empty lock.ex file This matches the CRF searcher architecture which works correctly.
Build Erlang 28.1.1 from source to get the latest bugfixes that may address segfault issues seen with earlier OTP 28 builds.
- Prevents DB connection owner exit errors during test cleanup - Async tasks in tests can outlive the LiveView process causing Sandbox issues - Added ReencodarrWeb.TaskSupervisor for production async queries - Removed unnecessary try/catch and Process.alive? checks - Tests now run cleanly without connection warnings - Fixed encoder test helper to use fully qualified module name - Skip MP4 files temporarily due to compatibility issues
- Created SVG favicon with Sonarr/Radarr family aesthetic - Purple gradient background with radar-style concentric circles - Scanning arc sweep for dynamic feel - AV1 triangle and OPUS wave symbols integrated - Gold accent color matching Radarr theme - Hooked up to root layout template
- Purple gradient background (from-purple-900 to-purple-800) - Glassmorphism cards with backdrop blur and white/10 transparency - Updated all text colors to white/purple shades for contrast - Progress bars with gradient fills and glow effects - Larger icon in header (16x16) with drop shadow - Changed encoder color from green to amber (matching favicon gold) - Updated sync service component styling - All buttons now have shadow-lg for depth
Add comprehensive test suite for SharedQueries: - case_insensitive_like/2: Database-agnostic LIKE queries - videos_not_matching_exclude_patterns/1: Pattern filtering (small/large lists) - videos_with_no_chosen_vmafs_query/0: VMAF cleanup queries - aggregated_stats_query/0: Statistics calculations for dashboard Coverage: 64.29% → 100% (+35.71%) Tests: 4 → 20 tests
Use underscores for large number: 12345 → 12_345
- Fix delete_videos_with_path test to use wildcard pattern matching - Fix delete_unchosen_vmafs test to correctly test videos without chosen VMAFs - Fix list_chosen_vmafs test to set videos to :crf_searched state - Fix mark_vmaf_as_chosen test to handle transaction return type - All 74 Media tests now passing in async mode
…tions
- Add tests for queue operations: get_videos_for_crf_search, get_videos_needing_analysis, list_videos_by_estimated_percent, get_next_for_encoding_by_time, debug_encoding_queue_by_library
- Add tests for library operations: create_library with valid/invalid attrs
- Add tests for vmaf operations: create_vmaf with valid/invalid attrs
- Add tests for bulk operations: reset_all_failures
- Add test for test helpers: test_insert_path
- Fix bug in Media.add_existing_video_messages to handle {:error, :not_found} tuple from get_video_by_path
- Coverage improved from 40.60% to 58.60% (+18%)
…operations - Add tests for update operations: update_video, update_library, update_vmaf - Add tests for upsert operations: upsert_vmaf (create/update), batch_upsert_videos - Add tests for reanalysis operations: force_reanalyze_video, debug_force_analyze_video, reset_failed_videos, reset_all_videos_for_reanalysis, reset_all_videos_to_needs_analysis - Add tests for invalid audio operations: count_videos_with_invalid_audio_args, reset_videos_with_invalid_audio_args - Add tests for deletion operations: delete_video, delete_library, delete_vmaf, delete_videos_with_nonexistent_paths - Add tests for changeset operations: change_video, change_library, change_vmaf - Add tests for video failure operations: get_video_failures, resolve_video_failures - Skip test for reset_videos_with_invalid_audio_metadata (PostgreSQL-specific array_length) - Fix credo warnings: replace length/1 with Enum.empty?/1 - Coverage improved from 58.60% to 70.18% (+11.58%) - Total: 108 tests (23 new tests added)
- Replace PostgreSQL-specific array_length(?, 1) with SQLite json_array_length(?) - Fix reset_videos_with_invalid_audio_metadata/0 to work with SQLite JSON arrays - Unskip previously failing test - now passes with SQLite-compatible query - This was a critical bug preventing the function from working on SQLite databases
…s, and diagnostics - Added 19 new tests covering previously untested functions - Coverage improved from 61.26% to 67.80% (+6.54%) - Tests for list_videos_awaiting_crf_search, get_next_for_encoding - Tests for delete_vmafs_for_video, mark_vmaf_as_chosen with string CRF - Tests for upsert_vmaf savings calculations and state transitions - Tests for bulk operations: reset_all_videos_for_reanalysis, reset_videos_for_reanalysis_batched - Tests for reset_all_videos_to_needs_analysis, debug_force_analyze_video - Tests for test_insert_path diagnostic function with various scenarios - All 127 tests passing with comprehensive edge case coverage
- Added 3 tests for upsert_video/1: create, update, and error cases - Tests verify upsert behavior on path conflict - Coverage remains at 67.80% (130 tests total) - Delegates like mark_as_reencoded already covered by VideoStateMachine tests
- Added 11 new tests for upsert_vmaf edge cases and savings calculations - Tests for zero size videos, invalid percent values, percent over 100 - Tests for pre-existing savings, VMAF upsert conflicts - Added edge case tests for bulk operations (failed videos, bitrate resets) - Tests for debug_force_analyze_video field resets - Tests for multiple video failures and failure resolution - Tests for upsert_video timestamp preservation - Coverage improved from 67.80% to 68.77% (+0.97%) - Total: 141 tests passing
- Added error handling test for force_reanalyze_video with non-existent video - Added test for delete_unchosen_vmafs empty case - Added test for get_video_by_service_id type handling - Coverage improved from 68.77% to 69.25% (+0.48%) - Total: 144 tests passing
- Added 12 new tests for helper functions and transaction paths - Tests for delete_videos_with_path wildcard patterns and empty matches - Tests for reset_videos_with_invalid_audio_args edge cases - Tests for reset_all_failures with no failures and multiple failures - Tests for get_next_for_encoding_by_time sorting and empty cases - Tests for list_videos_awaiting_crf_search filtering logic - Tests for mark_vmaf_as_chosen with non-existent CRF - Tests for get_videos_in_library filtering - Tests for count_videos accuracy - Coverage improved from 69.25% to 69.73% (+0.48%) - Total: 156 tests passing
…nsaction handling
…y deduplication - Remove expensive JSON fragment codec checks from CRF search queries - Simplify queries to use indexed state column only (v.state == :analyzed) - Add mark_as_crf_searching() and mark_as_encoding() to persist state transitions - Call state transitions immediately when CRF search/encode starts - Prevents Broadway infinite loops by marking videos before port opens - Dashboard queries now timeout-safe and much faster - Codec filtering happens at Rules.build_args stage instead of query time
- Created Reencodarr.Core.Retry module for centralized retry logic - Exponential backoff: 100ms → 200ms → 400ms → 800ms → 1600ms - Added jitter (up to 50% of base delay) to prevent thundering herd - Applied retry logic to VideoUpsert.batch_upsert for sync operations - Applied retry logic to VideoStateMachine.mark_as_analyzed - Improved analyzer error reporting to include failure reasons
- Use centralized Retry module instead of custom requeue mechanism - Exponential backoff with jitter for database busy errors - Skips line after exhausting retries instead of blocking the pipeline - Removes warning spam in logs during high-concurrency sync operations
- Added catch for nested DBConnection.ConnectionError in exit tuples - Dashboard now gracefully handles client checkout timeouts during sync - Returns default values (0 or []) instead of crashing when DB is under heavy load - Prevents error logs when database connection pool is saturated
- Only log throughput metrics when there's recent activity (avg > 0) - Prevents spam of 'consecutive improvements' when no batches are being processed - Performance monitor continues to run but stays quiet when idle - Refactored nested conditions to satisfy credo depth requirements
…yzed - Check for bitrate, width, height before transition to analyzed state - Prevents validation errors when videos have existing mediainfo in DB but missing required fields - Skips transition and logs when fields are missing instead of failing repeatedly - Videos with missing fields will be reprocessed through mediainfo pipeline - Refactored to reduce nesting for credo compliance
- Check has_opus_in_filename? in determine_video_transition_decision - Was missing from the decision tree causing Opus files to be marked as analyzed - Add debug logging to track transition decisions
- transition_video_to_analyzed → decide_video_processing_path
Makes it clear this function decides between :analyzed AND :encoded states
- determine_video_transition_decision → check_encoding_requirements
Pure function that returns {:skip_encoding, reason} or {:needs_encoding, reason}
- persist_analyzed_state/persist_encoded_state → apply_processing_decision
Single private function handling DB persistence based on decision
- Updated all tests to use new function names and correct assertions
This eliminates the confusing naming where multiple functions with similar
names (transition_video_to_analyzed, persist_analyzed_state,
Media.mark_as_analyzed, VideoStateMachine.mark_as_analyzed) did completely
different things, causing maintenance nightmares.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR addresses critical architecture improvements focusing on database concurrency, timeout handling, and code clarity. The changes include introducing centralized retry logic with exponential backoff for SQLite "database busy" errors, refactoring the encoder Broadway pipeline to delegate to the Encode GenServer for proper single-worker protection, improving dashboard query resilience with shorter timeouts and safe query wrappers, and comprehensive function renaming in the analyzer for better code maintainability.
Key Changes:
- Centralized retry module with exponential backoff for database busy errors
- Dashboard query timeout reduced from 15s to 2s with safe error handling
- Encoder Broadway delegation to Encode GenServer for single-worker enforcement
- Function renaming in analyzer for code clarity (
determine_video_transition_decision→check_encoding_requirements)
Reviewed changes
Copilot reviewed 27 out of 33 changed files in this pull request and generated no comments.
Show a summary per file
| File | Description |
|---|---|
| test/reencodarr_web/live/dashboard_v2_live_test.exs | Updated test assertions from "Video Processing Dashboard" to "Processing Pipeline" |
| test/reencodarr_web/controllers/page_controller_test.exs | Updated test assertion for dashboard content |
| test/reencodarr/rules_test.exs | Added 400+ lines of comprehensive edge case tests for Rules module |
| test/reencodarr/media_test.exs | Added 2100+ lines of comprehensive test coverage for Media module functions |
| test/reencodarr/media/video_upsert_test.exs | Added edge case tests for video upsert operations |
| test/reencodarr/media/video_state_machine_test.exs | Added validation and transition function tests |
| test/reencodarr/media/exclude_patterns_test.exs | Added tests for query functions and pattern matching |
| test/reencodarr/analyzer/broadway/codec_detection_test.exs | Updated test names and assertions for refactored functions |
| test/reencodarr/ab_av1/encode_test.exs | New empty test file placeholder |
| priv/static/images/* | Added new favicon SVG files and compressed versions |
| mix.lock | Updated dependencies (ecto, gettext, phoenix_ecto, req) |
| lib/reencodarr_web/live/dashboard_live.ex | Added timeout handling, async queries, safe query wrappers |
| lib/reencodarr_web/components/layouts/root.html.heex | Added favicon link |
| lib/reencodarr/media/video_upsert.ex | Integrated retry logic for batch operations |
| lib/reencodarr/media/video_state_machine.ex | Added retry logic and new state transition functions |
| lib/reencodarr/media/video_queries.ex | Added timeout options and simplified codec queries |
| lib/reencodarr/media.ex | Updated for SQLite JSON array length and new state functions |
| lib/reencodarr/encoder/broadway.ex | Refactored to delegate encoding to GenServer |
| lib/reencodarr/core/retry.ex | New centralized retry module with exponential backoff |
| lib/reencodarr/application.ex | Added ReencodarrWeb.TaskSupervisor |
| lib/reencodarr/analyzer/broadway.ex | Renamed functions and added mediainfo field validation |
| lib/reencodarr/analyzer/processing/pipeline.ex | Enhanced error message format with reasons |
| lib/reencodarr/analyzer/core/file_operations.ex | Added MP4 file skip logic |
| lib/reencodarr/analyzer/broadway/performance_monitor.ex | Fixed idle spam by only logging when throughput > 0 |
| lib/reencodarr/ab_av1/encode.ex | Mark video as encoding before starting port |
| lib/reencodarr/ab_av1/crf_search.ex | Integrated retry logic for VMAF upserts |
| lib/reencodarr/ab_av1.ex | Added MP4 file skip logic |
| config/dev.exs | Added timeout and queue configuration |
| flake.nix | Updated to build Erlang 28.1.1 from source |
| flake.lock | Updated nixpkgs revision |
Fix: Dashboard timeout handling and critical architecture improvements
Summary
This PR addresses dashboard query timeout issues and implements several critical improvements to database concurrency, code clarity, and system reliability.
Problem Statement
Changes
1. Dashboard Query Timeout Improvements
Files:
lib/reencodarr_web/live/dashboard_live.ex2. Encoder Single-Worker Protection
Files:
lib/reencodarr/encoder/broadway.ex3. Database Concurrency & Retry Logic
Reencodarr.Core.Retry)base = (2^attempt * 100ms); jitter = rand(base/2); backoff = base + jitterFiles:
lib/reencodarr/core/retry.ex(NEW)lib/reencodarr/media/video_upsert.exlib/reencodarr/media/video_state_machine.exlib/reencodarr/ab_av1/crf_search.ex4. Performance Monitor Idle Spam Fix
Files:
lib/reencodarr/analyzer/broadway/performance_monitor.ex5. Mediainfo Field Validation
bitrate: {"can't be blank"}persist_analyzed_stateFiles:
lib/reencodarr/analyzer/broadway.ex6. Filename-Based Codec Detection
has_opus_in_filename?check to decision logicFiles:
lib/reencodarr/analyzer/broadway.ex7. Function Naming Refactor (Major Code Quality)
Before (confusing):
transition_video_to_analyzed- actually decides :analyzed OR :encodeddetermine_video_transition_decision- returns decision tuplepersist_analyzed_state- DB persistencepersist_encoded_state- DB persistenceMedia.mark_as_analyzed- context wrapperVideoStateMachine.mark_as_analyzed- state machine logicAfter (clear):
decide_video_processing_path- decides which state pathcheck_encoding_requirements- pure logic, returns{:skip_encoding, reason}or{:needs_encoding, reason}apply_processing_decision- single DB persistence functionImpact: Eliminates maintenance nightmare of 4+ similarly-named functions doing completely different things.
Files:
lib/reencodarr/analyzer/broadway.extest/reencodarr/analyzer/broadway/codec_detection_test.exsTest Coverage
Performance Impact
Breaking Changes
None - all changes are internal refactoring or improvements.
Migration Notes
No database migrations required. Retry logic is transparent to existing code.
Related Issues
Fixes issues with: