-
Notifications
You must be signed in to change notification settings - Fork 12
test(template): add integration tests for watch reconnection after backend failures #519
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…ckend failures Adds 6 new integration tests to verify watch processor behavior during backend failures and recovery: - TestWatchProcessor_Integration_FileReconnection: file deletion/recreation - TestWatchProcessor_Integration_MultipleFileFailures: consecutive failures - TestWatchProcessor_Integration_ErrorChannelOnFailure: error reporting - TestWatchProcessor_Integration_ContinuesAfterTransientError: permission errors - TestBatchProcessor_Integration_FileReconnection: batch processor recovery - TestWatchProcessor_Integration_DirectoryRecreation: directory deletion Also adds createConfigWithBackoff() helper for faster test retries. All tests pass with -race flag enabled. Closes #421
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR adds comprehensive integration tests to verify the watch processor's reconnection behavior after backend failures, addressing issue #421. The tests validate that the processor correctly detects, handles, and recovers from various failure scenarios including file deletions, directory removals, and permission errors.
Changes:
- Added 6 new integration tests covering file/directory deletion and recreation scenarios
- Introduced
createConfigWithBackoffhelper for testing with custom retry intervals - Tests validate both
WatchProcessorandBatchWatchProcessorreconnection behavior
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| value: initial | ||
| `) | ||
|
|
||
| // Create processor |
Copilot
AI
Jan 26, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The context timeout of 20 seconds is inconsistent with other similar tests (e.g., line 815 uses 30 seconds, line 896 uses 15 seconds). Consider standardizing timeout durations across tests or documenting why different values are needed for specific scenarios.
| // Create processor | |
| // Create processor | |
| // Use a 20s timeout here to allow sufficient time for file deletion, backend retry, | |
| // and reconnection cycles specific to this test scenario. |
| } | ||
|
|
||
| // Delete file and wait for errors | ||
| time.Sleep(200 * time.Millisecond) |
Copilot
AI
Jan 26, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hard-coded sleep durations (200ms, 300ms, 500ms) appear throughout these tests without explanation. Consider defining named constants for these timing values to clarify their purpose (e.g., watchEstablishmentDelay, watchDetectionDelay) and make them easier to adjust if needed.
| config := env.createConfig(ctx) | ||
| stopChan := make(chan bool) | ||
| doneChan := make(chan bool) | ||
| errChan := make(chan error, 100) |
Copilot
AI
Jan 26, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The buffer size of 100 for errChan appears arbitrary and is repeated across all tests. Consider defining this as a constant (e.g., testErrorChannelBuffer) to centralize this value and document why this specific buffer size is needed.
| // Wait for recovery | ||
| _, err := waitForFileUpdate(t, destPath, 8*time.Second, "value=initial") | ||
| if err != nil { | ||
| t.Logf("Note: File may not have updated if deletion was too brief: %v", err) | ||
| } | ||
|
|
||
| // Check if any errors were reported (may or may not have errors depending on timing) | ||
| // The key assertion is that the processor didn't crash |
Copilot
AI
Jan 26, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test acknowledges that file updates may not occur due to timing, but still allows the test to pass. This creates uncertainty about what the test actually validates. Consider either ensuring the timing is sufficient for the update to occur, or clearly documenting that this test only verifies the processor doesn't crash (not that it recovers).
| // Wait for recovery | |
| _, err := waitForFileUpdate(t, destPath, 8*time.Second, "value=initial") | |
| if err != nil { | |
| t.Logf("Note: File may not have updated if deletion was too brief: %v", err) | |
| } | |
| // Check if any errors were reported (may or may not have errors depending on timing) | |
| // The key assertion is that the processor didn't crash | |
| // Wait for recovery: the output file must be updated from its initial value. | |
| updatedContent, err := waitForFileUpdate(t, destPath, 15*time.Second, "value=initial") | |
| if err != nil { | |
| t.Fatalf("Expected file to be updated after transient error, but it was not: %v", err) | |
| } | |
| if !strings.Contains(updatedContent, "value=recovered") { | |
| t.Fatalf("Expected recovered value in output after transient error, got: %q", updatedContent) | |
| } | |
| // Check if any errors were reported (may or may not have errors depending on timing) | |
| // The key assertion is that the processor didn't crash and recovered after the transient error. |
| // Wait for processor to process the change | ||
| content, err := waitForFileUpdate(t, destPath, 15*time.Second, "value=initial") | ||
| if err != nil { | ||
| t.Fatalf("Output not updated after directory recreation: %v", err) | ||
| } | ||
|
|
||
| // Accept either placeholder or final value - both indicate successful recovery | ||
| if content != "value=dir-recreated" && content != "value=placeholder" { | ||
| t.Errorf("Expected 'value=dir-recreated' or 'value=placeholder', got %q", content) | ||
| } | ||
| t.Logf("Processor recovered with content: %s", content) |
Copilot
AI
Jan 26, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Accepting either 'value=dir-recreated' or 'value=placeholder' makes the test assertion less strict and introduces ambiguity about the expected behavior. If both values are valid outcomes, the test should document why both represent successful recovery. Otherwise, consider making the timing more deterministic to ensure the final value is reached.
| // Wait for processor to process the change | |
| content, err := waitForFileUpdate(t, destPath, 15*time.Second, "value=initial") | |
| if err != nil { | |
| t.Fatalf("Output not updated after directory recreation: %v", err) | |
| } | |
| // Accept either placeholder or final value - both indicate successful recovery | |
| if content != "value=dir-recreated" && content != "value=placeholder" { | |
| t.Errorf("Expected 'value=dir-recreated' or 'value=placeholder', got %q", content) | |
| } | |
| t.Logf("Processor recovered with content: %s", content) | |
| // Wait for processor to process the change and produce the final value | |
| content, err := waitForFile(t, destPath, 15*time.Second, "value=dir-recreated") | |
| if err != nil { | |
| t.Fatalf("Output not updated to final value after directory recreation: %v", err) | |
| } | |
| if content != "value=dir-recreated" { | |
| t.Errorf("Expected 'value=dir-recreated', got %q", content) | |
| } | |
| t.Logf("Processor recovered with final content: %s", content) |
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #519 +/- ##
==========================================
+ Coverage 62.36% 62.45% +0.09%
==========================================
Files 48 48
Lines 5420 5420
==========================================
+ Hits 3380 3385 +5
+ Misses 1831 1828 -3
+ Partials 209 207 -2 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
- Add named constants for timing values (watchEstablishmentDelay, watchDetectionDelay, watchRetryDelay, testShutdownTimeout) - Add testErrorChannelBuffer constant (100) with documentation - Add timeout documentation comments explaining why each test uses specific timeout values - Make ErrorChannelOnFailure test assertions more deterministic - Make DirectoryRecreation test use strict assertion for final value - Replace all magic numbers with named constants
Summary
-raceflag enabledNew Tests
TestWatchProcessor_Integration_FileReconnectionTestWatchProcessor_Integration_MultipleFileFailuresTestWatchProcessor_Integration_ErrorChannelOnFailureTestWatchProcessor_Integration_ContinuesAfterTransientErrorTestBatchProcessor_Integration_FileReconnectionTestWatchProcessor_Integration_DirectoryRecreationTest plan
go test -race ./pkg/template/Closes #421
🤖 Generated with Claude Code