Skip to content

Conversation

@abtreece
Copy link
Owner

Summary

New Tests

Test Description
TestWatchProcessor_Integration_FileReconnection Verifies recovery after watched file is deleted and recreated
TestWatchProcessor_Integration_MultipleFileFailures Verifies recovery through 3 consecutive delete/recreate cycles
TestWatchProcessor_Integration_ErrorChannelOnFailure Verifies errors are reported to errChan without crashing
TestWatchProcessor_Integration_ContinuesAfterTransientError Verifies recovery after transient permission errors
TestBatchProcessor_Integration_FileReconnection Verifies BatchWatchProcessor also recovers correctly
TestWatchProcessor_Integration_DirectoryRecreation Verifies recovery when entire watched directory is deleted

Test plan

Closes #421

🤖 Generated with Claude Code

…ckend failures

Adds 6 new integration tests to verify watch processor behavior during
backend failures and recovery:

- TestWatchProcessor_Integration_FileReconnection: file deletion/recreation
- TestWatchProcessor_Integration_MultipleFileFailures: consecutive failures
- TestWatchProcessor_Integration_ErrorChannelOnFailure: error reporting
- TestWatchProcessor_Integration_ContinuesAfterTransientError: permission errors
- TestBatchProcessor_Integration_FileReconnection: batch processor recovery
- TestWatchProcessor_Integration_DirectoryRecreation: directory deletion

Also adds createConfigWithBackoff() helper for faster test retries.

All tests pass with -race flag enabled.

Closes #421
Copilot AI review requested due to automatic review settings January 26, 2026 02:51
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds comprehensive integration tests to verify the watch processor's reconnection behavior after backend failures, addressing issue #421. The tests validate that the processor correctly detects, handles, and recovers from various failure scenarios including file deletions, directory removals, and permission errors.

Changes:

  • Added 6 new integration tests covering file/directory deletion and recreation scenarios
  • Introduced createConfigWithBackoff helper for testing with custom retry intervals
  • Tests validate both WatchProcessor and BatchWatchProcessor reconnection behavior

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

value: initial
`)

// Create processor
Copy link

Copilot AI Jan 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The context timeout of 20 seconds is inconsistent with other similar tests (e.g., line 815 uses 30 seconds, line 896 uses 15 seconds). Consider standardizing timeout durations across tests or documenting why different values are needed for specific scenarios.

Suggested change
// Create processor
// Create processor
// Use a 20s timeout here to allow sufficient time for file deletion, backend retry,
// and reconnection cycles specific to this test scenario.

Copilot uses AI. Check for mistakes.
}

// Delete file and wait for errors
time.Sleep(200 * time.Millisecond)
Copy link

Copilot AI Jan 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hard-coded sleep durations (200ms, 300ms, 500ms) appear throughout these tests without explanation. Consider defining named constants for these timing values to clarify their purpose (e.g., watchEstablishmentDelay, watchDetectionDelay) and make them easier to adjust if needed.

Copilot uses AI. Check for mistakes.
config := env.createConfig(ctx)
stopChan := make(chan bool)
doneChan := make(chan bool)
errChan := make(chan error, 100)
Copy link

Copilot AI Jan 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The buffer size of 100 for errChan appears arbitrary and is repeated across all tests. Consider defining this as a constant (e.g., testErrorChannelBuffer) to centralize this value and document why this specific buffer size is needed.

Copilot uses AI. Check for mistakes.
Comment on lines 925 to 932
// Wait for recovery
_, err := waitForFileUpdate(t, destPath, 8*time.Second, "value=initial")
if err != nil {
t.Logf("Note: File may not have updated if deletion was too brief: %v", err)
}

// Check if any errors were reported (may or may not have errors depending on timing)
// The key assertion is that the processor didn't crash
Copy link

Copilot AI Jan 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test acknowledges that file updates may not occur due to timing, but still allows the test to pass. This creates uncertainty about what the test actually validates. Consider either ensuring the timing is sufficient for the update to occur, or clearly documenting that this test only verifies the processor doesn't crash (not that it recovers).

Suggested change
// Wait for recovery
_, err := waitForFileUpdate(t, destPath, 8*time.Second, "value=initial")
if err != nil {
t.Logf("Note: File may not have updated if deletion was too brief: %v", err)
}
// Check if any errors were reported (may or may not have errors depending on timing)
// The key assertion is that the processor didn't crash
// Wait for recovery: the output file must be updated from its initial value.
updatedContent, err := waitForFileUpdate(t, destPath, 15*time.Second, "value=initial")
if err != nil {
t.Fatalf("Expected file to be updated after transient error, but it was not: %v", err)
}
if !strings.Contains(updatedContent, "value=recovered") {
t.Fatalf("Expected recovered value in output after transient error, got: %q", updatedContent)
}
// Check if any errors were reported (may or may not have errors depending on timing)
// The key assertion is that the processor didn't crash and recovered after the transient error.

Copilot uses AI. Check for mistakes.
Comment on lines 1169 to 1179
// Wait for processor to process the change
content, err := waitForFileUpdate(t, destPath, 15*time.Second, "value=initial")
if err != nil {
t.Fatalf("Output not updated after directory recreation: %v", err)
}

// Accept either placeholder or final value - both indicate successful recovery
if content != "value=dir-recreated" && content != "value=placeholder" {
t.Errorf("Expected 'value=dir-recreated' or 'value=placeholder', got %q", content)
}
t.Logf("Processor recovered with content: %s", content)
Copy link

Copilot AI Jan 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Accepting either 'value=dir-recreated' or 'value=placeholder' makes the test assertion less strict and introduces ambiguity about the expected behavior. If both values are valid outcomes, the test should document why both represent successful recovery. Otherwise, consider making the timing more deterministic to ensure the final value is reached.

Suggested change
// Wait for processor to process the change
content, err := waitForFileUpdate(t, destPath, 15*time.Second, "value=initial")
if err != nil {
t.Fatalf("Output not updated after directory recreation: %v", err)
}
// Accept either placeholder or final value - both indicate successful recovery
if content != "value=dir-recreated" && content != "value=placeholder" {
t.Errorf("Expected 'value=dir-recreated' or 'value=placeholder', got %q", content)
}
t.Logf("Processor recovered with content: %s", content)
// Wait for processor to process the change and produce the final value
content, err := waitForFile(t, destPath, 15*time.Second, "value=dir-recreated")
if err != nil {
t.Fatalf("Output not updated to final value after directory recreation: %v", err)
}
if content != "value=dir-recreated" {
t.Errorf("Expected 'value=dir-recreated', got %q", content)
}
t.Logf("Processor recovered with final content: %s", content)

Copilot uses AI. Check for mistakes.
@codecov
Copy link

codecov bot commented Jan 26, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 62.45%. Comparing base (021477d) to head (cc4a735).
⚠️ Report is 1 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #519      +/-   ##
==========================================
+ Coverage   62.36%   62.45%   +0.09%     
==========================================
  Files          48       48              
  Lines        5420     5420              
==========================================
+ Hits         3380     3385       +5     
+ Misses       1831     1828       -3     
+ Partials      209      207       -2     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

- Add named constants for timing values (watchEstablishmentDelay,
  watchDetectionDelay, watchRetryDelay, testShutdownTimeout)
- Add testErrorChannelBuffer constant (100) with documentation
- Add timeout documentation comments explaining why each test uses
  specific timeout values
- Make ErrorChannelOnFailure test assertions more deterministic
- Make DirectoryRecreation test use strict assertion for final value
- Replace all magic numbers with named constants
@abtreece abtreece merged commit d10a1dd into main Jan 26, 2026
14 checks passed
@abtreece abtreece deleted the add-watch-reconnection-tests branch January 26, 2026 03:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add integration tests for watch reconnection after backend failures

1 participant