Multiple changes to fix errors and responsiveness, sanitize log messages #122

clcollins · 2025-06-09T22:27:11Z

Signed-off-by: Chris Collins collins.christopher@gmail.com

…y.Incident or a pointer to one is printed to the logs. Signed-off-by: Chris Collins <collins.christopher@gmail.com>

Simplifies the incident loading flow by: - Fetching incident details, alerts, and notes in parallel immediately when incident is selected - Removing wait conditions and sequential loading patterns - Making input handler available in all focus modes - Setting immediate loading state for better UX feedback - Adding direct action handlers in incident view This improves perceived performance and makes the UI more responsive when viewing incident details. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>

Addresses issues from previous commit: - Remove unused message types (getIncidentAlertsMsg, getIncidentNotesMsg) - Delete commented-out legacy code - Remove incomplete getIncidentAlertsMsg handler that was unreachable - Add re-render trigger when incident data arrives during viewing - Clear placeholder incident on error to prevent stale state These fixes improve code clarity, eliminate dead code, and prevent potential race conditions with partial data rendering. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>

Replaces polling wait loops with reactive loading state tracking: **Loading State Tracking:** - Add incidentDataLoaded, incidentNotesLoaded, incidentAlertsLoaded flags - Set flags when each piece of data arrives - Clear flags when leaving incident view **Progressive Rendering:** - Show incident details immediately when available - Display "Loading notes..." while notes fetch in progress - Display "Loading alerts..." while alerts fetch in progress - Re-render automatically as each piece of data arrives **Action Guards:** - Prevent Note action until full incident data loaded (needs HTMLURL, Title, Service) - Prevent Login action until full incident data loaded (needs Service.Summary) - Prevent Open action until full incident data loaded (needs HTMLURL) - Allow Ack/UnAck immediately (only need ID from placeholder) **Benefits:** - Eliminates 86+ message polling loops that flood event queue - Keeps UI responsive to user input during data loading - Provides clear feedback on loading progress - Prevents nil pointer errors on incomplete data - Maintains fast "show what you have" UX This replaces the waitForSelectedIncident* polling patterns with reactive state management for better performance and reliability. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>

Comprehensive test coverage for the new loading state system: **TestLoadingStateTracking:** - Verifies incidentDataLoaded flag set on gotIncidentMsg - Verifies incidentNotesLoaded flag set on gotIncidentNotesMsg - Verifies incidentAlertsLoaded flag set on gotIncidentAlertsMsg - Verifies all flags cleared on clearSelectedIncidentsMsg **TestActionGuards:** - Tests Note action blocked when data not loaded - Tests Login action blocked when data not loaded - Tests Open action blocked when data not loaded - Tests Acknowledge action always allowed (only needs ID) - Verifies appropriate status messages shown when blocked **TestProgressiveRendering:** - Tests "Loading notes..." shown when notes not loaded - Tests "Loading alerts..." shown when alerts not loaded - Tests both loading messages shown when neither loaded - Verifies template rendering handles partial data gracefully These tests ensure the loading state system works correctly and prevents nil pointer errors while maintaining good UX. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>

Implements incident data caching with pre-fetching and automatic cleanup to improve responsiveness when viewing incidents. Cached data is shown immediately while fresh data is fetched in the background. Cache features: - Pre-fetch incident details, notes, and alerts for all incidents in the list - Reuse cached data when opening incidents for instant display - Automatic cache cleanup when incidents are removed from the list - Debug logging for cache operations (new data, refreshes, cleanup) Logging improvements: - Made log destination configurable via log_to_journal config option - Support systemd journal (default), file logging, or stderr fallback - Added unit tests for logging configuration logic - Filter arrow key messages from debug log to reduce noise UI improvements: - Reset viewport to top when opening incident view - Progressive rendering shows cached data immediately 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>

Changes the default behavior so the application automatically watches for incident updates on startup. Users can pause auto-refresh by pressing ctrl+r (toggle). Previous behavior: Started paused, user enables watching New behavior: Starts watching, user can pause if desired This provides a better on-call experience by keeping the incident list current without requiring manual intervention. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>

Incident view improvements: - Removed duplicate acknowledge handler - Fixed action guards to check correct loading state: - Login now checks incidentAlertsLoaded (needs cluster_id from alerts) - Open now checks incidentDataLoaded (needs HTMLURL) - Note checks incidentDataLoaded (needs incident metadata) - All key commands now work from incident view Browser command improvements: - Changed browser opening to fire-and-forget behavior - Removed stderr checking that treated browser warnings as errors - Browser sandbox warnings are now ignored (normal browser behavior) - Added test coverage for browser command All commands working in incident view: - a: Acknowledge - ctrl+e: Re-escalate - ctrl+s: Silence - n: Add note (requires data loaded) - l: Login to cluster (requires alerts loaded) - o: Open in browser (requires data loaded) - r: Refresh incident - h: Help - esc: Back to table 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>

Pre-fetching was causing UI responsiveness issues by: - Overwriting m.selectedIncident* fields with background fetches - Setting loading flags globally instead of per-incident - Triggering unnecessary re-renders (expensive template parsing) Changes: - Modified gotIncidentMsg/gotIncidentNotesMsg/gotIncidentAlertsMsg to only update selected incident fields if msg is for current incident - Cache updates still happen for all incidents (background pre-fetch) - Only trigger re-render if viewing AND this is the selected incident - Updated tests to match new behavior Result: Pre-fetching now happens silently in background without affecting currently viewed incident or causing sluggish navigation. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>

The previous fix for UI responsiveness broke action handlers that depend on waitForSelectedIncidentThenDoMsg. The handlers only updated m.selectedIncident if it was already non-nil AND matched, which meant when initiating actions from table view (where selectedIncident starts as nil), it would never get set, causing infinite re-queuing. Changes: - Modified gotIncidentMsg/gotIncidentNotesMsg/gotIncidentAlertsMsg to set selectedIncident if it's nil OR if it matches current selection - Only skip updating if we have a DIFFERENT incident selected - This allows initial selection while still preventing background pre-fetch from overwriting the current selection Result: Actions like add note, login, and open from table view now work again without getting stuck in "waiting for incident info..." loop. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>

The un-acknowledge key command (Ctrl-E) was only un-acknowledging incidents without re-escalating them to their escalation policies. This meant the current on-call person would not be paged. Problem: - unAcknowledgedIncidentsMsg handler only updated status and refreshed - No re-escalation was happening despite status message saying "re-escalated incidents" Changes: - After un-acknowledging, group incidents by escalation policy - Create reEscalateIncidentsMsg for each policy group - Each incident is re-escalated to its own service's escalation policy at level 2 (reEscalateDefaultPolicyLevel) - Use tea.Batch to send all re-escalation messages Result: Ctrl-E now properly un-acknowledges AND re-escalates incidents to page the current on-call person for each service. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>

Browser open errors were failing silently because the openBrowserMsg handler was incorrectly wrapping the openBrowserCmd result, preventing browserFinishedMsg from ever being received by Update. Problem: - openBrowserMsg handler wrapped openBrowserCmd in another function: return m, func() tea.Msg { return openBrowserCmd(...) } - This prevented browserFinishedMsg from being sent to Update - Errors were logged but never displayed to user - No error view was triggered (unlike login errors) Changes: - Changed openBrowserMsg handler to directly return tea.Cmd: return m, openBrowserCmd(...) - Matches the pattern used by loginMsg handler - Added status message in browserFinishedMsg error handler - Error now triggers error view via errMsg Result: Browser open failures now: 1. Display "failed to open browser: <error>" in status bar 2. Send errMsg to trigger error view 3. Match the behavior of login command errors 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>

Sanitizes log messages by removing instances where an entire pagerdut…

7cf81dd

…y.Incident or a pointer to one is printed to the logs. Signed-off-by: Chris Collins <collins.christopher@gmail.com>

clcollins force-pushed the sanitize_logs branch from c05383d to 7cf81dd Compare June 9, 2025 22:52

clcollins and others added 11 commits November 9, 2025 12:58

clcollins changed the title ~~Sanitizes log messages by removing instances where an entire pagerduty.Incident or a pointer to one is printed to the logs.~~ Multiple changes to fix errors and responsiveness, sanitize log messages Nov 11, 2025

clcollins merged commit 9aa0243 into main Nov 11, 2025
2 checks passed

clcollins deleted the sanitize_logs branch November 11, 2025 18:14

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Multiple changes to fix errors and responsiveness, sanitize log messages #122

Multiple changes to fix errors and responsiveness, sanitize log messages #122

Uh oh!

clcollins commented Jun 9, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Multiple changes to fix errors and responsiveness, sanitize log messages #122

Multiple changes to fix errors and responsiveness, sanitize log messages #122

Uh oh!

Conversation

clcollins commented Jun 9, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants