feat(orch-monitor-tui): implement active daemon health checks with typed failures #406

Open
proboscis wants to merge 1 commit into main from issue/orch-426/run-20260209-225511

@proboscis
Owner

Summary

  • Replaces passive socket-file-exists check with active ping probe for availability detection
  • Preserves typed failure causes (socket missing, connection refused, timeout, permission denied) throughout proto_client → daemon_api → orch_api stack
  • Adds 22 comprehensive regression tests covering stale socket, reconnect, and error distinguishability

Acceptance Criteria

| Criteria | Status | Evidence |
| --- | --- | --- |
| Stale socket does not report as healthy | PASS | test_stale_socket_returns_connection_refused_error, test_is_available_false_for_stale_socket |
| Connection refused and timeout are distinguishable | PASS | test_connection_refused_message_distinct, test_timeout_message_distinct |
| Monitor reconnect behavior remains automatic | PASS | test_client_can_reconnect_after_failure, test_persistent_connection_reconnects_on_failure |
| Tests cover stale socket + reconnect | PASS | 22 tests in test_daemon_health.py |
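
For reviewers who want to reproduce the stale-socket case by hand, here is a minimal sketch using only the Python standard library. It is not taken from test_daemon_health.py, and the helper names (`make_stale_socket`, `passive_check`, `active_check`) are hypothetical. It also simplifies the probe to a plain connection attempt rather than the ping this PR sends.

```python
import os
import socket
import tempfile


def make_stale_socket(directory: str) -> str:
    """Bind a Unix socket, then close it without unlinking the path."""
    path = os.path.join(directory, "daemon.sock")
    listener = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    listener.bind(path)
    listener.listen(1)
    listener.close()  # the socket file stays on disk, but nothing listens
    return path


def passive_check(path: str) -> bool:
    """File-existence check: cannot tell a live daemon from a stale socket."""
    return os.path.exists(path)


def active_check(path: str, timeout: float = 0.5) -> bool:
    """Connection attempt: fails when no process is accepting on the path."""
    client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    client.settimeout(timeout)
    try:
        client.connect(path)
        return True
    except OSError:
        return False
    finally:
        client.close()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        stale = make_stale_socket(tmp)
        print("passive:", passive_check(stale))
        print("active:", active_check(stale))
```

The passive check still sees the leftover socket file, while the active check fails because no process accepts the connection.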

Changes

New Exception Types (types.py)

  • ProtoDaemonSocketMissingError - socket file doesn't exist
  • ProtoDaemonConnectionRefusedError - stale socket or daemon not listening
  • ProtoDaemonTimeoutError - daemon not responding
  • ProtoDaemonPermissionError - permission denied

Proto Client (proto_client.hy)

  • is-available now performs active ping probe (not just socket stat)
  • New socket-exists method for passive socket file check
  • New check-health method returning Result with specific error types
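
The relationship between the three methods above can be sketched in Python as follows. This is an illustration under assumptions, not the Hy implementation: the `Ok`/`Err` classes, the `HealthClient` class, and the injected `ping` callable are all hypothetical, and `Err` carries a plain `OSError` where the real code would presumably carry one of the typed ProtoDaemon errors.

```python
import os
from dataclasses import dataclass
from typing import Callable, Union


@dataclass(frozen=True)
class Ok:
    value: bytes  # whatever the ping returned


@dataclass(frozen=True)
class Err:
    error: OSError  # stand-in for the typed failure cause


Result = Union[Ok, Err]


class HealthClient:
    """Hypothetical client; `ping` is injected so the sketch needs no daemon."""

    def __init__(self, socket_path: str, ping: Callable[[str], bytes]):
        self.socket_path = socket_path
        self._ping = ping

    def socket_exists(self) -> bool:
        # Passive check: only looks at the filesystem.
        return os.path.exists(self.socket_path)

    def check_health(self) -> Result:
        # Active check: run the probe and keep the failure cause as a value.
        try:
            return Ok(self._ping(self.socket_path))
        except OSError as exc:
            return Err(exc)

    def is_available(self) -> bool:
        # Availability is derived from the active probe, not from the file.
        return isinstance(self.check_health(), Ok)
```

Deriving `is_available` from `check_health` keeps a single probe path, so the boolean answer and the detailed failure cause cannot disagree.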

Macros (macros.hy)

  • socket-send macro catches and raises specific typed exceptions
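
A rough Python analogue of what the macro does might look like the context manager below. The four exception names come from this PR, but the shared `ProtoDaemonError` base class, the `typed_socket_errors` helper, the `send_ping` function, and the `ping` wire format are assumptions made for illustration.

```python
import socket
from contextlib import contextmanager


class ProtoDaemonError(Exception):
    """Hypothetical common base; the PR does not state the hierarchy."""


class ProtoDaemonSocketMissingError(ProtoDaemonError):
    pass


class ProtoDaemonConnectionRefusedError(ProtoDaemonError):
    pass


class ProtoDaemonTimeoutError(ProtoDaemonError):
    pass


class ProtoDaemonPermissionError(ProtoDaemonError):
    pass


@contextmanager
def typed_socket_errors(path: str):
    """Translate low-level socket failures into typed errors."""
    try:
        yield
    except FileNotFoundError as exc:
        raise ProtoDaemonSocketMissingError(f"socket missing: {path}") from exc
    except ConnectionRefusedError as exc:
        raise ProtoDaemonConnectionRefusedError(f"connection refused: {path}") from exc
    except PermissionError as exc:
        raise ProtoDaemonPermissionError(f"permission denied: {path}") from exc
    except socket.timeout as exc:
        raise ProtoDaemonTimeoutError(f"timed out: {path}") from exc


def send_ping(path: str, timeout: float = 0.5) -> bytes:
    """Send an assumed 'ping' message and return the raw reply."""
    with typed_socket_errors(path):
        with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as client:
            client.settimeout(timeout)
            client.connect(path)
            client.sendall(b"ping\n")  # assumed wire format
            return client.recv(64)
```

Using `raise ... from exc` keeps the original `OSError` attached as `__cause__`, so the low-level detail is still available when debugging.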

API Layer (daemon_api.py, orch_api.py)

  • Error mapping functions translate proto exceptions to API exceptions
  • New check_health method on DaemonOrchAPI
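
One plausible shape for such a mapping function is a lookup table from proto exception types to API exception types. In the sketch below only the four ProtoDaemon names come from this PR. The `ProtoDaemonError` base, every `DaemonAPI...` class, and `map_proto_error` are hypothetical names chosen for illustration.

```python
class ProtoDaemonError(Exception):
    """Hypothetical base for the proto-layer errors."""


class ProtoDaemonSocketMissingError(ProtoDaemonError):
    pass


class ProtoDaemonConnectionRefusedError(ProtoDaemonError):
    pass


class ProtoDaemonTimeoutError(ProtoDaemonError):
    pass


class ProtoDaemonPermissionError(ProtoDaemonError):
    pass


# Hypothetical API-layer counterparts; the real names are not shown in the PR.
class DaemonAPIError(Exception):
    pass


class DaemonAPISocketMissingError(DaemonAPIError):
    pass


class DaemonAPIConnectionRefusedError(DaemonAPIError):
    pass


class DaemonAPITimeoutError(DaemonAPIError):
    pass


class DaemonAPIPermissionError(DaemonAPIError):
    pass


_ERROR_MAP = {
    ProtoDaemonSocketMissingError: DaemonAPISocketMissingError,
    ProtoDaemonConnectionRefusedError: DaemonAPIConnectionRefusedError,
    ProtoDaemonTimeoutError: DaemonAPITimeoutError,
    ProtoDaemonPermissionError: DaemonAPIPermissionError,
}


def map_proto_error(exc: ProtoDaemonError) -> DaemonAPIError:
    """Return the API-level error for a proto error, keeping the cause."""
    for proto_type, api_type in _ERROR_MAP.items():
        if isinstance(exc, proto_type):
            api_exc = api_type(str(exc))
            break
    else:
        # Unknown proto errors fall back to the generic API error.
        api_exc = DaemonAPIError(str(exc))
    api_exc.__cause__ = exc
    return api_exc
```

A table keeps the translation one-to-one, which is what lets callers of the API layer still distinguish connection refused from timeout.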

Test Results

tests/test_daemon_health.py: 22 passed in 0.18s

Closes orch-426

…ped failures

Replace passive socket-file-exists check with active ping probe for
availability detection. Preserve typed failure causes (socket missing,
connection refused, timeout, permission denied) throughout the stack.

Closes #426