Feature: Watch Mode - Continuous Compliance Monitoring #8

@flemzord

Summary

Implement a watch mode that continuously monitors repositories for compliance violations and sends real-time alerts when issues are detected.

Problem Statement

Compliance checks currently run only manually or on a CI/CD schedule, so non-compliant changes can persist for hours or days before they are detected. Organizations need continuous monitoring to identify and respond to violations as soon as they occur.

Proposed Solution

Add a --watch mode to the CLI that runs as a daemon, continuously monitoring repositories and sending alerts through various channels when violations are detected.

Detailed Design

Command Line Interface

# Start watch mode with default interval (5 minutes)
github-compliance-cli --config compliance.yml --token "$TOKEN" --watch

# Custom interval (assumed to be in seconds, like interval_seconds in the config) and specific checks
github-compliance-cli --config compliance.yml --token "$TOKEN" --watch --interval 120 --checks "branch-protection,security-scanning"

# With alert configuration
github-compliance-cli --config compliance.yml --token "$TOKEN" --watch --alert-config alerts.yml

# Daemon mode with PID file
github-compliance-cli --config compliance.yml --token "$TOKEN" --watch --daemon --pid-file /var/run/compliance-watch.pid

Configuration Schema

watch:
  enabled: true
  interval_seconds: 300  # Check every 5 minutes

  # Incremental checking - only check repos modified since last run
  incremental: true
  state_file: ".compliance-watch-state.json"

  # Alert thresholds
  thresholds:
    max_violations_before_alert: 5
    alert_on_new_violations_only: true

  # Repositories to prioritize
  # (patterns are nested under their own key so the mapping stays valid YAML)
  priority_repos:
    patterns:
      - "*-production"
      - "*-prod"
    check_interval_seconds: 60  # Check priority repos more frequently

  # Alert channels
  alerts:
    slack:
      enabled: true
      webhook_url: "${SLACK_WEBHOOK_URL}"
      channel: "#compliance-alerts"
      mention_users: ["@security-team"]

    email:
      enabled: true
      smtp_host: "smtp.company.com"
      smtp_port: 587
      from: "compliance@company.com"
      to: ["security@company.com", "platform@company.com"]

    github_issues:
      enabled: true
      repo: "org/compliance-tracking"
      labels: ["compliance-violation", "automated"]
      assign_to: ["security-team"]

    webhook:
      enabled: true
      url: "https://internal-system.company.com/compliance-webhook"
      headers:
        Authorization: "Bearer ${WEBHOOK_TOKEN}"
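
To make the priority settings concrete, here is a minimal sketch of how a watcher could pick the check interval for a repository from the glob patterns under priority_repos. The names globToRegExp, IntervalConfig and intervalFor are hypothetical and not part of the proposal; only the pattern strings and the 300/60 second values come from the schema above, and `*` is assumed to match any run of characters.

```typescript
// Hypothetical helpers: only the patterns and interval values come from the schema.

/** Converts a simple glob, where `*` matches any run of characters, into an anchored RegExp. */
function globToRegExp(glob: string): RegExp {
  const escaped = glob
    .split("*")
    .map((part) => part.replace(/[.*+?^${}()|[\]\\]/g, "\\$&")) // escape regex metacharacters
    .join(".*");
  return new RegExp(`^${escaped}$`);
}

interface IntervalConfig {
  defaultIntervalSeconds: number;
  priorityPatterns: string[];
  priorityIntervalSeconds: number;
}

/** Returns the priority interval when any pattern matches the repo name, else the default. */
function intervalFor(repoName: string, config: IntervalConfig): number {
  const isPriority = config.priorityPatterns.some((pattern) =>
    globToRegExp(pattern).test(repoName),
  );
  return isPriority ? config.priorityIntervalSeconds : config.defaultIntervalSeconds;
}

const intervalConfig: IntervalConfig = {
  defaultIntervalSeconds: 300,
  priorityPatterns: ["*-production", "*-prod"],
  priorityIntervalSeconds: 60,
};

console.log(intervalFor("payments-production", intervalConfig));
console.log(intervalFor("docs", intervalConfig));
```

Because the patterns are anchored, a name such as prod-tools would not match "*-prod" and would fall back to the default interval.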

Architecture

Core Components

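// Declaration-only sketch: `declare` lets the method signatures stand without bodies.
declare class ComplianceWatcher {
  private interval: NodeJS.Timeout;  // handle returned by setInterval
  private state: WatchState;
  private alertManager: AlertManager;
  private metricsCollector: MetricsCollector;

  start(): Promise<void>;
  stop(): Promise<void>;
  private runCheck(): Promise<CheckResult[]>;
  // Diffs the latest results against the persisted state; asynchronous, so it returns a Promise.
  private compareWithPreviousState(results: CheckResult[]): Promise<Delta>;
  private sendAlerts(violations: Violation[]): Promise<void>;
}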

interface WatchState {
  lastRun: Date;
  lastModifiedRepos: Map<string, Date>;
  knownViolations: Set<string>;
  metrics: WatchMetrics;
}

interface AlertManager {
  send(channel: AlertChannel, violation: Violation): Promise<void>;
  batchSend(channel: AlertChannel, violations: Violation[]): Promise<void>;
  testConnection(channel: AlertChannel): Promise<boolean>;
}
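
The comparison step in compareWithPreviousState could be sketched roughly as a set difference between the known violation keys and the current results. The Violation and Delta shapes below, and the choice of repo plus check name as the violation identity, are assumptions made for illustration; the proposal does not define them.

```typescript
// Assumed shapes: the proposal names Violation and Delta but does not define them.
interface Violation {
  repo: string;
  check: string;
  message: string;
}

interface Delta {
  newViolations: Violation[]; // present now, not previously known
  resolvedKeys: string[];     // previously known, no longer present
}

/** Stable identity for a violation: assumed here to be repo plus check name. */
function violationKey(violation: Violation): string {
  return `${violation.repo}::${violation.check}`;
}

/** Compares the current violations with the set of known violation keys. */
function diffViolations(known: Set<string>, current: Violation[]): Delta {
  const currentKeys = new Set(current.map(violationKey));
  return {
    newViolations: current.filter((v) => !known.has(violationKey(v))),
    resolvedKeys: Array.from(known).filter((key) => !currentKeys.has(key)),
  };
}

const knownKeys = new Set(["org/a::branch-protection", "org/b::security-scanning"]);
const delta = diffViolations(knownKeys, [
  { repo: "org/a", check: "branch-protection", message: "protection disabled" },
  { repo: "org/c", check: "branch-protection", message: "protection disabled" },
]);
console.log(delta.newViolations.map(violationKey), delta.resolvedKeys);
```

With alert_on_new_violations_only enabled, only newViolations would be forwarded to the alert channels, while resolvedKeys could feed the audit log.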

Features

1. Incremental Checking

  • Track repository last-modified times
  • Only check repositories that changed since last run
  • Maintain state file for persistence across restarts
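
A minimal sketch of the incremental selection and the state file round trip follows. The Repo shape and the function names are hypothetical. One assumption deserves checking before relying on it: the timestamp used here must actually change when compliance-relevant settings change, since a settings-only edit such as a branch protection change may not update a repository's push time.

```typescript
// Hypothetical shapes and names, for illustration only.
interface Repo {
  name: string;
  modifiedAt: string; // ISO 8601 timestamp of the last relevant change (assumed available)
}

/** Keeps repos never seen before or modified after their last check. */
function selectChangedRepos(repos: Repo[], lastChecked: Map<string, Date>): Repo[] {
  return repos.filter((repo) => {
    const previous = lastChecked.get(repo.name);
    return previous === undefined || new Date(repo.modifiedAt).getTime() > previous.getTime();
  });
}

/** Map and Date are not JSON-serializable as-is, so store a plain object of ISO strings. */
function serializeState(lastChecked: Map<string, Date>): string {
  const entries: Record<string, string> = {};
  lastChecked.forEach((date, name) => {
    entries[name] = date.toISOString();
  });
  return JSON.stringify(entries);
}

/** Rebuilds the Map<string, Date> from the JSON written by serializeState. */
function deserializeState(json: string): Map<string, Date> {
  const entries = JSON.parse(json) as Record<string, string>;
  return new Map(
    Object.keys(entries).map((name): [string, Date] => [name, new Date(entries[name])]),
  );
}

const lastChecked = new Map<string, Date>([
  ["org/a", new Date("2024-01-01T00:00:00.000Z")],
  ["org/b", new Date("2024-01-01T00:00:00.000Z")],
]);
const changed = selectChangedRepos(
  [
    { name: "org/a", modifiedAt: "2024-01-02T00:00:00.000Z" }, // changed since last check
    { name: "org/b", modifiedAt: "2023-12-01T00:00:00.000Z" }, // unchanged
    { name: "org/c", modifiedAt: "2023-06-01T00:00:00.000Z" }, // never checked
  ],
  lastChecked,
);
console.log(changed.map((repo) => repo.name));
```

Writing the serialized string to the configured state_file after each run would let the watcher resume without a full rescan.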

2. Smart Alerting

  • Deduplication: Don't alert for known violations
  • Batching: Group alerts to prevent spam
  • Priority-based: Immediate alerts for critical repos
  • Throttling: Rate limit alerts per repository
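
Throttling and batching could look roughly like the sketch below: a per-repository minimum interval between alerts, plus a helper that groups violations so each repository gets one batched message. AlertThrottle and groupByRepo are hypothetical names, and time is passed in explicitly to keep the logic easy to test.

```typescript
// Hypothetical sketch of per-repository throttling and batching.
class AlertThrottle {
  private readonly minIntervalMs: number;
  private readonly lastSent = new Map<string, number>();

  constructor(minIntervalMs: number) {
    this.minIntervalMs = minIntervalMs;
  }

  /** Returns true, and records the send time, if an alert for `repo` is allowed at `nowMs`. */
  tryAcquire(repo: string, nowMs: number): boolean {
    const last = this.lastSent.get(repo);
    if (last !== undefined && nowMs - last < this.minIntervalMs) {
      return false; // still inside the quiet period for this repository
    }
    this.lastSent.set(repo, nowMs);
    return true;
  }
}

/** Groups items by repository so each group can be sent as one batched alert. */
function groupByRepo<T extends { repo: string }>(items: T[]): Map<string, T[]> {
  const groups = new Map<string, T[]>();
  for (const item of items) {
    const group = groups.get(item.repo) ?? [];
    group.push(item);
    groups.set(item.repo, group);
  }
  return groups;
}

const throttle = new AlertThrottle(60_000); // at most one alert per repo per minute
const batches = groupByRepo([
  { repo: "org/a", check: "branch-protection" },
  { repo: "org/a", check: "security-scanning" },
  { repo: "org/b", check: "branch-protection" },
]);
batches.forEach((violations, repo) => {
  if (throttle.tryAcquire(repo, Date.now())) {
    console.log(`would send ${violations.length} violation(s) for ${repo}`);
  }
});
```

A real implementation would probably queue suppressed alerts rather than drop them, so that nothing is lost once the quiet period ends.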

3. Monitoring Dashboard

// Optional web UI on localhost:3000
interface DashboardData {
  status: 'running' | 'stopped' | 'error';
  uptime: number;
  lastCheck: Date;
  nextCheck: Date;
  totalViolations: number;
  recentViolations: Violation[];
  repoStatus: Map<string, ComplianceStatus>;
  performanceMetrics: Metrics;
}

4. Health Checks

  • Self-monitoring with heartbeat
  • Automatic restart on failure
  • Memory usage monitoring
  • API rate limit tracking
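
For API rate limit tracking, one possible approach is to read the x-ratelimit-remaining and x-ratelimit-reset response headers that the GitHub REST API returns and pause when the remaining budget falls below a reserve. The function names and the reserve of 50 requests are assumptions; header names are expected in lower case.

```typescript
// Hypothetical helpers; the reserve value is an arbitrary example.
interface RateLimitStatus {
  remaining: number; // requests left in the current window
  resetAtMs: number; // window reset time, in epoch milliseconds
}

/** Parses rate limit headers (lower-cased names); returns undefined if absent or malformed. */
function parseRateLimit(headers: Record<string, string | undefined>): RateLimitStatus | undefined {
  const rawRemaining = headers["x-ratelimit-remaining"];
  const rawReset = headers["x-ratelimit-reset"]; // epoch seconds
  if (rawRemaining === undefined || rawReset === undefined) return undefined;
  const remaining = Number(rawRemaining);
  const resetSeconds = Number(rawReset);
  if (!Number.isFinite(remaining) || !Number.isFinite(resetSeconds)) return undefined;
  return { remaining, resetAtMs: resetSeconds * 1000 };
}

/** Returns how long to wait before the next request, keeping `reserve` requests in hand. */
function delayBeforeNextRequestMs(status: RateLimitStatus, nowMs: number, reserve = 50): number {
  if (status.remaining > reserve) return 0;
  return Math.max(0, status.resetAtMs - nowMs);
}

const rateStatus = parseRateLimit({
  "x-ratelimit-remaining": "10",
  "x-ratelimit-reset": "1700000000",
});
if (rateStatus !== undefined) {
  console.log(delayBeforeNextRequestMs(rateStatus, 1_699_999_990_000));
}
```

The same status object could be exposed on the dashboard or through the heartbeat so operators can see when the watcher is deliberately idling.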

User Stories

  • As a security engineer, I want immediate alerts when someone disables branch protection
  • As a platform lead, I want a dashboard showing real-time compliance status
  • As a DevOps engineer, I want to integrate compliance alerts with our incident management system
  • As a compliance officer, I need audit logs of all detected violations and remediations

Technical Considerations

Performance

  • Efficient API usage with conditional requests (ETags)
  • Parallel repository checking with worker pool
  • Memory-efficient state management
  • Graceful degradation under API rate limits
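
The worker pool idea can be sketched as a small concurrency-limited map: a fixed number of workers pull repositories from a shared cursor until none remain. mapWithConcurrency is a hypothetical helper, not an existing API of the CLI, and checkRepo stands in for the real per-repository check.

```typescript
/** Runs `worker` over `items` with at most `limit` calls in flight; results keep input order. */
async function mapWithConcurrency<T, R>(
  items: T[],
  limit: number,
  worker: (item: T) => Promise<R>,
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0; // shared cursor; safe because JavaScript runs each step on a single thread

  async function runWorker(): Promise<void> {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index]);
    }
  }

  const workerCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: workerCount }, () => runWorker()));
  return results;
}

// Hypothetical stand-in for a real compliance check.
async function checkRepo(name: string): Promise<string> {
  return `${name}: ok`;
}

mapWithConcurrency(["org/a", "org/b", "org/c"], 2, checkRepo).then((lines) => {
  console.log(lines);
});
```

If any check rejects, the returned promise rejects with the first error while checks already in flight run to completion, so a real implementation may prefer to catch errors per repository and record them as results.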

Reliability

  • Automatic retry of failed webhook deliveries
  • State persistence across restarts
  • Transaction log for audit trail
  • Retry logic with exponential backoff
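
Retry with exponential backoff could be sketched as below. The names RetryOptions, backoffDelayMs and withRetry are hypothetical, and the delay doubles per attempt up to a cap. The sleep function is injectable so the logic can be exercised without real waiting.

```typescript
// Hypothetical retry helper, for illustration only.
interface RetryOptions {
  maxAttempts: number; // total attempts, including the first
  baseDelayMs: number; // delay after the first failure
  maxDelayMs: number;  // upper bound for any single delay
  sleep?: (ms: number) => Promise<void>; // injectable for tests
}

/** Delay after the given 1-based attempt: base, 2x base, 4x base, ... capped at maxDelayMs. */
function backoffDelayMs(attempt: number, baseDelayMs: number, maxDelayMs: number): number {
  return Math.min(maxDelayMs, baseDelayMs * Math.pow(2, attempt - 1));
}

/** Calls `operation` until it succeeds or maxAttempts is reached, then rethrows the last error. */
async function withRetry<T>(operation: () => Promise<T>, options: RetryOptions): Promise<T> {
  if (options.maxAttempts < 1) {
    throw new RangeError("maxAttempts must be at least 1");
  }
  const sleep =
    options.sleep ?? ((ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms)));
  let lastError: unknown;
  for (let attempt = 1; attempt <= options.maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      if (attempt < options.maxAttempts) {
        await sleep(backoffDelayMs(attempt, options.baseDelayMs, options.maxDelayMs));
      }
    }
  }
  throw lastError;
}
```

Adding random jitter to each delay is a common refinement when several watchers might retry against the same endpoint at once.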

Scalability

  • Support for thousands of repositories
  • Distributed mode option (multiple watchers)
  • Redis/database backend for state (optional)
  • Metrics export (Prometheus format)
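
For the Prometheus export, a gauge could be rendered in the text exposition format roughly as follows. The metric name compliance_violations and the repo label are hypothetical examples, and renderGauge is a hand-rolled sketch rather than a call into any client library.

```typescript
// Hypothetical renderer for a single gauge metric.
interface GaugeSample {
  labels: Record<string, string>;
  value: number;
}

/** Escapes backslash, double quote and newline, as the text format requires for label values. */
function escapeLabelValue(value: string): string {
  return value.replace(/\\/g, "\\\\").replace(/"/g, '\\"').replace(/\n/g, "\\n");
}

/** Renders HELP and TYPE lines followed by one line per sample. */
function renderGauge(name: string, help: string, samples: GaugeSample[]): string {
  const lines = [`# HELP ${name} ${help}`, `# TYPE ${name} gauge`];
  for (const sample of samples) {
    const labels = Object.keys(sample.labels)
      .map((key) => `${key}="${escapeLabelValue(sample.labels[key])}"`)
      .join(",");
    lines.push(labels ? `${name}{${labels}} ${sample.value}` : `${name} ${sample.value}`);
  }
  return lines.join("\n") + "\n";
}

const metricsText = renderGauge(
  "compliance_violations",
  "Open compliance violations per repository",
  [{ labels: { repo: "org/api-prod" }, value: 3 }],
);
console.log(metricsText);
```

Serving this text from an HTTP endpoint such as /metrics would let an existing Prometheus server scrape the watcher; a maintained client library is likely the better choice once more metric types are needed.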

Testing Strategy

  • Unit tests for state management and diffing logic
  • Integration tests with mock alert channels
  • Load testing with large numbers of repositories
  • Fault injection testing (network failures, API errors)
  • End-to-end testing of alert delivery

Documentation Needs

  • Setup guide for various alert channels
  • Troubleshooting guide for common issues
  • Performance tuning guide
  • Integration examples with popular tools (PagerDuty, Datadog, etc.)

Success Criteria

  • Watch mode runs continuously without memory leaks
  • Alerts are delivered within 1 minute of violation detection
  • Incremental checking reduces API calls by more than 70% compared with a full scan on every run
  • All configured alert channels work reliably
  • State persists across restarts
  • Dashboard provides real-time visibility
  • Resource usage remains stable over 24+ hours

Dependencies

  • State management solution
  • Alert channel SDKs/libraries
  • Web framework for dashboard (if included)
  • Process management for daemon mode

Open Questions

  1. Should we support custom alert templates?
  2. How should we handle alert fatigue for frequently changing repos?
  3. Should watch mode support auto-remediation?
  4. Should watch mode integrate with GitHub webhooks instead of, or in addition to, polling?
  5. Should we provide Kubernetes deployment manifests?

Future Enhancements

  • Machine learning for anomaly detection
  • Predictive alerts based on patterns
  • Integration with GitHub Advanced Security
  • Multi-organization support
  • Custom alert routing rules
  • Compliance trend analysis

Labels: enhancement
