🔬 Daemon for monitoring services on my Raspberry Pi

martinabeleda/argus

Argus

A lightweight Rust daemon that monitors system health and services on Raspberry Pi, with Telegram alerts and auto-recovery.

Named after the hundred-eyed giant from Greek mythology.

Features

  • System Monitoring: CPU, memory, and temperature with rolling averages to prevent false alerts
  • Service Monitoring: Docker Compose and systemd services with health checks
  • Auto-Recovery: Attempts to recover failed services before alerting
  • Telegram Alerts: Notifications with cooldown to prevent spam

Installation

Build from source

cd ~/Development/argus
cargo build --release
cp target/release/argus ~/.cargo/bin/

Configuration

Copy the example config to ~/.config/argus/config.toml:

mkdir -p ~/.config/argus
cp config/config.example.toml ~/.config/argus/config.toml

Edit the config with your Telegram bot token and chat ID.

Usage

# Show current system status
argus --status

# Validate configuration
argus --validate

# Test Telegram connection
argus --test-telegram

# Run a single check cycle
argus --once

# Run as daemon (foreground)
argus

# With debug logging
argus --log-level debug

Example status output:

Argus Status
============

Daemon:      running

System
------
CPU:          26.2%  (threshold: 85%)
Memory:       17.0%  (2.7 GB / 15.8 GB)
Temperature:  55.1°C (threshold: 80°C)

Services
--------
waypoint:    healthy  (docker-compose)
pihole:      healthy  (systemd)

Telegram Bot Setup

  1. Message @BotFather on Telegram
  2. Send /newbot and follow the prompts
  3. Copy the bot token
  4. Message your new bot (say "hi")
  5. Get your chat ID (the message.chat.id field in the JSON response):
    curl "https://api.telegram.org/bot<TOKEN>/getUpdates"
  6. Add bot_token and chat_id to your config

Systemd Service

Install and enable the service:

sudo cp systemd/argus.service /etc/systemd/system/
sudo systemctl daemon-reload
sudo systemctl enable --now argus

Check logs:

journalctl -fu argus

Sudoers for Pi-hole Recovery

To allow argus to restart Pi-hole without a password (replace mabeleda with the user that runs argus):

echo 'mabeleda ALL=(ALL) NOPASSWD: /usr/bin/systemctl restart pihole-FTL.service, /usr/bin/systemctl stop pihole-FTL.service, /usr/bin/systemctl start pihole-FTL.service, /usr/bin/systemctl reset-failed pihole-FTL.service' | sudo tee /etc/sudoers.d/argus

Configuration Reference

[general]
check_interval = 30  # seconds between system checks
log_level = "info"   # trace, debug, info, warn, error

[system.cpu]
enabled = true
threshold = 85.0           # alert above this %
rolling_window = 10        # EMA window (samples)
sustained_duration = 300   # seconds before alerting

[system.memory]
enabled = true
threshold = 90.0
rolling_window = 10
sustained_duration = 300

[system.temperature]
enabled = true
threshold = 80.0           # warning threshold (Celsius)
critical_threshold = 85.0  # critical alert

[[services]]
name = "waypoint"
type = "docker-compose"
health_url = "http://localhost:3001/api/health"
compose_file = "/path/to/docker-compose.yml"

[services.recovery]
enabled = true
max_attempts = 3
retry_delay_secs = 30

[[services]]
name = "pihole"
type = "systemd"
systemd_unit = "pihole-FTL.service"
dns_check = "google.com"  # optional DNS resolution check

[services.recovery]
enabled = true
max_attempts = 3
retry_delay_secs = 30

[telegram]
enabled = true
bot_token = "your-bot-token"
chat_id = "your-chat-id"
cooldown_secs = 300  # minimum seconds between alerts
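
The cooldown_secs setting can be pictured as a simple gate in front of the alert sender. The sketch below is illustrative only: the Cooldown type and try_send method are hypothetical names, and it assumes a single global cooldown rather than one per alert type, which the README does not specify.

```rust
use std::time::{Duration, Instant};

/// Hypothetical alert gate: allows at most one alert per `min_gap`.
struct Cooldown {
    min_gap: Duration,
    last_sent: Option<Instant>,
}

impl Cooldown {
    fn new(cooldown_secs: u64) -> Self {
        Self { min_gap: Duration::from_secs(cooldown_secs), last_sent: None }
    }

    /// Returns true, and records `now`, if enough time has passed since the last alert.
    fn try_send(&mut self, now: Instant) -> bool {
        let allowed = match self.last_sent {
            None => true,
            Some(last) => now.duration_since(last) >= self.min_gap,
        };
        if allowed {
            self.last_sent = Some(now);
        }
        allowed
    }
}

fn main() {
    let mut gate = Cooldown::new(300);
    let start = Instant::now();
    // Simulated alert attempts at 0 s, 60 s, and 300 s.
    for secs in [0u64, 60, 300] {
        let sent = gate.try_send(start + Duration::from_secs(secs));
        println!("t={:>3}s sent={}", secs, sent);
    }
}
```

Passing the timestamp in explicitly keeps the gate easy to exercise without waiting for real time to pass.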

How It Works

Architecture

Argus runs as an async Tokio daemon with an event-driven loop:

┌──────────────────────────────────────────────────────┐
│                   Main Loop                          │
│                                                      │
│   tokio::select! {                                   │
│       system_ticker (30s) => check CPU/RAM/temp      │
│       service_ticker (60s) => check services         │
│       shutdown_signal => exit gracefully             │
│   }                                                  │
└──────────────────────────────────────────────────────┘
  • Event-driven: Uses Tokio interval timers, not blocking sleep
  • Independent timers: System and service checks run on separate schedules
  • Non-blocking recovery: If a service recovery takes a long time, the other checks resume as soon as it finishes
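
To see how the two independent timers interleave, here is a small simulation in plain std Rust. It is not the Tokio loop itself: the schedule function and Check enum are hypothetical, it assumes both timers fire once at t = 0 and then once per period, and the tie-breaking order when both are due is arbitrary.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Check {
    System,
    Service,
}

/// Returns (second, check) pairs in firing order, up to and including `horizon_secs`.
/// Each timer advances on its own period, independent of the other.
fn schedule(system_period: u64, service_period: u64, horizon_secs: u64) -> Vec<(u64, Check)> {
    assert!(system_period > 0 && service_period > 0, "periods must be non-zero");
    let mut events = Vec::new();
    let (mut next_system, mut next_service) = (0u64, 0u64);
    loop {
        // Pick whichever timer is due first; ties go to the system check (arbitrary choice).
        let (at, check) = if next_system <= next_service {
            (next_system, Check::System)
        } else {
            (next_service, Check::Service)
        };
        if at > horizon_secs {
            break;
        }
        events.push((at, check));
        match check {
            Check::System => next_system += system_period,
            Check::Service => next_service += service_period,
        }
    }
    events
}

fn main() {
    // 30 s system checks and 60 s service checks over two minutes.
    for (t, check) in schedule(30, 60, 120) {
        println!("t={:>3}s {:?}", t, check);
    }
}
```

With these periods the system check runs twice as often as the service check, and neither schedule shifts when the other fires.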

Monitoring Algorithm

Argus uses an exponential moving average (EMA) for CPU and memory to smooth out spikes:

  • EMA = α × current + (1-α) × previous
  • α = 2/(window+1) where window=10 gives ~5 min smoothing at 30s intervals

Alerts only trigger after the threshold is exceeded for sustained_duration seconds.
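
The smoothing and the sustained-duration rule can be sketched together as follows. This is a minimal illustration, not the Argus source: the Ema and SustainedAlert types are hypothetical, and seeding the average with the first sample is an assumption.

```rust
use std::time::{Duration, Instant};

/// Exponential moving average with alpha = 2 / (window + 1).
struct Ema {
    alpha: f64,
    value: Option<f64>,
}

impl Ema {
    fn new(window: u32) -> Self {
        Self { alpha: 2.0 / (f64::from(window) + 1.0), value: None }
    }

    /// Folds one sample into the average; the first sample seeds it (assumption).
    fn update(&mut self, sample: f64) -> f64 {
        let next = match self.value {
            None => sample,
            Some(prev) => self.alpha * sample + (1.0 - self.alpha) * prev,
        };
        self.value = Some(next);
        next
    }
}

/// Fires only once the smoothed value has stayed above `threshold` for `sustained`.
struct SustainedAlert {
    threshold: f64,
    sustained: Duration,
    above_since: Option<Instant>,
}

impl SustainedAlert {
    fn new(threshold: f64, sustained: Duration) -> Self {
        Self { threshold, sustained, above_since: None }
    }

    fn observe(&mut self, smoothed: f64, now: Instant) -> bool {
        if smoothed <= self.threshold {
            // Dropping back under the threshold resets the timer.
            self.above_since = None;
            return false;
        }
        let since = *self.above_since.get_or_insert(now);
        now.duration_since(since) >= self.sustained
    }
}

fn main() {
    let mut ema = Ema::new(10);
    let mut alert = SustainedAlert::new(85.0, Duration::from_secs(300));
    let start = Instant::now();
    // Simulate a constant 95% CPU reading sampled every 30 s.
    for i in 0..=10u64 {
        let now = start + Duration::from_secs(30 * i);
        let smoothed = ema.update(95.0);
        let firing = alert.observe(smoothed, now);
        println!("t={:>3}s ema={:.1} alert={}", 30 * i, smoothed, firing);
    }
}
```

In this simulation the alert stays false until the 300 s mark, matching sustained_duration = 300 from the example config.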

Recovery Strategy

Docker Compose services:

  1. docker compose restart
  2. Wait for healthy status
  3. If that fails: docker compose down, then docker compose up -d
  4. Alert only after all attempts fail

Systemd services:

  1. systemctl restart
  2. If failed: stop → start
  3. If failed: reset-failed → start
  4. Alert only after all attempts fail
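
Both strategies follow the same pattern: try progressively heavier steps and alert only if every one fails. A rough sketch of that escalation, using a hypothetical Step type and recover function, with closures standing in for the real docker compose or systemctl calls:

```rust
/// One recovery step: a label plus an action reporting whether the service is healthy afterwards.
struct Step {
    label: &'static str,
    action: Box<dyn FnMut() -> bool>,
}

/// Runs the steps in order and stops at the first success.
/// Ok(label) names the step that worked; Err(n) means all n steps failed and an alert is due.
fn recover(steps: &mut [Step]) -> Result<&'static str, usize> {
    for step in steps.iter_mut() {
        if (step.action)() {
            return Ok(step.label);
        }
    }
    Err(steps.len())
}

fn main() {
    // Stand-ins for `systemctl restart`, stop then start, and reset-failed then start.
    let mut steps = vec![
        Step { label: "restart", action: Box::new(|| false) },
        Step { label: "stop -> start", action: Box::new(|| false) },
        Step { label: "reset-failed -> start", action: Box::new(|| true) },
    ];
    match recover(&mut steps) {
        Ok(label) => println!("recovered via {}", label),
        Err(n) => println!("all {} steps failed, sending alert", n),
    }
}
```

Keeping the steps as data means the Docker Compose and systemd ladders could share one driver and differ only in the list they pass in. How max_attempts and retry_delay_secs map onto these steps is not stated in the README, so they are left out here.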

License

MIT
