A lightweight Rust daemon that monitors system health and services on Raspberry Pi, with Telegram alerts and auto-recovery.
Named after the hundred-eyed giant from Greek mythology.
- System Monitoring: CPU, memory, and temperature with rolling averages to prevent false alerts
- Service Monitoring: Docker Compose and systemd services with health checks
- Auto-Recovery: Attempts to recover failed services before alerting
- Telegram Alerts: Notifications with cooldown to prevent spam
cd ~/Development/argus
cargo build --release
cp target/release/argus ~/.cargo/bin/
Copy the example config to ~/.config/argus/config.toml:
mkdir -p ~/.config/argus
cp config/config.example.toml ~/.config/argus/config.toml
Edit the config with your Telegram bot token and chat ID.
# Show current system status
argus --status
# Validate configuration
argus --validate
# Test Telegram connection
argus --test-telegram
# Run a single check cycle
argus --once
# Run as daemon (foreground)
argus
# With debug logging
argus --log-level debug
Example status output:
Argus Status
============
Daemon: running
System
------
CPU: 26.2% (threshold: 85%)
Memory: 17.0% (2.7 GB / 15.8 GB)
Temperature: 55.1°C (threshold: 80°C)
Services
--------
waypoint: healthy (docker-compose)
pihole: healthy (systemd)
- Message @BotFather on Telegram
- Send /newbot and follow the prompts
- Copy the bot token
- Message your new bot (say "hi")
- Get your chat ID:
  curl "https://api.telegram.org/bot<TOKEN>/getUpdates"
- Add bot_token and chat_id to your config
Install and enable the service:
sudo cp systemd/argus.service /etc/systemd/system/
sudo systemctl daemon-reload
sudo systemctl enable --now argus
Check logs:
journalctl -fu argus
To allow Argus to restart Pi-hole without a password (replace mabeleda with your own username):
echo 'mabeleda ALL=(ALL) NOPASSWD: /usr/bin/systemctl restart pihole-FTL.service, /usr/bin/systemctl stop pihole-FTL.service, /usr/bin/systemctl start pihole-FTL.service, /usr/bin/systemctl reset-failed pihole-FTL.service' | sudo tee /etc/sudoers.d/argus
[general]
check_interval = 30 # seconds between system checks
log_level = "info" # trace, debug, info, warn, error
[system.cpu]
enabled = true
threshold = 85.0 # alert above this %
rolling_window = 10 # EMA window (samples)
sustained_duration = 300 # seconds before alerting
[system.memory]
enabled = true
threshold = 90.0
rolling_window = 10
sustained_duration = 300
[system.temperature]
enabled = true
threshold = 80.0 # warning threshold (Celsius)
critical_threshold = 85.0 # critical alert
[[services]]
name = "waypoint"
type = "docker-compose"
health_url = "http://localhost:3001/api/health"
compose_file = "/path/to/docker-compose.yml"
[services.recovery]
enabled = true
max_attempts = 3
retry_delay_secs = 30
[[services]]
name = "pihole"
type = "systemd"
systemd_unit = "pihole-FTL.service"
dns_check = "google.com" # optional DNS resolution check
[services.recovery]
enabled = true
max_attempts = 3
retry_delay_secs = 30
[telegram]
enabled = true
bot_token = "your-bot-token"
chat_id = "your-chat-id"
cooldown_secs = 300 # minimum seconds between alerts
Argus runs as an async Tokio daemon with an event-driven loop:
┌─────────────────────────────────────────────────────┐
│                      Main Loop                      │
│                                                     │
│  tokio::select! {                                   │
│    system_ticker (30s)  => check CPU/RAM/temp       │
│    service_ticker (60s) => check services           │
│    shutdown_signal      => exit gracefully          │
│  }                                                  │
└─────────────────────────────────────────────────────┘
- Event-driven: Uses Tokio interval timers, not blocking sleep
- Independent timers: System and service checks run on separate schedules
- Non-blocking recovery: If a service recovery takes a long time, the remaining checks resume once it finishes
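The effect of the two independent timers can be illustrated without Tokio. The sketch below is a synchronous, std-only model of the schedule, not the daemon's actual loop: `Check`, `Ticker`, and `simulate` are hypothetical names, time is a simulated counter in seconds, and it assumes each timer fires once immediately and then at its fixed period (which is how `tokio::time::interval` behaves by default).

```rust
/// Which periodic check is due. Illustrative only, not Argus's real types.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Check {
    System,
    Services,
}

/// A fixed-period timer that fires at t = 0 and then every `period_secs`.
struct Ticker {
    period_secs: u64,
    next_due: u64,
}

impl Ticker {
    fn new(period_secs: u64) -> Self {
        Ticker { period_secs, next_due: 0 }
    }

    /// Returns true, and schedules the next tick, if the timer is due at `now`.
    fn poll(&mut self, now: u64) -> bool {
        if now >= self.next_due {
            self.next_due += self.period_secs;
            true
        } else {
            false
        }
    }
}

/// Steps simulated time one second at a time and records which checks fire.
fn simulate(horizon_secs: u64) -> Vec<(u64, Check)> {
    let mut system = Ticker::new(30); // system_ticker (30s)
    let mut services = Ticker::new(60); // service_ticker (60s)
    let mut fired = Vec::new();
    for now in 0..=horizon_secs {
        if system.poll(now) {
            fired.push((now, Check::System));
        }
        if services.poll(now) {
            fired.push((now, Check::Services));
        }
    }
    fired
}

fn main() {
    for (t, check) in simulate(120) {
        println!("t={:>3}s {:?}", t, check);
    }
}
```

Each timer keeps its own `next_due`, so changing one period never shifts the other schedule. In the real daemon the `tokio::select!` loop plays the role of the `for` loop here, waiting on whichever timer fires next instead of stepping through every second.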
Uses Exponential Moving Average (EMA) for CPU and memory to smooth out spikes:
EMA = α × current + (1 - α) × previous
α = 2 / (window + 1)
where window=10 gives ~5 min smoothing at 30s intervals
Alerts only trigger after the threshold is exceeded for sustained_duration seconds.
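The smoothing and the sustained-duration rule fit in a few lines of Rust. This is a minimal sketch, not the code Argus ships: `Ema` and `SustainedAlert` are hypothetical names, timestamps are plain seconds, and it assumes the first sample seeds the average and that dropping back to or below the threshold resets the timer.

```rust
/// Exponential moving average with alpha = 2 / (window + 1).
struct Ema {
    alpha: f64,
    value: Option<f64>,
}

impl Ema {
    fn new(window: u32) -> Self {
        Ema { alpha: 2.0 / (f64::from(window) + 1.0), value: None }
    }

    /// Feeds one sample and returns the updated average.
    /// Assumption: the very first sample is used as the starting average.
    fn update(&mut self, current: f64) -> f64 {
        let next = match self.value {
            Some(previous) => self.alpha * current + (1.0 - self.alpha) * previous,
            None => current,
        };
        self.value = Some(next);
        next
    }
}

/// Tracks how long the smoothed value has stayed above the threshold.
struct SustainedAlert {
    threshold: f64,
    sustained_secs: u64,
    above_since: Option<u64>,
}

impl SustainedAlert {
    fn new(threshold: f64, sustained_secs: u64) -> Self {
        SustainedAlert { threshold, sustained_secs, above_since: None }
    }

    /// Returns true once the value has been above the threshold for at
    /// least `sustained_secs`. Any reading at or below it resets the timer.
    fn observe(&mut self, now_secs: u64, smoothed: f64) -> bool {
        if smoothed > self.threshold {
            let since = *self.above_since.get_or_insert(now_secs);
            now_secs - since >= self.sustained_secs
        } else {
            self.above_since = None;
            false
        }
    }
}

fn main() {
    let mut ema = Ema::new(10); // rolling_window = 10
    let mut alert = SustainedAlert::new(85.0, 300); // threshold, sustained_duration
    // Simulated CPU readings taken every 30 seconds.
    let samples = [40.0, 99.0, 97.0, 98.0, 96.0, 99.0];
    for (i, sample) in samples.iter().enumerate() {
        let now = i as u64 * 30;
        let smoothed = ema.update(*sample);
        println!("t={}s ema={:.1} alert={}", now, smoothed, alert.observe(now, smoothed));
    }
}
```

With window=10, α is 2/11 (about 0.18), so a single spike moves the average by less than a fifth of its size, and even a run of high readings has to persist for the full sustained_duration before an alert fires.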
Docker Compose services:
- docker compose restart
- Wait for healthy status
- If failed: docker compose down then up -d
- Alert only after all attempts fail
Systemd services:
- systemctl restart
- If failed: stop → start
- If failed: reset-failed → start
- Alert only after all attempts fail
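The escalation for systemd services can be sketched as a loop over increasingly forceful steps. This is an illustration under stated assumptions, not Argus's implementation: `Step`, `Outcome`, and `recover` are hypothetical names, the closure stands in for running the real systemctl command and then checking health, and the sketch ignores max_attempts and retry_delay_secs.

```rust
/// Escalating recovery steps for a systemd unit, in the order listed above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Step {
    Restart,
    StopStart,
    ResetFailedStart,
}

/// Result of one recovery run.
#[derive(Debug, PartialEq, Eq)]
enum Outcome {
    /// The service came back after this step, so no alert is needed.
    Recovered(Step),
    /// Every step failed, so an alert should be sent.
    AlertNeeded,
}

/// Tries each step in order and stops at the first one that succeeds.
/// `run_step` performs the step and reports whether the service is
/// healthy afterwards.
fn recover<F>(mut run_step: F) -> Outcome
where
    F: FnMut(Step) -> bool,
{
    for step in [Step::Restart, Step::StopStart, Step::ResetFailedStart] {
        if run_step(step) {
            return Outcome::Recovered(step);
        }
    }
    Outcome::AlertNeeded
}

fn main() {
    // Fake service that only comes back after a full stop and start.
    let mut tried = Vec::new();
    let outcome = recover(|step| {
        tried.push(step);
        step == Step::StopStart
    });
    println!("{:?} after trying {:?}", outcome, tried);
}
```

Passing the step runner as a closure keeps the escalation order testable without touching systemd. The Docker Compose sequence follows the same shape with its own steps (restart, then down followed by up -d).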
License: MIT