ExecOps: A governance-first, AI-powered internal operating system for SaaS founders focused on replacing manual toil with deterministic reliability.
OpsMate: A production-grade AWS cost optimization and compliance extension utilizing intelligent agents for autonomous cloud resource management and cost savings.
🚀 Quick Start & Project Overview
This monorepo contains a full-stack application demonstrating event-driven autonomous agents.
- Live Demo (link to be added): interact with the Inbox UI and see proposals in action.
- Business Impact: OpsMate agents are designed to save up to 60% on staging EC2 costs and ensure 100% IAM compliance.
```bash
# AI Service (backend: FastAPI, LangGraph, PostgreSQL)
cd ai-service
uv sync
uv run pytest tests/ -v   # run the test suite (292 passing)
PYTHONPATH=src uv run uvicorn ai_service.main:app --host 0.0.0.0 --port 8000

# Frontend (Next.js, Shadcn/UI); run from the repository root in a second terminal
cd fullstack
pnpm install
pnpm dev
```
🏗️ Architecture: The "Council of Agents"
This system utilizes a "Council of Agents" pattern, implementing a critical Human-in-the-Loop (HITL) safety mechanism: the Action-Propose Pattern. Agents generate idempotent proposals requiring explicit human approval via the Inbox UI, preventing unintended production changes while maintaining full auditability.
Webhooks → AI Service (/process_event) → Vertical Agents (Sentinel, Watchman, Hunter, Guard)
↓
[ActionProposal]
↓
Inbox UI (Frontend) → Approve/Reject (API)
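
The Action-Propose flow above can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual implementation: the `ActionProposal` fields, the `Inbox` class, and the hashing scheme are all assumptions made for the example. It shows how a deterministic key keeps repeated proposals idempotent and how every human decision lands in an audit log.

```python
from __future__ import annotations

import hashlib
import json
from dataclasses import dataclass, field
from enum import Enum


class Status(str, Enum):
    PENDING = "pending"
    APPROVED = "approved"
    REJECTED = "rejected"


@dataclass
class ActionProposal:
    # Field names are hypothetical; the real schema may differ.
    agent: str   # e.g. "hunter"
    action: str  # e.g. "delete_volume"
    target: str  # resource identifier
    params: dict = field(default_factory=dict)
    status: Status = Status.PENDING

    @property
    def key(self) -> str:
        """Deterministic key: identical proposals hash to the same value."""
        payload = json.dumps(
            [self.agent, self.action, self.target, self.params], sort_keys=True
        )
        return hashlib.sha256(payload.encode()).hexdigest()


class Inbox:
    """In-memory stand-in for the proposal store behind the Inbox UI."""

    def __init__(self) -> None:
        self._proposals: dict[str, ActionProposal] = {}
        self.audit_log: list[tuple[str, str, str]] = []  # (key, decision, reviewer)

    def submit(self, proposal: ActionProposal) -> ActionProposal:
        # Re-submitting an identical proposal returns the stored one.
        return self._proposals.setdefault(proposal.key, proposal)

    def decide(self, key: str, approve: bool, reviewer: str) -> ActionProposal:
        proposal = self._proposals[key]
        if proposal.status is not Status.PENDING:
            return proposal  # an already decided proposal is left unchanged
        proposal.status = Status.APPROVED if approve else Status.REJECTED
        self.audit_log.append((key, proposal.status.value, reviewer))
        return proposal

    def pending(self) -> list[ActionProposal]:
        return [p for p in self._proposals.values() if p.status is Status.PENDING]
```

In this sketch nothing is executed at proposal time: an agent only calls `submit`, and a change would run only after `decide(..., approve=True, ...)` records an approval.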
🤖 Vertical Agents & Business Value
The OpsMate extension focuses on high-impact, autonomous cloud governance tasks.
| Agent | Purpose & Value | Triggers |
|---|---|---|
| Sentinel | Enforces PR compliance & deployment policies; ensures auditability. | GitHub Webhook Events |
| Watchman | Designed to save up to 60% on staging EC2 costs by automatically shutting down staging environments when the team is inactive. | Scheduled (8PM) / Inactivity Monitoring |
| Hunter | Scans for unattached EBS volumes & old snapshots (>30 days); proposes cleanup for immediate cost savings. | On-demand / Scheduled Scan |
| Guard | Ensures security compliance by detecting employee departures & revoking IAM access automatically. | Slack/GitHub Membership Changes (Webhooks) |
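
As an illustration of the Hunter row, the selection logic could look roughly like the sketch below. The function names are hypothetical. The input dictionaries are assumed to follow the shape of the Boto3 EC2 `describe_volumes` and `describe_snapshots` responses (`VolumeId`, `State`, `Attachments`, `SnapshotId`, `StartTime`). The functions are kept pure so they can be tested without AWS credentials.

```python
from datetime import datetime, timedelta, timezone

# Snapshots older than this are cleanup candidates (the ">30 days" rule).
SNAPSHOT_MAX_AGE = timedelta(days=30)


def find_unattached_volumes(volumes: list[dict]) -> list[str]:
    """Return IDs of volumes in the 'available' state with no attachments."""
    return [
        v["VolumeId"]
        for v in volumes
        if v.get("State") == "available" and not v.get("Attachments")
    ]


def find_old_snapshots(snapshots: list[dict], now: datetime) -> list[str]:
    """Return IDs of snapshots whose StartTime is more than 30 days before `now`."""
    return [
        s["SnapshotId"]
        for s in snapshots
        if now - s["StartTime"] > SNAPSHOT_MAX_AGE
    ]


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    sample_volumes = [
        {"VolumeId": "vol-a", "State": "available", "Attachments": []},
        {"VolumeId": "vol-b", "State": "in-use", "Attachments": [{"InstanceId": "i-1"}]},
    ]
    print(find_unattached_volumes(sample_volumes))
```

The returned IDs would feed cleanup proposals for human review rather than being deleted directly.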
✅ Testing & Quality Assurance (MLOps Maturity)
We prioritize reliability with 292 passing unit and integration tests.
- Test Status: 🟢 292 tests passing (run with `uv run pytest ai-service/tests/ -v`)
- Note: 3 tests skipped (Neo4j not available in CI environment)
- Future MLOps: Implementing CI/CD pipelines via GitHub Actions for automated testing and deployment.
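
One way to get the "skipped when Neo4j is unavailable" behavior noted above is a conditional skip. The sketch below uses the standard-library `unittest` decorators so it has no dependencies; pytest collects `unittest.TestCase` classes as well. The `NEO4J_URI` environment variable and the test class are assumptions for illustration, not the project's actual configuration.

```python
import os
import unittest

# Hypothetical convention: graph tests run only when NEO4J_URI is set.
NEO4J_AVAILABLE = bool(os.environ.get("NEO4J_URI"))


@unittest.skipUnless(NEO4J_AVAILABLE, "Neo4j not available")
class GraphContextTests(unittest.TestCase):
    def test_connection_configured(self) -> None:
        # Placeholder check; a real test would query the graph.
        self.assertTrue(os.environ["NEO4J_URI"])
```

With this convention, an environment without the variable reports the class as skipped instead of failing.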
🧠 Tech Stack
- Agent Orchestration: LangGraph (State Machines), Multi-Agent Systems
- Backend Core: FastAPI, Python, Pydantic validation
- Data Fabric: PostgreSQL + Prisma (persistence), Redis + Celery (async tasks), Neo4j (graph context/RAG)
- Integrations: Boto3 (AWS integration), Webhooks (GitHub, Slack)
- Frontend (UI): Next.js, Shadcn/UI (for the `Inbox UI` used for approvals)