A production-ready reference architecture for building enterprise AI agent systems. Features 7 collaborating agents, 9 classifier strategies, RAG-powered document search, and full observability.
```bash
# Clone and start (one command!)
git clone git@github.com:eggai-tech/eggai-demo.git
cd eggai-demo
make start
```

Open http://localhost:8000 and start chatting.

Note: Runs completely locally with LM Studio - no cloud services or API keys required!
| Feature | Description |
|---|---|
| 7 Collaborating Agents | Triage, Billing, Claims, Policies, Escalation, Audit, Frontend |
| 9 Classifier Strategies | Compare LLM, fine-tuned, and neural network approaches (v0-v8) |
| RAG Document Search | Vespa-powered hybrid search (70% semantic + 30% keyword; see the scoring sketch below) |
| Production Patterns | Health checks, observability, message-driven architecture |
| Full Observability | Grafana dashboards, distributed tracing, metrics |
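
The 70/30 weighting is a linear blend of two scores. Here is a minimal sketch of the arithmetic, assuming already-normalized scores; it is illustrative only, since Vespa evaluates the actual ranking expression server-side:

```python
# Illustrative only: the 70% semantic + 30% keyword blend described above.
# Vespa computes the real blend in a rank profile; this just shows the math.

def hybrid_score(semantic: float, keyword: float, semantic_weight: float = 0.7) -> float:
    """Blend a semantic (vector) score with a keyword (text-match) score."""
    return semantic_weight * semantic + (1 - semantic_weight) * keyword

# A chunk that matches strongly by meaning but weakly by exact keywords:
print(hybrid_score(semantic=0.9, keyword=0.4))  # 0.75
```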
```
User (WebSocket) → Frontend Agent → Triage Agent → Specialized Agent → Response
                                         │
                                     Classifier
                           (v0-v8: pick your strategy)
```
Message Flow:

1. User sends a message via WebSocket to the Frontend agent
2. Triage classifies the intent using the configured classifier (v0-v8)
3. Triage routes the message to a specialized agent: Billing, Claims, Policies, or Escalation
4. Audit monitors all interactions for compliance
5. The response streams back to the user
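
To make the routing step concrete, here is a minimal sketch assuming a simple message envelope. `ChatMessage` and `route` are hypothetical names for illustration; the real agents exchange messages over Kafka channels (see `libraries/communication/`), and only `classifier.classify()` / `result.target_agent` mirror the actual classifier interface shown further below:

```python
# Hypothetical sketch of the triage routing step; not the project's handlers.
from dataclasses import dataclass

@dataclass
class ChatMessage:
    user_id: str
    text: str

SPECIALIZED_AGENTS = {"BillingAgent", "ClaimsAgent", "PoliciesAgent", "EscalationAgent"}

def route(message: ChatMessage, classifier) -> str:
    """Classify a user message and pick the target agent."""
    result = classifier.classify(f"User: {message.text}")
    target = result.target_agent              # e.g. "BillingAgent"
    if target not in SPECIALIZED_AGENTS:      # unknown intent -> human review
        target = "EscalationAgent"
    return target
```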
| Command | Description |
|---|---|
| `make start` | Start everything (infrastructure + agents) |
| `make stop` | Stop agents |
| `make health` | Check service health |
| `make test` | Run tests |
| `make help` | Show all commands |
| Want to... | Start here |
|---|---|
| Understand the system | docs/system-architecture.md |
| See how agents work | agents/triage/agent.py |
| Compare classifiers | agents/triage/classifiers/ |
| Add a new agent | docs/building-agents-eggai.md |
| Configure RAG search | docs/agentic-rag.md |
```
├── agents/                      # AI agents (self-contained modules)
│   ├── frontend/                # WebSocket gateway + chat UI
│   ├── triage/                  # Classification and routing
│   │   ├── agent.py             # Message handler
│   │   ├── classifiers/         # 9 classifier strategies (v0-v8)
│   │   └── dspy_modules/        # DSPy-based classifiers
│   ├── billing/                 # Payment inquiries
│   ├── claims/                  # Claim processing
│   ├── policies/                # RAG-powered policy search
│   ├── escalation/              # Complex issue handling
│   └── audit/                   # Compliance monitoring
├── libraries/                   # Shared utilities
│   ├── communication/           # Kafka channels
│   ├── observability/           # Logging, tracing, metrics
│   └── ml/                      # DSPy, MLflow integration
├── config/                      # Configuration
│   └── defaults.env             # Sensible defaults (works out of the box)
├── scripts/                     # Startup utilities
│   ├── start.py                 # One-command startup
│   ├── stop.py                  # Graceful shutdown
│   └── health_check.py          # Service health checks
└── docs/                        # Documentation
```
The triage agent supports 9 classification strategies. Select one via `TRIAGE_CLASSIFIER_VERSION`:
| Version | Type | Latency | API Call | Training |
|---|---|---|---|---|
| v0 | Minimal prompt | ~500ms | Yes | No |
| v1 | Enhanced prompt | ~600ms | Yes | No |
| v2 | COPRO optimized | ~500ms | Yes | One-time |
| v3 | Few-shot MLflow | ~50ms | No | Yes |
| v4 | Zero-shot COPRO | ~400ms | Yes | One-time |
| v5 | Attention network | ~20ms | No | Yes |
| v6 | OpenAI fine-tuned | ~300ms | Yes | Yes |
| v7 | Gemma fine-tuned | ~100ms | No | Yes |
| v8 | RoBERTa LoRA | ~50ms | No | Yes |
Default: v4, configured in `config/defaults.env`; it offers the best balance of accuracy and simplicity.
```python
# Using the unified classifier interface
from agents.triage.classifiers import get_classifier, list_classifiers

classifier = get_classifier("v4")
result = classifier.classify("User: What's my bill?")
print(result.target_agent)  # BillingAgent

# Compare all classifiers
for info in list_classifiers():
    print(f"{info.version}: {info.name}")
```

| Service | URL | Description |
|---|---|---|
| Chat UI | http://localhost:8000 | Main application |
| Redpanda Console | http://localhost:8082 | Message queue UI |
| Vespa | http://localhost:8080 | Vector search |
| Temporal UI | http://localhost:8081 | Workflow monitoring |
| MLflow | http://localhost:5001 | Experiment tracking |
| Grafana | http://localhost:3000 | Dashboards |
| Prometheus | http://localhost:9090 | Metrics |
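
`make health` (backed by `scripts/health_check.py`) verifies that these services respond. A rough sketch of the pattern, not the project's actual script:

```python
# Hedged sketch of a health probe over the service URLs above; illustrative
# only, not the project's scripts/health_check.py.
import urllib.request

SERVICES = {
    "Chat UI":    "http://localhost:8000",
    "Vespa":      "http://localhost:8080",
    "MLflow":     "http://localhost:5001",
    "Grafana":    "http://localhost:3000",
    "Prometheus": "http://localhost:9090",
}

def is_up(url: str, timeout: float = 2.0) -> bool:
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status < 500
    except OSError:
        return False

for name, url in SERVICES.items():
    print(f"{name:12} {url:28} {'OK' if is_up(url) else 'DOWN'}")
```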
Configuration uses a 3-layer approach:
1. `config/defaults.env` - Sensible defaults (committed, works out of the box)
2. `.env` - Local overrides (gitignored)
3. Environment variables - Runtime overrides
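
The precedence is "later layers win". Below is a hedged sketch of how such layering resolves; it illustrates the idea rather than the project's actual loader:

```python
# Illustrative 3-layer precedence: defaults.env < .env < process environment.
# Not the project's actual config loader.
import os

def parse_env_file(path: str) -> dict[str, str]:
    """Read KEY=VALUE lines, skipping comments; missing files are fine."""
    values: dict[str, str] = {}
    try:
        with open(path) as f:
            for line in f:
                line = line.strip()
                if line and not line.startswith("#") and "=" in line:
                    key, _, value = line.partition("=")
                    values[key.strip()] = value.strip()
    except FileNotFoundError:
        pass  # each layer is optional
    return values

config: dict[str, str] = {}
config.update(parse_env_file("config/defaults.env"))  # layer 1: committed defaults
config.update(parse_env_file(".env"))                 # layer 2: local overrides
config.update(os.environ)                             # layer 3: runtime wins

print(config.get("TRIAGE_CLASSIFIER_VERSION", "v4"))
```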
Key settings:
```bash
# Classifier selection
TRIAGE_CLASSIFIER_VERSION=v4

# LLM provider (default: local LM Studio)
TRIAGE_LANGUAGE_MODEL=lm_studio/gemma-3-12b-it-qat

# Or use OpenAI
# TRIAGE_LANGUAGE_MODEL=openai/gpt-4o-mini
# OPENAI_API_KEY=sk-...
```

- Python 3.11+
- Docker and Docker Compose
- uv (recommended) or pip
- LM Studio (for local models) or OpenAI API key
```bash
# Run tests
make test

# Run with coverage
make test-coverage

# Lint code
make lint

# Auto-fix lint issues
make lint-fix

# Full reset (removes all data)
make full-reset
```

- System Architecture
- Agent Capabilities
- Multi-Agent Communication
- Building Agents
- RAG with Vespa
- Classifier Guide
- Deployment Guide
License: MIT
