# A production-minded reference implementation for A2A-style agent systems
This repository is a learning-first, production-minded reference implementation of an agent system built using explicit contracts, deterministic routing, guardrails, and cost-aware LLM usage.
It exists to answer a simple but under-documented question:
What does a real, inspectable, production-style agent system actually look like?
Rather than relying on opaque abstractions or "fully autonomous" claims, this project focuses on:
- clear agent boundaries
- explicit intent routing
- deterministic fallbacks
- observable behavior
- realistic operational constraints
It is not a toy demo, but it is also not a finished product. It's a system being built step by step, in public, with trade-offs documented along the way.
This project includes a complete AI-assisted development toolkit: a practical implementation of Claude Code Creator best practices that helps you build better software faster.
Stop guessing how to work with AI. Use proven patterns:
```shell
# See all available patterns
make show-prompts
```

**Verification Patterns** help you prove your code works before merging:
- **Prove It Works**: demand concrete evidence (tests, benchmarks, edge cases)
- **Show Me The Tests**: verify test coverage and quality
- **Code Review Checklist**: self-review against standards
- **Compare Branches**: verify behavior changes between branches
**Debugging Patterns** help you find and fix issues systematically:
- **Root Cause Analysis**: investigate bugs methodically
- **Rubber Duck Debug**: explain the problem to find the solution
- **Performance Optimization**: measure, optimize, verify
**Planning Patterns** help you design before you code:
- **Detailed Specification**: turn vague ideas into clear specs
- **Architecture Decision**: document technical choices
- **Break Down Complex Task**: split large tasks into steps
Built-in scripts that analyze your codebase:
```shell
# Find technical debt
make techdebt

# Aggregate project context for AI
make context

# Automated code review
make review

# Find performance bottlenecks
make optimize

# Security vulnerability scanning
make security-scan
```

Intelligent debugging that extracts and analyzes errors:
```shell
# Extract and analyze the last error from logs
make debug-last-error

# Analyze failing tests with fix suggestions
make debug-failing-tests

# Debug Docker containers with error correlation
make docker-debug
```

Example output:
```markdown
## Error Analysis
Pattern: ModuleNotFoundError
Category: dependency
Severity: high

## Root Cause
Missing Python package in environment

## Suggested Solutions
1. Install missing package: pip install <package>
2. Add to requirements.txt
3. Rebuild Docker container if needed
```

The framework learns from your mistakes:
```shell
# Capture a learning after fixing a bug
make update-rules MSG="Always test Docker builds before committing"

# Validate compliance with development rules
make validate-rules
```

Learnings are automatically:
- Stored in `docs/lessons-learned/`
- Indexed for easy discovery
- Applied to future development
Traditional development:
- Vague prompts → inconsistent results
- Manual debugging → slow iteration
- Repeated mistakes → wasted time
- No code quality checks → technical debt
With this toolkit:
- Proven patterns → predictable results
- Automated analysis → fast debugging
- Captured learnings → continuous improvement
- Built-in quality checks → cleaner code
```shell
# Clone and explore
git clone https://github.com/jharjadi/procode-agent-framework.git
cd procode-agent-framework

# See available development tools
make help

# View prompt patterns
make show-prompts

# Run code quality analysis
make techdebt
```

Learn more:
- Complete Prompt Patterns Library (21 patterns)
- Architect Mode Templates (6 templates)
- Code Mode Templates (7 templates)
- Development Rules (complete guide)
To set expectations clearly, this project does not attempt to solve:
- long-horizon autonomous planning
- unsupervised execution across critical systems
- regulatory approval or legal compliance
- trust between unknown or adversarial agents
- "AGI-style" general intelligence
Those are real problems, and they are intentionally out of scope here.
This project is guided by a few core principles:
- Agents communicate via explicit schemas and agent cards.
- LLMs enhance the system but never replace inspectable logic.
- Model choice, routing, and fallbacks are designed with real budgets in mind.
- Every request is auditable, and every decision can be traced.
- External agents run as independent services, not hidden classes.
You can run the entire system locally in about a minute.
```shell
git clone https://github.com/jharjadi/procode-agent-framework.git
cd procode-agent-framework
cp .env.example .env
docker-compose up -d
```

This starts the following services:

- Principal Agent API: http://localhost:9998
- Frontend UI (Next.js): http://localhost:3001
- PostgreSQL: localhost:5433
- Weather Agent (external): http://localhost:9996
- Insurance Agent (external): http://localhost:9997
The system works out of the box with deterministic routing.
Adding an LLM API key enables enhanced intent classification.
Each agent in this system is described by an agent card, a declarative contract that defines:
- agent identity
- capabilities
- supported intents
- routing expectations
- security and operational constraints
Example:

```yaml
agent:
  name: insurance_agent
  role: principal
  version: 1.0.0

capabilities:
  info:
    intents: ["get", "check", "quote", "coverage"]
  creation:
    intents: ["create", "update", "cancel"]
```

This allows agents to be:
- independently deployable
- discoverable
- replaceable
- reasoned about without reading implementation code
This repository is being built incrementally.
The system is currently at Step 11 of 25 in a documented production roadmap.
- Principal agent with deterministic routing
- LLM-assisted intent classification with fallback logic
- Multi-turn conversation memory
- Server-Sent Events (SSE) streaming
- Tool integration with mocked / real execution modes
- Weather Agent (OpenWeatherMap API, caching, standalone service)
- Insurance Agent (principal + task-agent pattern)
- SQLAlchemy ORM with SQLite / PostgreSQL
- Alembic migrations
- Full audit trail (database + file logging)
- Conversation history persistence
- Optional enterprise-style API key system
- Rate limiting (per-IP / per-key)
- PII detection and redaction
- CORS restrictions
- Circuit breaker patterns
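As an illustration of the last item, here is a minimal circuit-breaker sketch for calls to external agents; the class name, thresholds, and behavior are assumptions, not the repository's actual implementation:

```python
# Minimal circuit-breaker sketch (illustrative, not the repo's real code):
# after N consecutive failures the circuit opens and rejects calls fast,
# then allows a single trial call once the reset timeout elapses.
import time

class CircuitBreaker:
    def __init__(self, failure_threshold: int = 3, reset_timeout: float = 30.0):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.failures = 0
        self.opened_at: float | None = None

    def call(self, fn, *args, **kwargs):
        # While open, fail fast until the reset timeout elapses.
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_timeout:
                raise RuntimeError("circuit open: external agent unavailable")
            self.opened_at = None  # half-open: permit one trial call
            self.failures = 0
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = time.monotonic()
            raise
        self.failures = 0  # any success resets the failure count
        return result
```

Failing fast this way keeps a dead Weather or Insurance Agent from stalling every request in the principal agent while it waits on timeouts.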
One of the goals of this project is to treat LLM cost as a design constraint, not an afterthought.
The system supports:
- multiple LLM providers (Anthropic, OpenAI, Google)
- lightweight models for simple intents
- higher-capability models only when needed
- deterministic routing when LLMs are unavailable
A detailed breakdown of the cost strategy is available here:
Cost Optimization Summary
```
Principal Agent
│
├── Intent Classification (LLM + deterministic fallback)
├── Task Routing
├── Guardrails & Audit Logging
│
├── Internal Task Agents
│   ├── Tickets
│   ├── Account
│   ├── Payments
│   └── General
│
└── External Agents (A2A)
    ├── Weather Agent (standalone service)
    └── Insurance Agent (principal + task agents)
```
Each external agent runs as its own process, communicates over A2A, and can be replaced independently.
All behavior is driven via environment variables.
LLM usage is optional.
```shell
# LLMs (optional)
ANTHROPIC_API_KEY=
OPENAI_API_KEY=
GOOGLE_API_KEY=

# Database
USE_DATABASE=true
DATABASE_URL=postgresql://user:pass@localhost:5433/procode

# Security (optional)
ENABLE_API_KEY_AUTH=false
ENABLE_API_SECURITY=false
RATE_LIMIT_PER_MINUTE=10

# External agents
OPENWEATHER_API_KEY=
EXTERNAL_AGENTS_CONFIG=config/external_agents.production.json
```

See .env.example for the full list.
```shell
make test-all
```

Includes:
- unit tests
- LLM integration tests
- streaming tests
- agent-to-agent communication tests
- database persistence tests
Done:
- Database persistence
- API authentication
- API security & rate limiting
- External agents system

Planned:
- Redis caching (next)
- Horizontal scaling
- Message queues
- Observability
- Vector search
- RAG
- Model optimization
- Multi-tenancy
- Billing & usage tracking
- Admin UI
- CI/CD
- Deployment guides
Full roadmap: Production Roadmap
This project is evolving quickly, and some APIs are still in flux.
That said:
- issues for discussion are welcome
- forks are encouraged
- patterns and ideas are free to reuse elsewhere
Once the core architecture stabilises, contributions will open up more formally.
MIT License: use freely, modify responsibly.
Copyright (c) 2026 Jimmy Harjadi
This repository is not trying to predict the future of AI agents.
It's trying to make the present less confusing by showing:
- what works
- what breaks
- what trade-offs exist
- and where human judgment is still required
If it helps you reason more clearly about agent systems, then it's done its job.
Built with expertise in: AI/ML Engineering, Solution Architecture, Production Systems, Cost Optimization, Security & Compliance
Questions? Check the documentation or review the development history for context.