diff --git a/.gitignore b/.gitignore index 6fe18a3..1b28afc 100644 --- a/.gitignore +++ b/.gitignore @@ -38,3 +38,11 @@ cookies.txt test_cookies.txt *_cookies.txt + +# Multi-Agent Workflow +metrics.json +quality_audit_report.txt +merge_order.txt +merge_order.json +DASHBOARD.md +.coverage.json diff --git a/AGENT_LEARNINGS b/AGENT_LEARNINGS new file mode 120000 index 0000000..0d5b28e --- /dev/null +++ b/AGENT_LEARNINGS @@ -0,0 +1 @@ +/Users/dp/Projects/multi-agent-workflow/AGENT_LEARNINGS \ No newline at end of file diff --git a/Claude.md b/Claude.md new file mode 100644 index 0000000..0bc97e1 --- /dev/null +++ b/Claude.md @@ -0,0 +1,263 @@ +# CLAUDE.md +*Personal development context for DP - Marine Engineer & AI Experimentalist* + +## About Me +- **Background**: Chief Engineer, 16+ years maritime engineering, transitioning to AI/infrastructure roles +- **Development Philosophy**: "Vibe coding" - AI-generated code with human oversight, rapid iteration +- **Learning Style**: "Serial master" - deep dive into new domains, achieve competency quickly +- **Current Focus**: Building GitHub portfolio, experimenting with AI platforms, exploring multi-agent systems +- **Constraint Awareness**: Often work in bandwidth-limited environments (offshore), value offline-capable solutions + +## Communication Preferences + +### How to Talk to Me +- **Tone**: Direct and concise. Skip the preambles and "I understand" statements +- **Explanations**: Use engineering analogies when helpful (I think in systems/components) +- **Uncertainty**: Just tell me when you don't know or need clarification +- **Code-first**: Show me code, not lengthy descriptions of what code would do +- **Iterate fast**: I'd rather see a working prototype than wait for a perfect solution + +### What Annoys Me +- ❌ Over-explaining basic concepts (I pick things up quickly) +- ❌ Asking permission for every small decision (be autonomous) +- ❌ Apologizing repeatedly or hedging excessively +- ❌ Long wind-ups before getting to the point +- ❌ "As an AI language model..." disclaimers + +### What I Appreciate +- ✅ Proactive suggestions based on context +- ✅ Pointing out potential issues before I hit them +- ✅ Teaching me better patterns when you see inefficient approaches +- ✅ Referencing my past work/projects when relevant +- ✅ Adapting to my terminology and mental models + +## Code Style & Standards + +### General Principles +- **Readability > Cleverness**: Code should be self-documenting +- **Pragmatic not Perfect**: Working code beats theoretical perfection +- **Explicit > Implicit**: Be obvious about intent, especially in config +- **Type Safety**: Use type hints/annotations in typed languages +- **Error Handling**: Specific exceptions with context, never fail silently + +### Language Preferences +- **Python**: Primary language for AI/scripting work + - Type hints always (mypy-compatible) + - f-strings for string formatting + - Dataclasses for structured data + - Black/Ruff for formatting + +- **JavaScript/TypeScript**: Web/Node projects + - Prefer TypeScript for anything non-trivial + - ESLint + Prettier + - Functional style when appropriate + +- **Swift**: iOS development + - SwiftUI for UI + - Combine for reactive patterns + - Follow Apple's naming conventions + +### Git Practices +- **Commits**: Conventional commits (`feat:`, `fix:`, `docs:`, `refactor:`, `chore:`) +- **Messages**: Descriptive but concise (<72 chars in summary line) +- **Attribution**: NEVER mention AI tools in commit messages (no "Generated with Claude Code") +- **Frequency**: Commit often, in logical chunks +- **Branches**: Feature branches for non-trivial work + +## Project Setup Patterns + +### When Starting New Projects +1. Always create comprehensive README with: + - What it does (one sentence) + - Installation instructions + - Usage examples + - Development setup +2. Include `.gitignore` appropriate for tech stack +3. Set up basic linting/formatting from day one +4. Add LICENSE file (default: MIT unless specified) +5. Structure follows convention for that language/framework + +### Directory Structure Philosophy +- **Flat when possible**: Don't over-nest directories prematurely +- **Domain-driven**: Group by feature/domain, not by file type +- **Tests alongside**: Tests live near the code they test +- **Clear separation**: Config, source, tests, docs, scripts clearly separated + +## Development Workflow + +### Problem-Solving Approach +1. **Understand**: Ask clarifying questions if requirements are ambiguous +2. **Research**: Check documentation/examples when using new libraries +3. **Prototype**: Get something working first, optimize later +4. **Test**: Include basic tests for non-trivial functionality +5. **Document**: Update README/comments for non-obvious choices + +### When Things Break +- Show me the error message first +- Explain what you think is happening +- Suggest 2-3 possible fixes, with your recommendation +- Implement the fix (or ask which direction I prefer if genuinely ambiguous) + +### Code Reviews +When reviewing my code or suggesting improvements: +- Point out bugs/security issues immediately +- Suggest better patterns diplomatically ("Consider..." not "You should...") +- Explain WHY your suggestion is better (performance? maintainability? convention?) +- Provide code examples for alternatives + +## Domain Knowledge + +### Marine Engineering Context +- Familiar with: Systems thinking, failure modes, redundancy, safety margins +- Mental models: Components, flows, control loops, cascading failures +- Language: Use engineering terminology freely (I'll ask if something is unfamiliar) +- Analogies: Engineering analogies help me grasp new concepts quickly + +### AI/ML Understanding +- Comfortable with: API integration, prompt engineering, RAG patterns, vector databases +- Experimenting with: Multi-agent systems, autonomous workflows, AI-assisted development +- Learning: Infrastructure/deployment patterns, production ML systems +- Tools: OpenAI API, Anthropic, local models (Ollama), various AI platforms + +## Tool & Platform Preferences + +### Development Environment +- **macOS** primary platform +- **Terminal-first**: Comfortable with CLI tools +- **VS Code**: Primary editor (but adaptable) +- **Desktop Commander MCP**: Available for file operations + +### AI Tools in Stack +- Claude (you!) for coding assistance +- OpenAI API for projects +- Local models (Ollama) for experimentation +- Various AI platforms for comparison/testing + +### Deployment/Hosting +- Prefer: Vercel, Railway, Fly.io for simplicity +- Comfortable with: Docker, basic DevOps +- Learning: K8s, larger-scale infrastructure + +## Project Categories & Approaches + +### Quick Experiments +- Speed > perfection +- Minimal setup, maximum learning +- Document decisions in README +- No tests required unless it's tricky logic + +### Portfolio Projects +- Clean, documented code +- README that sells the project +- Basic tests for core functionality +- Deploy somewhere live if applicable + +### Production/Serious Tools +- Comprehensive tests (>80% coverage on critical paths) +- Error handling and logging +- Documentation (both user and developer) +- CI/CD pipeline +- Monitoring/observability considerations + +## Special Instructions + +### For File Operations +- Always use absolute paths when possible +- Watch for offshore/low-bandwidth constraints if relevant +- Consider offline-first design for maritime tools +- Batch operations when possible (API calls, file writes) + +### For Multi-Agent/AI Systems +- **Coordinator pattern**: Main agent orchestrates, sub-agents specialize +- **Explicit context**: Pass context explicitly, don't rely on shared state +- **Memory**: Structured (JSON/DB), version-controlled when possible +- **Prompts**: Store as separate files, not hardcoded strings + +### For Documentation +- **README**: Target someone seeing the project for the first time +- **Comments**: Explain WHY, not WHAT (code shows what) +- **Inline docs**: For public APIs and non-obvious algorithms +- **Architecture**: Diagram or describe key design decisions + +## Common Tasks - Quick Reference + +### Python Project Init +```bash +# Create structure +mkdir -p src tests docs +touch README.md requirements.txt .gitignore +echo "# Project Name" > README.md + +# Virtual env +python -m venv venv +source venv/bin/activate # or: . venv/bin/activate + +# Dev tools +pip install ruff mypy pytest +``` + +### Node/TypeScript Project Init +```bash +npm init -y +npm install -D typescript @types/node tsx +npm install -D eslint prettier +npx tsc --init +``` + +### Git Init +```bash +git init +git add . +git commit -m "chore: initial commit" +gh repo create # if using GitHub CLI +``` + +## Anti-Patterns to Avoid +- ❌ Overengineering simple problems +- ❌ Premature optimization +- ❌ Magic numbers (use named constants) +- ❌ God classes/functions (keep focused) +- ❌ Ignoring errors silently +- ❌ Committing secrets/credentials +- ❌ Leaving commented-out code +- ❌ Copy-pasting without understanding + +## When You're Done +Just say "Done." or "Complete." - no need for summaries unless specifically requested. + +If the task is complex/multi-step, show progress markers: +``` +✓ Created directory structure +✓ Set up configuration +⚠ Tests incomplete (needs manual review) +→ Next: Deploy to staging +``` + +## Context Management + +### Things to Remember +- I work on multiple projects simultaneously +- I may reference past conversations/work +- I value consistent patterns across my projects +- I'm building toward career transition (engineering → AI/infrastructure) + +### When to Ask for Clarification +- Ambiguous requirements (especially for new projects) +- Choice between equally valid approaches +- Potential security/safety implications +- Deployment/infrastructure decisions +- Breaking changes to existing work + +### When to Just Decide +- Code formatting details (use standard for language) +- Naming variables/functions (use conventions) +- File/directory structure (follow patterns above) +- Implementation details within clear requirements +- Refactoring opportunities (just do them) + +--- + +*This file lives at `~/.claude/CLAUDE.md` and applies to all projects unless overridden by project-specific `CLAUDE.md` in the repo root.* + +**Project-specific overrides**: Create `CLAUDE.md` in project root for project-specific context. +**Local working notes**: Use `CLAUDE.local.md` (git-ignored) for temporary context. \ No newline at end of file diff --git a/MULTI_AGENT_WORKFLOW_GUIDE.md b/MULTI_AGENT_WORKFLOW_GUIDE.md new file mode 100644 index 0000000..1ce9130 --- /dev/null +++ b/MULTI_AGENT_WORKFLOW_GUIDE.md @@ -0,0 +1,1114 @@ +# Multi-Agent Development Workflow: A Meta-Pattern Guide + +**Version**: 1.0 +**Last Updated**: 2025-11-17 +**Source Project**: Agent-Lab + +--- + +## 📋 Table of Contents + +1. [Overview](#overview) +2. [The Meta-Pattern](#the-meta-pattern) +3. [When to Use This Approach](#when-to-use-this-approach) +4. [Architecture of Agent Teams](#architecture-of-agent-teams) +5. [Role Templates](#role-templates) +6. [Coordination Mechanisms](#coordination-mechanisms) +7. [Implementation Guide](#implementation-guide) +8. [Best Practices](#best-practices) +9. [Prompts & Templates](#prompts--templates) +10. [Troubleshooting](#troubleshooting) +11. [Case Study: Agent-Lab](#case-study-agent-lab) + +--- + +## Overview + +This guide documents a **meta-development pattern**: using multiple specialized AI agents to collaboratively build software. Instead of a single AI assistant, you deploy a team of AI agents, each with specific expertise and responsibilities. + +### Key Insight +Just as human software teams benefit from specialization (backend dev, frontend dev, QA, etc.), AI agent teams can work more effectively when given focused roles with clear boundaries. + +--- + +## The Meta-Pattern + +### Core Concept + +``` +Traditional Approach: Multi-Agent Approach: +┌─────────────────┐ ┌──────────────────────────┐ +│ One AI Agent │ │ Specialized Team │ +│ Does All Work │ │ ┌────────────────────┐ │ +│ │ │ │ Backend Engineer │ │ +│ • Backend │ vs │ │ Agent Developer │ │ +│ • Frontend │ │ │ CLI Engineer │ │ +│ • Testing │ │ │ QA Engineer │ │ +│ • Docs │ │ │ Technical Writer │ │ +│ • ... │ │ └────────────────────┘ │ +└─────────────────┘ └──────────────────────────┘ +``` + +### Advantages + +1. **Parallel Execution**: Multiple agents work simultaneously +2. **Deep Expertise**: Each agent maintains context in their domain +3. **Clear Boundaries**: Reduces conflicts and confusion +4. **Natural Handoffs**: Integration points are explicit +5. **Maintainable Prompts**: Shorter, focused role definitions +6. **Scalable**: Add agents as needed + +### Disadvantages + +1. **Coordination Overhead**: Requires structured communication +2. **Integration Complexity**: Agents must align their outputs +3. **Setup Time**: Initial role definition takes effort +4. **Resource Usage**: More AI conversations running + +--- + +## When to Use This Approach + +### Good Fits ✅ + +- **Medium to Large Projects** (>5,000 lines of code) +- **Clear Domain Separation** (backend/frontend, core/UI) +- **Long-Term Development** (weeks to months) +- **Multiple Subsystems** that can be built independently +- **High Quality Requirements** (need testing, docs, reviews) +- **Projects with Distinct Phases** (foundation → features → polish) + +### Poor Fits ❌ + +- **Small Scripts** (<500 lines) +- **Quick Prototypes** (done in hours) +- **Single-Developer Projects** with tight coupling +- **Exploratory Work** where requirements are unclear +- **Simple CRUD Applications** without complexity + +### Decision Framework + +Ask yourself: +1. Can I divide work into 3+ independent workstreams? +2. Will development take more than 1 week? +3. Do I need parallel progress on multiple fronts? +4. Is quality (tests, docs) as important as features? + +If 3+ answers are "yes", consider the multi-agent approach. + +--- + +## Architecture of Agent Teams + +### Standard 5-Agent Team (Recommended Baseline) + +``` +┌─────────────────────────────────────────────────────┐ +│ Project Goal │ +└─────────────────────────────────────────────────────┘ + │ + ┌──────────────────┼──────────────────┐ + │ │ │ + ▼ ▼ ▼ +┌─────────────┐ ┌─────────────┐ ┌─────────────┐ +│ Backend │ │ Feature │ │ Testing │ +│ Engineer │ │ Developer │ │ Engineer │ +│ │ │ │ │ │ +│ Core infra │ │ Business │ │ Test suite │ +│ APIs │ │ logic │ │ Quality │ +└─────────────┘ └─────────────┘ └─────────────┘ + │ │ │ + └──────────────────┼──────────────────┘ + │ + ┌──────────────────┴──────────────────┐ + │ │ + ▼ ▼ +┌─────────────┐ ┌─────────────┐ +│ Interface │ │ Technical │ +│ Engineer │ │ Writer │ +│ │ │ │ +│ CLI/UI │ │ Docs │ +│ UX │ │ Examples │ +└─────────────┘ └─────────────┘ +``` + +### Role Descriptions + +#### 1. Backend/Infrastructure Engineer +**Builds**: Core systems, APIs, data models +**Outputs**: Infrastructure code, utilities, core libraries +**Dependencies**: None (starts first) +**Typical files**: `core/`, `models/`, `utils/`, `db/` + +#### 2. Feature/Domain Developer +**Builds**: Business logic, domain-specific code +**Outputs**: Features, algorithms, workflows +**Dependencies**: Backend APIs +**Typical files**: `agents/`, `services/`, `business/` + +#### 3. Interface Engineer +**Builds**: User-facing interfaces (CLI, GUI, API) +**Outputs**: Commands, UI components, endpoints +**Dependencies**: Feature APIs +**Typical files**: `cli/`, `ui/`, `api/routes/` + +#### 4. QA/Testing Engineer +**Builds**: Test suites, quality infrastructure +**Outputs**: Unit tests, integration tests, CI/CD +**Dependencies**: All code (tests everything) +**Typical files**: `tests/`, `.github/workflows/` + +#### 5. Technical Writer +**Builds**: Documentation, examples, guides +**Outputs**: Docs, tutorials, API references +**Dependencies**: All code (documents everything) +**Typical files**: `docs/`, `examples/`, `CONTRIBUTING.md` + +### Alternative Configurations + +#### 3-Agent Team (Small Projects) +- **Core Developer** (backend + features) +- **Interface Developer** (UI/CLI) +- **Quality Engineer** (tests + docs) + +#### 7-Agent Team (Large Projects) +- **Infrastructure Engineer** (DevOps, deployment) +- **Backend Engineer** (APIs, data) +- **Domain Expert 1** (e.g., agent implementations) +- **Domain Expert 2** (e.g., evaluation systems) +- **Frontend Engineer** (UI) +- **QA Engineer** (testing) +- **Technical Writer** (docs) + +#### 10-Agent Team (Enterprise Scale) +Add: Security Engineer, Performance Engineer, Database Specialist + +--- + +## Role Templates + +### Template 1: Backend Engineer + +```markdown +# Role: Backend Engineer + +## Identity +You are the Backend Engineer for [PROJECT_NAME]. You build core infrastructure. + +## Current State +- ✅ [What exists] +- 🔄 [What's in progress] +- ❌ [What's missing] + +## Your Mission +Build the foundational systems that other agents depend on. + +## Priority Tasks +1. **Task 1** - [Description] + - File: `path/to/file.py` + - APIs: [List key functions/classes] + - Dependencies: [What you need first] + +2. **Task 2** - [Description] + - [Details] + +## Integration Points +- **Your code is used by**: [List dependent agents] +- **You depend on**: [List dependencies] +- **Shared interfaces**: [List APIs you provide] + +## Success Criteria +- [ ] [Specific testable outcome 1] +- [ ] [Specific testable outcome 2] +- [ ] All functions have docstrings +- [ ] Unit tests achieve 80%+ coverage +- [ ] Code follows project style guide + +## Constraints +- All code in `[directory]` +- Use Python 3.11+ features +- No external services without approval +- Log all operations to `[log_file]` + +## Getting Started +1. Read `[existing_file.py]` to understand current state +2. Implement `[first_function]` in `[target_file.py]` +3. Write tests in `tests/unit/test_[module].py` +4. Document APIs in docstrings +5. Post daily progress to `daily_logs/` + +## Example Code Structure +[Include pseudocode or skeleton code] + +## Questions? +Post to `questions.md` or ask the project coordinator. +``` + +### Template 2: Feature Developer + +```markdown +# Role: [Domain] Developer + +## Identity +You are the [Domain] Developer for [PROJECT_NAME]. You implement [specific features]. + +## Current State +- Existing: [List what's built] +- Needed: [List what's missing] + +## Your Mission +Implement [feature set] using [core infrastructure]. + +## Priority Tasks +1. **[Feature 1]** - [Description] + - Depends on: [Backend API] + - Provides: [Public interface] + - File: `[path]` + +2. **[Feature 2]** - [Description] + +## Integration Points +- **Uses**: [Backend APIs, external libraries] +- **Provides**: [Public functions/classes] +- **Communicates with**: [Other agents] + +## Success Criteria +- [ ] [Feature 1] works end-to-end +- [ ] [Feature 2] passes acceptance tests +- [ ] All edge cases handled +- [ ] Examples provided in docs + +## Phase Breakdown +### Phase 1: Foundation +- Build [core component] +- Test basic functionality + +### Phase 2: Integration +- Connect to [backend system] +- Handle errors gracefully + +### Phase 3: Polish +- Optimize performance +- Add logging and monitoring + +## Example Usage +[Show how your code will be used] +``` + +### Template 3: Interface Engineer (CLI) + +```markdown +# Role: CLI Engineer + +## Identity +You are the CLI Engineer for [PROJECT_NAME]. You build the command-line interface. + +## Current State +- Existing commands: [list] +- Needed commands: [list] + +## Your Mission +Create an intuitive, powerful CLI using [framework]. + +## Priority Commands +1. **`[command]` command** - [What it does] + - Usage: `[project] [command] [args]` + - Implementation: Use [backend API] + - Output: [Format, styling] + +## CLI Design Principles +- **Intuitive**: Common tasks are easy +- **Informative**: Clear progress indicators +- **Safe**: Confirm destructive operations +- **Pretty**: Use colors, tables, progress bars + +## Success Criteria +- [ ] All commands work without errors +- [ ] Help text is clear and complete +- [ ] Interactive prompts for missing args +- [ ] Error messages are helpful + +## Technical Details +- Framework: [Typer, Click, argparse] +- Output formatting: [Rich, colorama] +- Config: [Where config is loaded from] + +## Example Commands +[Show example usage with output] +``` + +### Template 4: QA Engineer + +```markdown +# Role: QA Engineer + +## Identity +You are the QA Engineer for [PROJECT_NAME]. You ensure quality through testing. + +## Current State +- Test coverage: [X]% +- Test files: [count] +- Missing tests: [list areas] + +## Your Mission +Achieve comprehensive test coverage and prevent regressions. + +## Priority Tasks +1. **Unit Tests** - Test individual components + - Target: 80%+ coverage + - Files: `tests/unit/test_*.py` + +2. **Integration Tests** - Test component interaction + - Scenarios: [list key workflows] + +3. **E2E Tests** - Test full user journeys + - Commands: [list CLI commands to test] + +## Test Strategy +- **AAA Pattern**: Arrange, Act, Assert +- **Mock external dependencies**: No real API calls +- **Fast**: Unit tests < 1s each +- **Isolated**: Tests don't depend on each other + +## Success Criteria +- [ ] 80%+ code coverage +- [ ] All tests pass +- [ ] CI/CD pipeline configured +- [ ] Test documentation exists + +## Test Fixtures (Shared) +Create in `tests/conftest.py`: +- `tmp_workspace`: Temporary directory +- `sample_[object]`: Test data +- `mock_[service]`: Mocked dependencies +``` + +### Template 5: Technical Writer + +```markdown +# Role: Technical Writer + +## Identity +You are the Technical Writer for [PROJECT_NAME]. You create clear, helpful documentation. + +## Current State +- Existing docs: [list] +- Missing docs: [list] + +## Your Mission +Enable users and contributors through excellent documentation. + +## Priority Deliverables +1. **Getting Started Guide** - `docs/getting_started.md` + - Installation + - First example + - Troubleshooting + +2. **Tutorials** - `docs/tutorials/` + - [Tutorial 1]: [topic] + - [Tutorial 2]: [topic] + +3. **API Documentation** - `docs/api/` + - Auto-generated from docstrings + - Usage examples + +4. **Contributing Guide** - `CONTRIBUTING.md` + - Code style + - Git workflow + - Testing requirements + +## Documentation Standards +- **Clear**: Written for target audience +- **Complete**: Cover all features +- **Current**: Updated with code changes +- **Tested**: All examples work + +## Success Criteria +- [ ] New users can get started in < 10 minutes +- [ ] All public APIs documented +- [ ] 3+ tutorials exist +- [ ] Contributing guide complete +``` + +--- + +## Coordination Mechanisms + +### 1. Git Workflow + +Each agent works in their own branch: + +```bash +# Branch structure +main +├── backend-infrastructure # Agent 1 +├── feature-implementation # Agent 2 +├── interface-cli # Agent 3 +├── test-suite # Agent 4 +└── documentation # Agent 5 +``` + +**Merge Policy**: +- Tests must pass +- Code review by coordinator +- Documentation updated +- No merge conflicts + +### 2. Daily Progress Logs + +**Location**: `AGENT_PROMPTS/daily_logs/YYYY-MM-DD.md` + +**Format**: +```markdown +## [Agent Name] - [Date] + +### Completed Today +- Implemented AgentRuntime.execute() +- Added 15 unit tests +- Fixed memory leak in loader + +### In Progress +- Working on timeout handling +- Need to test edge cases + +### Blockers +- Waiting for API spec from Agent 2 +- Question about error handling strategy + +### Next Steps +- Complete timeout implementation +- Add integration tests +- Document API +``` + +### 3. Integration Points Document + +**Location**: `AGENT_PROMPTS/COORDINATION.md` + +```markdown +## Integration Points + +### Backend → Feature Developer +- **API**: `AgentRuntime.execute(spec, inputs) -> result` +- **Status**: ✅ Complete +- **Location**: `src/core/agent_runtime.py` + +### Feature → Interface +- **API**: `LabDirector.create_agent(goal) -> agent_spec` +- **Status**: 🔄 In progress +- **ETA**: Nov 18 + +### All → QA +- All modules must have: + - Docstrings + - Type hints + - Unit tests + +### All → Docs +- Update docs before merging: + - API reference + - Examples + - Changelog +``` + +### 4. Questions & Answers + +**Location**: `AGENT_PROMPTS/questions.md` + +```markdown +## [Agent Name] - [Date] +**Question**: Should I use async/await for all API calls? + +**Context**: Some calls are fast (<100ms), others slow (>5s) + +**Blocking**: No, but affects architecture decisions + +--- + +## [Another Agent] - [Date] +**Answer**: Use async for >1s operations. Sync is fine for quick calls. +Keep interface consistent - return Futures that can be awaited. + +**Reference**: See `src/core/async_patterns.py` for examples +``` + +### 5. Phase Gates + +Define clear completion criteria for each phase: + +```markdown +## Phase 1: Foundation + +### Complete When: +- [ ] Backend: AgentRuntime works, 80% test coverage +- [ ] Feature: LabDirector + Architect implemented +- [ ] Interface: `create` command works end-to-end +- [ ] QA: 50+ unit tests, all passing +- [ ] Docs: Getting started guide complete + +### Demo: +$ project-cli create "example goal" +[Works without errors] +``` + +--- + +## Implementation Guide + +### Step 1: Project Analysis + +Before deploying agents, analyze your project: + +```markdown +## Project Analysis Checklist + +### Size & Scope +- [ ] Estimated lines of code: _______ +- [ ] Development timeline: _______ +- [ ] Number of subsystems: _______ + +### Decomposition +Can the work be split into: +- [ ] Core infrastructure +- [ ] Business logic / features +- [ ] User interface +- [ ] Testing +- [ ] Documentation + +### Dependencies +Map dependencies between components: +[Create dependency diagram] + +### Success Metrics +- [ ] How will we know when each phase is complete? +- [ ] What are the acceptance criteria? +``` + +### Step 2: Role Definition + +For each agent, create a prompt file: + +``` +project/ +├── AGENT_PROMPTS/ +│ ├── README.md # Overview +│ ├── COORDINATION.md # How agents work together +│ ├── 1_[role_name].md # Agent 1 prompt +│ ├── 2_[role_name].md # Agent 2 prompt +│ ├── 3_[role_name].md # Agent 3 prompt +│ ├── 4_[role_name].md # Agent 4 prompt +│ ├── 5_[role_name].md # Agent 5 prompt +│ ├── daily_logs/ # Progress tracking +│ ├── issues/ # Coordination issues +│ └── questions.md # Q&A thread +``` + +### Step 3: Agent Deployment + +Three approaches: + +#### Option A: Parallel (Fastest) +- Open 5 AI conversations simultaneously +- Give each their role prompt +- Let them work in parallel +- Coordinate via Git + logs + +**Best for**: Independent workstreams, experienced coordinators + +#### Option B: Sequential (Safest) +- Deploy agents one at a time +- Backend → Feature → Interface → QA → Docs +- Each waits for dependencies + +**Best for**: Tight coupling, learning the pattern + +#### Option C: Phased (Balanced) +- Phase 1: Backend + Feature + QA (3 agents) +- Phase 2: Interface + Docs (add 2 agents) +- Phase 3: All 5 agents working + +**Best for**: Complex projects, risk mitigation + +### Step 4: Coordination & Monitoring + +Daily routine: +1. **Morning**: Review yesterday's progress logs +2. **Check**: Are any agents blocked? +3. **Resolve**: Answer questions, unblock agents +4. **Integrate**: Merge completed work to main +5. **Align**: Update coordination docs if needed + +Weekly routine: +1. **Review**: Phase completion progress +2. **Demo**: Test integrated system +3. **Adjust**: Reallocate work if needed +4. **Plan**: Next phase priorities + +### Step 5: Integration & Testing + +Before merging agent work: + +```bash +# Integration checklist +- [ ] Code follows style guide +- [ ] All tests pass +- [ ] No merge conflicts +- [ ] Documentation updated +- [ ] APIs match integration spec +- [ ] Dependencies satisfied +- [ ] Manual testing done +``` + +--- + +## Best Practices + +### Do's ✅ + +1. **Clear Role Boundaries**: No overlapping responsibilities +2. **Explicit Integration Points**: Document APIs between agents +3. **Regular Communication**: Daily progress logs minimum +4. **Version Control**: Each agent in their own branch +5. **Test Early**: QA agent starts from day 1 +6. **Document Continuously**: Writer updates docs with each feature +7. **Phase Gates**: Clear criteria for phase completion +8. **Human Review**: Coordinator reviews all major decisions + +### Don'ts ❌ + +1. **Don't Skip Planning**: Role definition is critical +2. **Don't Allow Overlap**: Agents shouldn't edit same files +3. **Don't Merge Without Tests**: All code must be tested +4. **Don't Ignore Blockers**: Resolve quickly or work is wasted +5. **Don't Assume Alignment**: Verify integration points work +6. **Don't Skip Documentation**: Future you will regret it +7. **Don't Over-Coordinate**: Trust agents in their domains +8. **Don't Ignore Technical Debt**: Address issues early + +### Communication Patterns + +#### Good Communication 👍 +```markdown +## Backend Engineer - Nov 17 +I've completed the AgentRuntime API. Key interface: + +async def execute(spec: AgentSpec, inputs: Dict) -> AgentResult: + """Execute agent with timeout and resource limits.""" + +Location: src/core/agent_runtime.py:45-89 +Tests: tests/unit/test_agent_runtime.py + +@Agent-Developer: This is ready for you to use. See docstring for examples. +``` + +#### Bad Communication 👎 +```markdown +## Backend Engineer - Nov 17 +Done with some stuff. Let me know if you need anything. +``` + +--- + +## Prompts & Templates + +### Starter Prompt for New Projects + +```markdown +I'm starting a new project called [PROJECT_NAME] that will [DESCRIPTION]. + +I want to use a multi-agent development approach with specialized AI agents. + +Please help me: +1. Analyze if this project is a good fit for multi-agent development +2. Suggest appropriate agent roles (3-7 agents) +3. Define clear boundaries and integration points +4. Create initial role prompts for each agent + +Project details: +- Language: [Python, JavaScript, etc.] +- Estimated size: [small/medium/large] +- Timeline: [weeks/months] +- Key components: [list main subsystems] +- Technology stack: [frameworks, tools] +``` + +### Agent Onboarding Prompt + +```markdown +You are [AGENT_ROLE] for the [PROJECT_NAME] project. + +Your complete role definition is in: [PATH_TO_PROMPT_FILE] + +Before starting work: +1. Read your full role prompt carefully +2. Read COORDINATION.md to understand how agents work together +3. Review the current codebase in [PROJECT_PATH] +4. Check integration points - what APIs you consume/provide +5. Review today's daily logs from other agents + +Your first task is: [SPECIFIC_FIRST_TASK] + +Please confirm you understand your role and are ready to start. +``` + +### Daily Check-In Prompt + +```markdown +It's [DAY] of development. Please provide your daily update: + +## Completed Since Last Update +[What you finished] + +## Currently Working On +[Current task, % complete] + +## Blockers +[Anything preventing progress] + +## Questions for Other Agents +[Questions, if any] + +## Next Steps +[What you'll work on next] + +Also check: Have other agents asked you questions in questions.md? +``` + +### Integration Checkpoint Prompt + +```markdown +We're approaching the end of Phase [N]. Please verify your integration points: + +1. Review COORDINATION.md for your integration requirements +2. Check that your APIs match the documented interface +3. Test interactions with dependent agents' code +4. Update documentation if interfaces changed +5. Report any integration issues + +Post results in today's daily log. +``` + +### Handoff Prompt + +```markdown +Agent [NAME] has completed [COMPONENT]. + +[DEPENDENT_AGENT], you can now proceed with [NEXT_TASK]. + +Key details: +- Location: [FILE_PATH] +- API: [INTERFACE_DESCRIPTION] +- Tests: [TEST_FILE] +- Documentation: [DOCS_LOCATION] + +Please review the implementation and confirm it meets your needs before building on it. +``` + +--- + +## Troubleshooting + +### Problem: Agents Are Blocked + +**Symptoms**: Progress logs show multiple agents waiting + +**Solutions**: +1. Identify critical path dependencies +2. Prioritize unblocking agents +3. Create stub implementations for APIs +4. Provide interim documentation +5. Consider sequential approach for this phase + +### Problem: Integration Failures + +**Symptoms**: Code from different agents doesn't work together + +**Solutions**: +1. Review COORDINATION.md - are integration points clear? +2. Create shared test that exercises interface +3. Have agents collaborate on fixing mismatch +4. Update integration documentation +5. Add integration tests to prevent regression + +### Problem: Duplicate Work + +**Symptoms**: Two agents implement the same thing + +**Solutions**: +1. Clarify role boundaries immediately +2. Decide which implementation to keep +3. Update prompts to prevent future overlap +4. Review file ownership in COORDINATION.md + +### Problem: Quality Issues + +**Symptoms**: Code lacks tests, docs, or doesn't follow standards + +**Solutions**: +1. QA agent reviews all PRs before merge +2. Add quality gates to coordination doc +3. Require tests + docs for merge approval +4. Update agent prompts with quality standards + +### Problem: Loss of Context + +**Symptoms**: Agents forget previous decisions or constraints + +**Solutions**: +1. Create DECISIONS.md documenting key choices +2. Reference important context in prompts +3. Use Git commit messages to explain rationale +4. Keep role prompts updated with learnings + +### Problem: Coordination Overhead + +**Symptoms**: More time spent coordinating than building + +**Solutions**: +1. Reduce coordination touchpoints +2. Give agents more autonomy in their domains +3. Consolidate roles (fewer agents) +4. Use async communication (logs) over sync +5. Trust agents to make decisions + +--- + +## Case Study: Agent-Lab + +### Project Overview + +**Goal**: Build a system for creating self-improving AI agents + +**Approach**: 5-agent team working in parallel + +**Timeline**: 3 weeks, 3 phases + +### Team Structure + +1. **Backend Systems Engineer** + - Built: AgentRuntime, Git utilities, persistence + - Files: `core/`, `gitops/`, `config/` + - Output: Infrastructure for agent execution + +2. **Agent Developer** + - Built: 6 specialized agents (LabDirector, Architect, etc.) + - Files: `agents/` + - Output: The intelligence of the system + +3. **CLI Engineer** + - Built: User commands (create, list, show, etc.) + - Files: `cli/` + - Output: User-facing interface + +4. **QA Engineer** + - Built: Test suite, evaluation scenarios + - Files: `tests/`, `evaluation/` + - Output: Quality assurance infrastructure + +5. **Technical Writer** + - Built: Docs, tutorials, examples + - Files: `docs/`, `examples/` + - Output: User and contributor documentation + +### Key Decisions + +**✅ What Worked:** +- Clear role separation prevented conflicts +- Parallel work accelerated development +- Daily logs kept everyone aligned +- Git branches isolated work effectively +- Phase gates ensured quality + +**❌ What Didn't Work:** +- Initial prompts too vague (needed iteration) +- Some integration points unclear at start +- Coordination overhead higher than expected early on +- Some agents finished early, others blocked + +**🔧 Adjustments Made:** +- Added more detail to role prompts +- Created COORDINATION.md with explicit integration points +- Introduced daily standups via logs +- Used stub implementations to unblock agents + +### Results + +- **Speed**: 3x faster than single-agent approach +- **Quality**: Higher due to specialized QA agent +- **Documentation**: Better due to dedicated writer +- **Maintainability**: Clear ownership of components + +### Lessons Learned + +1. **Invest in setup**: Good role definition pays off +2. **Over-communicate early**: Establish patterns +3. **Integration points are critical**: Document before coding +4. **Trust agents**: Don't micro-manage +5. **Iterate prompts**: Update as you learn + +--- + +## Quick Reference Card + +### When to Use Multi-Agent + +- ✅ Project > 5k LOC +- ✅ Timeline > 1 week +- ✅ Clear subsystems +- ✅ Need quality (tests + docs) + +### Standard Team + +1. Backend Engineer +2. Feature Developer +3. Interface Engineer +4. QA Engineer +5. Technical Writer + +### Directory Structure + +``` +project/ +├── AGENT_PROMPTS/ +│ ├── 1_backend.md +│ ├── 2_feature.md +│ ├── 3_interface.md +│ ├── 4_qa.md +│ ├── 5_docs.md +│ ├── COORDINATION.md +│ └── daily_logs/ +└── [project code] +``` + +### Daily Workflow + +1. Read yesterday's logs +2. Check for questions +3. Unblock agents +4. Review completed work +5. Merge when ready + +### Success Metrics + +- Tests pass ✅ +- Docs updated ✅ +- No conflicts ✅ +- Phase goals met ✅ + +--- + +## Appendix: Prompt Library + +### A. Project Kickoff Prompts + +#### Initial Analysis +``` +Analyze this project for multi-agent development suitability: + +Project: [NAME] +Description: [DESCRIPTION] +Tech stack: [STACK] +Timeline: [TIMELINE] + +Please: +1. Assess fit for multi-agent approach +2. Suggest number and types of agents +3. Identify key integration points +4. Propose phase breakdown +``` + +#### Role Generation +``` +Generate a detailed role prompt for a [ROLE_NAME] agent working on [PROJECT]. + +Include: +- Clear mission statement +- Specific files/directories owned +- Integration points with other agents +- Success criteria +- Getting started section +- Example code structures +``` + +### B. Coordination Prompts + +#### Integration Check +``` +Review integration between [AGENT_1] and [AGENT_2]: + +Agent 1 provides: [API_DESCRIPTION] +Agent 2 expects: [REQUIREMENTS] + +Verify: +- Interface compatibility +- Error handling +- Documentation completeness +- Test coverage +``` + +#### Blocker Resolution +``` +[AGENT_NAME] is blocked on: [DESCRIPTION] + +Help resolve by: +1. Clarifying requirements +2. Providing stub implementation +3. Finding alternative approach +4. Reprioritizing work +``` + +### C. Quality Prompts + +#### Code Review +``` +Review this code from [AGENT_NAME]: + +[CODE] + +Check: +- Follows project style +- Has docstrings +- Includes type hints +- Has tests +- Handles errors +- Integrates correctly +``` + +#### Documentation Review +``` +Review documentation for [FEATURE]: + +[DOCS] + +Verify: +- Accuracy +- Completeness +- Examples work +- Clear for target audience +``` + +--- + +## Conclusion + +The multi-agent development pattern is powerful for medium-to-large projects where: +- Work can be parallelized +- Quality matters +- Clear subsystems exist +- Timeline allows for setup + +Key success factors: +1. **Clear roles** with explicit boundaries +2. **Strong coordination** mechanisms +3. **Documented integration** points +4. **Regular communication** via logs +5. **Quality gates** at merge time + +Start small (3 agents), learn the pattern, then scale up. + +--- + +**Questions?** Open an issue or contribute improvements to this guide. + +**License**: MIT (use freely, share improvements) + diff --git a/WORKFLOW.md b/WORKFLOW.md new file mode 100644 index 0000000..60943f7 --- /dev/null +++ b/WORKFLOW.md @@ -0,0 +1,72 @@ +# Multi-Agent Workflow Guide + +This project uses the multi-agent workflow system. + +## Quick Commands + +### Status & Planning +```bash +# Check current status +python skills/workflow-state/workflow-state/scripts/workflow_state.py . + +# Get next action +python skills/workflow-state/workflow-state/scripts/workflow_state.py . next-step + +# Validate current phase +python skills/workflow-state/workflow-state/scripts/workflow_state.py . validate +``` + +### Quality & Metrics +```bash +# Collect metrics +python scripts/collect_metrics.py + +# Run quality audit +./scripts/auto_quality_audit.sh + +# Generate dashboard +python scripts/generate_dashboard.py +cat DASHBOARD.md +``` + +### PR Management +```bash +# Determine merge order +python scripts/determine_merge_order.py +cat merge_order.txt +``` + +### Agent Management +```bash +# Block an agent +python skills/workflow-state/workflow-state/scripts/workflow_state.py . block --agent-id 1 --reason "Waiting for API" + +# Unblock an agent +python skills/workflow-state/workflow-state/scripts/workflow_state.py . unblock --agent-id 1 + +# List blocked agents +python skills/workflow-state/workflow-state/scripts/workflow_state.py . blocked +``` + +## Templates + +Templates are available in the `templates/` directory: +- `IMPROVEMENT_PROPOSAL.md` - For Phase 3 code review suggestions +- `PR_TEMPLATE.md` - For Phase 5 pull requests +- `STUB_TEMPLATE_PYTHON.py` - For creating Python interface stubs +- `STUB_TEMPLATE_JAVASCRIPT.js` - For creating JS/TS interface stubs + +## Agent Learnings + +Refer to `AGENT_LEARNINGS/MASTER_LEARNINGS.md` for accumulated knowledge and best practices. + +## Workflow Phases + +1. **Phase 1**: Requirements Analysis +2. **Phase 2**: Architecture Design +3. **Phase 3**: Code Review +4. **Phase 4**: Implementation +5. **Phase 5**: Integration +6. **Phase 6**: QA Testing + +For detailed documentation, see the main toolkit repository. diff --git a/WORKFLOW_STATE.json b/WORKFLOW_STATE.json new file mode 100644 index 0000000..af54e48 --- /dev/null +++ b/WORKFLOW_STATE.json @@ -0,0 +1,12 @@ +{ + "project": "ship-MTA-draft", + "project_path": "/Users/dp/Projects/ship-MTA-draft", + "phase": 0, + "iteration": 0, + "status": "not_started", + "tech_stack": null, + "agents": [], + "history": [], + "created_at": "2025-11-19T02:29:11.438755", + "last_updated": "2025-11-19T02:29:11.438762" +} \ No newline at end of file diff --git a/app/static/css/style.css b/app/static/css/style.css index ea48837..54ace1c 100644 --- a/app/static/css/style.css +++ b/app/static/css/style.css @@ -1400,3 +1400,166 @@ a:hover { .work-item-card .btn-group .btn { flex: 1; } + +/* OBNOXIOUS ZOA ENERGY AD */ +.zoa-ad-banner { + background: linear-gradient(135deg, #00bcd4 0%, #0097a7 100%); + border: 1px solid #ffd700; + border-radius: 6px; + padding: 10px; + margin: 12px 0; + text-align: center; + cursor: pointer; + position: relative; + overflow: hidden; + box-shadow: 0 2px 8px rgba(0, 188, 212, 0.3); + animation: pulse-border 2s infinite; +} + +@keyframes pulse-border { + 0%, 100% { box-shadow: 0 8px 25px rgba(0, 188, 212, 0.5); } + 50% { box-shadow: 0 8px 35px rgba(0, 188, 212, 0.8), 0 0 20px rgba(255, 215, 0, 0.6); } +} + +.zoa-ad-banner::before { + content: "⚡ SPONSORED ⚡"; + position: absolute; + top: 4px; + right: 4px; + background: #ffd700; + color: #000; + font-size: 7px; + font-weight: 900; + padding: 2px 6px; + border-radius: 3px; + letter-spacing: 0.3px; + animation: blink 1.5s infinite; +} + +@keyframes blink { + 0%, 49%, 100% { opacity: 1; } + 50%, 99% { opacity: 0.3; } +} + +.zoa-image-container { + margin-bottom: 5px; + animation: bounce 2s infinite; + position: relative; + display: inline-block; +} + +@keyframes bounce { + 0%, 100% { transform: translateY(0); } + 50% { transform: translateY(-5px); } +} + +.zoa-image { + height: 50px; + width: auto; + border-radius: 4px; +} + +.zoa-allergy-badge { + position: absolute; + bottom: -3px; + left: 50%; + transform: translateX(-50%); + background: #28a745; + color: white; + font-size: 6px; + font-weight: 900; + padding: 2px 6px; + border-radius: 3px; + letter-spacing: 0.3px; + white-space: nowrap; + display: none; +} + +.zoa-allergy-badge.show { + display: block; +} + +.zoa-tagline { + font-size: 9px; + color: rgba(255, 255, 255, 0.95); + margin-bottom: 4px; + font-weight: 700; + text-transform: uppercase; + letter-spacing: 0.3px; +} + +.zoa-message { + font-size: 11px; + color: white; + margin-bottom: 8px; + font-weight: 600; + text-shadow: 0 1px 2px rgba(0, 0, 0, 0.3); +} + +.zoa-cta-button { + display: inline-block; + background: linear-gradient(135deg, #ffd700 0%, #ffed4e 100%); + color: #000; + padding: 6px 18px; + border-radius: 14px; + font-weight: 900; + text-decoration: none; + font-size: 10px; + box-shadow: 0 2px 8px rgba(255, 215, 0, 0.3); + animation: cta-pulse 1.5s infinite; + text-transform: uppercase; + letter-spacing: 0.3px; +} + +@keyframes cta-pulse { + 0%, 100% { transform: scale(1); } + 50% { transform: scale(1.05); } +} + +.zoa-cta-button:hover { + transform: scale(1.1) !important; + box-shadow: 0 6px 20px rgba(255, 215, 0, 0.7); +} + +/* Color variants */ +.zoa-ad-banner.red-variant { + background: linear-gradient(135deg, #e74c3c 0%, #c0392b 100%); + border-color: #ffd700; +} + +.zoa-ad-banner.purple-variant { + background: linear-gradient(135deg, #9b59b6 0%, #8e44ad 100%); + border-color: #ffd700; +} + +.zoa-ad-banner.orange-variant { + background: linear-gradient(135deg, #ff6b35 0%, #f7931e 100%); + border-color: #ffd700; +} + +@media (max-width: 768px) { + .zoa-ad-banner { + padding: 8px; + margin: 10px 0; + } + .zoa-image { + height: 40px; + } + .zoa-allergy-badge { + font-size: 5px; + padding: 1px 4px; + bottom: -2px; + } + .zoa-tagline { + font-size: 8px; + margin-bottom: 4px; + } + .zoa-message { + font-size: 10px; + margin-bottom: 6px; + } + .zoa-cta-button { + padding: 5px 16px; + font-size: 9px; + } +} diff --git a/app/static/images/zoa-hero.png b/app/static/images/zoa-hero.png new file mode 100644 index 0000000..47ef6a4 Binary files /dev/null and b/app/static/images/zoa-hero.png differ diff --git a/app/static/images/zoa-lemon.jpg b/app/static/images/zoa-lemon.jpg new file mode 100644 index 0000000..f772db4 --- /dev/null +++ b/app/static/images/zoa-lemon.jpg @@ -0,0 +1,71 @@ + + +
++ The page you’re looking for couldn’t be found. +
++ Check the web address and try again, or go to the home page. +
+