@joelteply
Contributor

Summary

  • Phase 2: Rust file engine foundation (FileEngine, ChangeGraph, PathSecurity) + TypeScript code/* commands (read, write, edit, search, tree, undo, history, diff) with ts-rs type generation
  • Phase 3: Single-agent coding pipeline — CodingModelSelector (task→model routing), PlanFormulator (LLM plan generation with risk assessment), CodeAgentOrchestrator (DAG execution with budget tracking)
  • Phase 4 Foundation: CodingPlanEntity with hierarchical persistence, real-time step status tracking
  • Phase 4A: Sandbox security — SecurityTier (4-tier access control), ToolAllowlistEnforcer (per-tier tool filtering), ExecutionSandbox (process isolation)
  • Phase 4B: Self-modifying skills — skill/propose, skill/generate, skill/validate, skill/activate, skill/list commands with SkillEntity lifecycle
  • Phase 4C: Multi-agent coordination — CodeCoordinationStream (file-level MUTEX locking), CodeTaskDelegator (union-find cluster decomposition), PlanGovernance (risk-based approval routing), DryRun mode
  • code/task command: Entry point that wires the full pipeline together — validates params, builds CodingTask, invokes CodeAgentOrchestrator, maps CodingResult to CommandResult
  • Governance wiring: High-risk plans now route through PlanGovernance for team approval before execution (returns pending_approval status)
  • Fixes parameter validation bugs in should-respond-fast, activity/join, activity/create
  • Fixes coordination system killing AI engagement (removed mechanical throttles from InferenceCoordinator)

Test plan

  • TypeScript compiles cleanly (npm run build:ts)
  • 342 coding system unit tests pass (npx vitest run tests/unit/code/)
  • Deploy with npm start and verify code/task command is discoverable via ./jtag help code/task
  • Test code/task dry-run mode with a simple description
  • Verify skill/* commands register and appear in help
  • Integration test: code/task with real LLM for a single-file edit

🤖 Generated with Claude Code

Joel added 9 commits February 1, 2026 17:05
…type gen

Rust Foundation (continuum-core/src/code/):
- FileEngine: read/write/edit/delete with per-persona workspace scoping
- ChangeGraph: DAG of ChangeNodes with undo via reverse diff
- DiffEngine: unified diff computation (similar crate)
- PathSecurity: workspace isolation, path traversal guard, extension allowlist
- CodeSearch: regex + glob search with .gitignore support (ignore crate)
- Tree: recursive directory tree generation
- GitBridge: git status and diff operations
- IPC handlers for all 12 code/* endpoints (359 tests passing)
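The workspace isolation, path traversal guard, and extension allowlist described for PathSecurity can be illustrated with a minimal sketch. It is written in TypeScript for readability and is not the Rust implementation in this PR: the function name `resolveInWorkspace`, the allowlist contents, and the error messages are all assumptions made for illustration.

```typescript
// Hypothetical sketch only: names and allowlist contents are illustrative,
// not the PathSecurity API from continuum-core.
const ALLOWED_EXTENSIONS = new Set(["ts", "rs", "md", "json"]); // assumed allowlist

/**
 * Resolve a relative POSIX-style path inside a workspace root.
 * Rejects absolute paths, any ".." that would climb above the root,
 * and files whose extension is not on the allowlist.
 */
function resolveInWorkspace(workspaceRoot: string, relativePath: string): string {
  if (relativePath.startsWith("/")) {
    throw new Error(`absolute paths are not allowed: ${relativePath}`);
  }
  const resolved: string[] = [];
  for (const segment of relativePath.split("/")) {
    if (segment === "" || segment === ".") continue; // skip no-op segments
    if (segment === "..") {
      if (resolved.length === 0) {
        throw new Error(`path escapes workspace: ${relativePath}`);
      }
      resolved.pop(); // step up one level, still inside the workspace
    } else {
      resolved.push(segment);
    }
  }
  const fileName = resolved[resolved.length - 1] ?? "";
  const dot = fileName.lastIndexOf(".");
  const extension = dot > 0 ? fileName.slice(dot + 1) : "";
  if (!ALLOWED_EXTENSIONS.has(extension)) {
    throw new Error(`extension not allowed: ${relativePath}`);
  }
  return `${workspaceRoot.replace(/\/+$/, "")}/${resolved.join("/")}`;
}

// Usage: a path that stays inside the workspace is normalized and accepted.
console.log(resolveInWorkspace("/ws/persona-1", "src/./a/../index.ts"));
// prints "/ws/persona-1/src/index.ts"
```

This sketch works purely on strings, so it does not account for symlinks or Windows-style separators; a real guard would likely also canonicalize against the filesystem.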

TypeScript Commands (8 generated via CommandGenerator):
- code/read, code/write, code/edit, code/diff
- code/search, code/tree, code/undo, code/history
- Each with Types.ts, ServerCommand.ts, BrowserCommand.ts, README, tests

Type Safety (ts-rs single source of truth):
- 14 Rust types exported via #[derive(TS)] → shared/generated/code/
- Zero hand-written wire type duplicates
- All object/any casts eliminated from code/* commands
- CommandParams.userId used as canonical identity field

RAG Integration:
- CodeToolSource: dynamic coding workflow guidance in persona system prompts
- Only shows tools persona has permission to use
- Budget-aware with minimal fallback
- 15 unit tests passing

Infrastructure fixes:
- PersonaToolExecutor now injects userId (standard CommandParams field)
- CLAUDE.md documents ts-rs pattern and regeneration workflow

Delete old pre-Rust development/code/read and development/code/pattern-search
commands that caused TS2300 duplicate identifier collision with new code/*
commands. Remove legacy CodeDaemon methods (readFile, searchCode, getGitLog,
clearCache, getCacheStats, getRepositoryRoot), their types, and the
PathValidator/FileReader modules — all superseded by Rust IPC workspace ops.

- Delete commands/development/code/ (7 files)
- Delete daemons/code-daemon/server/modules/ (PathValidator, FileReader)
- Clean CodeDaemonTypes.ts: remove 222 lines of legacy types
- Clean CodeDaemon.ts: remove 7 legacy static methods
- Clean CodeDaemonServer.ts: remove old CodeDaemonImpl class
- Fix cli.ts: replace CODE_COMMANDS import with string literals
- Fix PersonaToolDefinitions.ts: update essentialTools to code/*
- Regenerate server/generated.ts and command constants
…strator

CodingModelSelector routes coding tasks to frontier models with provider
fallback. PlanFormulator decomposes tasks into executable step DAGs via
LLM. CodeAgentOrchestrator executes plans with budget enforcement, retry
logic, and dependency-ordered step execution. 51 unit tests.
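
Dependency-ordered step execution over a plan DAG can be sketched as a topological sort (Kahn's algorithm). This is an illustration under stated assumptions, not the orchestrator's code: the `PlanStep` shape and the `executionOrder` function are hypothetical, and budget enforcement and retries are left out.

```typescript
// Hypothetical step shape; the real plan step type in the PR may differ.
interface PlanStep {
  id: string;
  dependsOn: string[]; // ids of steps that must finish first
}

/** Return step ids in an order that respects dependencies (Kahn's algorithm). */
function executionOrder(steps: PlanStep[]): string[] {
  const unmetDeps = new Map<string, number>();
  const dependents = new Map<string, string[]>();
  for (const step of steps) {
    unmetDeps.set(step.id, step.dependsOn.length);
    dependents.set(step.id, []);
  }
  for (const step of steps) {
    for (const dep of step.dependsOn) {
      const list = dependents.get(dep);
      if (!list) throw new Error(`step ${step.id} depends on unknown step ${dep}`);
      list.push(step.id);
    }
  }
  // Steps with no dependencies are ready immediately.
  const ready = steps.filter((s) => s.dependsOn.length === 0).map((s) => s.id);
  const order: string[] = [];
  while (ready.length > 0) {
    const id = ready.shift() as string;
    order.push(id);
    for (const next of dependents.get(id) ?? []) {
      const left = (unmetDeps.get(next) ?? 0) - 1;
      unmetDeps.set(next, left);
      if (left === 0) ready.push(next); // all prerequisites are done
    }
  }
  if (order.length !== steps.length) {
    throw new Error("plan contains a dependency cycle");
  }
  return order;
}

// Usage: "edit" waits for "read", "verify" waits for "edit".
console.log(executionOrder([
  { id: "read", dependsOn: [] },
  { id: "edit", dependsOn: ["read"] },
  { id: "verify", dependsOn: ["edit"] },
  { id: "search", dependsOn: [] },
]));
// prints [ 'read', 'search', 'edit', 'verify' ]
```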
CodingPlanEntity is a first-class persistent entity for coding plans.
Supports hierarchical delegation (parentPlanId), team assignment
(assignees + leadId), governance integration (proposalId), and
real-time execution tracking. CodeAgentOrchestrator now persists
plans via DataDaemon with best-effort semantics (works without DB
in unit tests). 80 unit tests passing.
…ordination

Phase 4A — Sandbox & Security Tiers:
- SecurityTier: 4-tier access control (discovery/read/write/system)
- ToolAllowlistEnforcer: per-tier command filtering with glob matching
- ExecutionSandbox: process-isolated code execution with timeout/output limits
- Risk assessment integrated into PlanFormulator output
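
A rough sketch of per-tier command filtering with glob matching follows. The tier names come from this commit message; everything else is an assumption for illustration: the allowlist contents, the idea that higher tiers inherit lower-tier permissions, and the names `TIER_ALLOWLIST`, `globToRegExp`, and `isToolAllowed`.

```typescript
type SecurityTier = "discovery" | "read" | "write" | "system";

// Tiers from least to most privileged (ordering as listed in the commit message).
const TIER_ORDER: SecurityTier[] = ["discovery", "read", "write", "system"];

// Assumed allowlists: the real per-tier command lists are not shown in this PR text.
const TIER_ALLOWLIST: Record<SecurityTier, string[]> = {
  discovery: ["code/tree", "code/search"],
  read: ["code/read", "code/diff", "code/history"],
  write: ["code/write", "code/edit", "code/undo"],
  system: ["code/*", "skill/*"],
};

/** Convert a simple glob ("*" matches any run of characters) to an anchored RegExp. */
function globToRegExp(glob: string): RegExp {
  const escaped = glob
    .replace(/[.+?^${}()|[\]\\]/g, "\\$&") // escape regex metacharacters except "*"
    .replace(/\*/g, ".*");
  return new RegExp(`^${escaped}$`);
}

/** Assumption: a tier may use its own patterns plus those of every lower tier. */
function isToolAllowed(tier: SecurityTier, tool: string): boolean {
  const level = TIER_ORDER.indexOf(tier);
  return TIER_ORDER.slice(0, level + 1).some((t) =>
    TIER_ALLOWLIST[t].some((pattern) => globToRegExp(pattern).test(tool)),
  );
}

console.log(isToolAllowed("read", "code/read"));        // prints true
console.log(isToolAllowed("read", "code/write"));       // prints false
console.log(isToolAllowed("system", "skill/activate")); // prints true
```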

Phase 4B — Self-Modifying Skills:
- SkillEntity: persistent skill registry with full lifecycle
- skill/propose: AI creates command specifications
- skill/generate: programmatic CommandGenerator invocation
- skill/validate: sandbox compilation + test execution
- skill/activate: dynamic tool registration
- skill/list: query skill registry

Phase 4C — Multi-Agent Coordination & Delegation:
- CodeCoordinationStream: file-level MUTEX via BaseCoordinationStream
- PlanGovernance: risk-based approval routing (auto-approve low risk, require approval for multi-agent/high-risk/system-tier)
- CodeTaskDelegator: union-find plan decomposition into parallel file clusters, load-balanced agent assignment, sub-plan creation, result consolidation
- DryRun mode: execute plans read-only, mock write operations
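
The union-find decomposition into parallel file clusters can be sketched as follows: steps that touch a common file are merged into one cluster, and clusters with no shared files can be handed to different agents. The `FileStep` shape and the names `UnionFind` and `clusterStepsByFile` are hypothetical; load balancing and sub-plan creation are not shown.

```typescript
// Hypothetical step shape for illustration.
interface FileStep {
  id: string;
  files: string[]; // files the step reads or writes
}

/** Minimal union-find over string keys, with path compression. */
class UnionFind {
  private parent = new Map<string, string>();

  find(x: string): string {
    const p = this.parent.get(x);
    if (p === undefined) {
      this.parent.set(x, x); // first sighting: x is its own root
      return x;
    }
    if (p === x) return x;
    const root = this.find(p);
    this.parent.set(x, root); // path compression
    return root;
  }

  union(a: string, b: string): void {
    const rootA = this.find(a);
    const rootB = this.find(b);
    if (rootA !== rootB) this.parent.set(rootA, rootB);
  }
}

/** Group step ids so that steps sharing any file end up in the same cluster. */
function clusterStepsByFile(steps: FileStep[]): string[][] {
  const uf = new UnionFind();
  const firstStepForFile = new Map<string, string>();
  for (const step of steps) {
    uf.find(step.id); // register the step even if it touches no files
    for (const file of step.files) {
      const owner = firstStepForFile.get(file);
      if (owner === undefined) firstStepForFile.set(file, step.id);
      else uf.union(owner, step.id);
    }
  }
  const clusters = new Map<string, string[]>();
  for (const step of steps) {
    const root = uf.find(step.id);
    const members = clusters.get(root) ?? [];
    members.push(step.id);
    clusters.set(root, members);
  }
  return Array.from(clusters.values());
}

console.log(clusterStepsByFile([
  { id: "s1", files: ["a.ts"] },
  { id: "s2", files: ["b.ts"] },
  { id: "s3", files: ["a.ts", "c.ts"] },
  { id: "s4", files: ["c.ts"] },
]));
// prints [ [ 's1', 's3', 's4' ], [ 's2' ] ]
```

Here s1, s3, and s4 are chained together through a.ts and c.ts, so they must stay with one agent, while s2 can run in parallel.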

342 tests across 12 test files, all passing.

Two mechanical throttle layers were overriding AI cognition:

1. Temperature system: each AI "servicing" a room applied -0.2,
   so 14 personas crashed rooms to 0.00 in 35 seconds. Fixed by
   flipping to +0.05 warmth (active conversation stays alive).
   Removed hard -0.1 priority penalty for cold rooms.

2. InferenceCoordinator: gating calls consumed per-message "cards"
   in messageResponders, so when actual response generation tried
   to acquire a slot with the same messageId, every persona was
   denied. Rewrote from 489→197 lines, removing 6 mechanical rules
   (card dealing, responder caps, reserved slots, cooldowns, stagger
   delays, auto-thinning). Kept only hardware capacity protection.

Result: AIs respond within seconds instead of being silenced.
…n, activity/create

- should-respond-fast: crashed on params.messageText (toLowerCase on undefined)
  when an AI called it without messageText. Now returns a graceful false result.
- activity/join: an undefined activityId produced "Activity not found: undefined".
  Now validates activityId before the DB lookup.
- activity/create: an undefined recipeId produced "Recipe not found: undefined".
  Now validates recipeId before the DB lookup.

All three were caused by AIs calling tools with missing params and getting either
crashes or confusing error messages instead of clear validation errors.
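
The validate-before-lookup pattern behind these fixes might look roughly like this. It is a sketch, not the code from the PR: `missingParamError`, `joinActivity`, the `Activity` shape, the in-memory `Map` standing in for the DB, and the exact error wording are all assumptions.

```typescript
interface Activity {
  id: string;
  name: string;
}

type JoinResult =
  | { success: true; activity: Activity }
  | { success: false; error: string };

/** Return an error message if a required string param is missing or blank, else null. */
function missingParamError(params: Record<string, unknown>, name: string): string | null {
  const value = params[name];
  if (typeof value !== "string" || value.trim() === "") {
    return `Missing required parameter: ${name}`;
  }
  return null;
}

/** Validate activityId first, so the lookup never runs with `undefined`. */
function joinActivity(
  params: Record<string, unknown>,
  activities: Map<string, Activity>, // stand-in for the real DB lookup
): JoinResult {
  const problem = missingParamError(params, "activityId");
  if (problem !== null) return { success: false, error: problem };

  const activityId = params["activityId"] as string;
  const activity = activities.get(activityId);
  if (!activity) return { success: false, error: `Activity not found: ${activityId}` };
  return { success: true, activity };
}

const activities = new Map<string, Activity>([["a1", { id: "a1", name: "Standup" }]]);
console.log(joinActivity({}, activities));
// prints { success: false, error: 'Missing required parameter: activityId' }
```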
Add code/task command as the entry point for the full coding agent pipeline.
Wire PlanGovernance approval flow and CodeTaskDelegator into orchestrator.
Add pending_approval status to CodingResult for high-risk plan gating.
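
The approval gate can be sketched from the rules stated in this PR: low-risk plans are auto-approved, while multi-agent, high-risk, or system-tier plans require approval. Only the `pending_approval` status and those rules come from the text; the `PlanSummary` shape, the `approved` status, and the function names are assumptions.

```typescript
type RiskLevel = "low" | "high"; // only the levels named in the PR text
type SecurityTier = "discovery" | "read" | "write" | "system";

// Hypothetical summary of the fields a governance check would look at.
interface PlanSummary {
  riskLevel: RiskLevel;
  requiredTier: SecurityTier;
  assigneeCount: number;
}

type GateStatus = "approved" | "pending_approval";

/** True when the plan is multi-agent, high-risk, or needs the system tier. */
function requiresApproval(plan: PlanSummary): boolean {
  return (
    plan.assigneeCount > 1 ||
    plan.riskLevel === "high" ||
    plan.requiredTier === "system"
  );
}

/** Decide whether execution may start or must wait for team approval. */
function gatePlan(plan: PlanSummary): GateStatus {
  return requiresApproval(plan) ? "pending_approval" : "approved";
}

console.log(gatePlan({ riskLevel: "low", requiredTier: "write", assigneeCount: 1 }));
// prints "approved"
console.log(gatePlan({ riskLevel: "high", requiredTier: "write", assigneeCount: 1 }));
// prints "pending_approval"
```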
Copilot AI review requested due to automatic review settings February 2, 2026 04:37
Contributor

Copilot AI left a comment

Pull request overview

This PR implements a comprehensive coding agent pipeline with multi-phase capabilities spanning file operations, security enforcement, multi-agent coordination, and self-modifying skills.

Changes:

  • Establishes Rust-based file engine with workspace isolation, change tracking, and TypeScript type generation via ts-rs
  • Implements single-agent coding pipeline with LLM-powered planning, risk assessment, and DAG-based execution with budget controls
  • Adds multi-tier sandbox security (discovery/read/write/system) with tool allowlists and governance approval routing for high-risk operations

Reviewed changes

Copilot reviewed 151 out of 206 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| ChatCoordinationStream.ts | Fixes temperature adjustment logic to warm rooms on AI activity instead of cooling them |
| ToolAllowlistEnforcer.ts | New security gateway enforcing tier-based tool restrictions with audit logging |
| SecurityTier.ts | Defines 4-tier access control system with command allowlists and risk-to-tier mapping |
| PlanGovernance.ts | Routes high-risk plans through team approval via DecisionProposal integration |
| CodingModelSelector.ts | Maps coding task types to frontier models with provider fallback chains |
| CodeDaemonTypes.ts | Refactored to re-export Rust-generated workspace types via ts-rs |
| CodeDaemon.ts | Updated API surface for workspace-scoped operations backed by Rust IPC |
| CodeTaskServerCommand.ts | Entry point wiring task validation, orchestrator invocation, and result mapping |
| All code/* commands | New workspace commands (read/write/edit/search/tree/undo/history/diff) with Rust backend |
| All skill/* commands | Self-modifying skill lifecycle (propose/generate/validate/activate/list) |
| EntityRegistry.ts | Registers CodingPlanEntity and SkillEntity for persistence |
| activity/join, activity/create | Adds missing parameter validation for activityId and recipeId |
| should-respond-fast | Adds missing messageText validation |
| version.ts, package.json | Version bumps to 1.0.7521 |
| generated files | Command registry updates for new code/* and skill/* commands |
Files not reviewed (1)
  • src/debug/jtag/package-lock.json: Language not supported

2 participants