Skip to content

Conversation

@jhaynie
Copy link
Member

@jhaynie jhaynie commented Jan 1, 2026

Summary

This PR fixes two issues that prevented the workbench from working correctly when configured in agentuity.config.ts:

Problem 1: Workbench config not passed in server mode

The generateEntryFile call in the server mode path (Bun.build) was missing the workbench config parameter, causing hasWorkbench to be false in the generated entry file even when the config existed.

Fix: Load workbench config and pass it to generateEntryFile in vite-builder.ts.

Problem 2: Agent registry not imported at startup

Agents were only registered when imported by API routes. For the workbench metadata endpoint to return JSON schemas (needed for the Monaco editor input), all agents need to be registered at startup.

Fix: Add import of ./registry.js in the generated app.ts entry file.

Additional Changes

  • Updated workbench.ts to use async thread state API (await for get/set/delete/push)
  • Added counter agent for e2e testing of thread state persistence
  • Updated workbench e2e tests to work with Monaco editor (JsonEditor) and use more robust locators

Testing

  • All 17 e2e tests pass
  • All unit tests pass
  • Typecheck and build pass

Summary by CodeRabbit

  • New Features

    • Counter agent for testing persistent thread state and per-agent 50-message history.
  • Tests

    • New E2E test validating thread-state persistence, incremental counts, reset behavior, editor interactions, and more stable dropdown/editor handling with CI-tolerant timing/logging.
  • Chores

    • Build and workbench initialization updated to include the testing web app and load workbench metadata.
  • Public API

    • Agent metadata now exposed on the agent execution context.

✏️ Tip: You can customize this high-level summary in your review settings.

…tartup

- Pass workbench config to generateEntryFile in vite-builder.ts server mode
- Import agent registry in generated app.ts to ensure all agents are registered
- Update workbench.ts to use async thread state API (await for get/set/delete)
- Add counter agent for e2e testing of thread state persistence
- Update workbench e2e tests to work with Monaco editor and fix locators
@coderabbitai
Copy link

coderabbitai bot commented Jan 1, 2026

📝 Walkthrough

Walkthrough

Adds a persistent Counter agent and E2E test; includes the E2E web app in the build; surfaces workbench config into CLI/Vite entry generation; converts workbench to async per-agent persisted thread state with capped history and dynamic cleanup; exports AgentMetadata and exposes ctx.current on agent contexts.

Changes

Cohort / File(s) Summary
Counter agent & E2E test
apps/testing/e2e-web/src/agent/counter/agent.ts, e2e/workbench.pw.ts
New default-exported Counter agent persisting numeric count in thread state (actions: increment, decrement, reset, get); added Playwright test validating agent selection, Monaco input, persistent increments, and clear/reset behavior.
Build configuration
package.json
Added apps/testing/e2e-web to the scripts.build filter chain.
CLI entry generation & Vite build
packages/cli/src/cmd/build/entry-generator.ts, packages/cli/src/cmd/build/vite/vite-builder.ts
Inverted workbench route gating logic; import agent registry earlier in generated entries; Vite builder loads workbench config and conditionally passes it into entry generation.
Workbench runtime: thread-state persistence
packages/runtime/src/workbench.ts
Replaced in-memory per-agent message arrays with async thread.state.push under messages_<agentId> (max 50); outputs pushed only if non-null; retrieval and clear iterate thread state keys to purge messages_<agentId> and keys starting with <agentId>_.
Agent metadata & context exposure
packages/runtime/src/agent.ts, packages/runtime/src/_context.ts, packages/runtime/src/_standalone.ts
Exported AgentMetadata type and added current: AgentMetadata to agent contexts (AgentContext, RequestAgentContext, StandaloneAgentContext); ctx.current assigned during agent invocation.
Schema JSON serializer
packages/schema/src/json-schema.ts
Replaced instanceof checks with isSchemaType helper; extended toJSONSchema to support Record, Unknown/Any, CoerceDate, and refined Literal/Object/Array/Optional/Nullable/Union handling; internal behavior broadened without public API changes.

Sequence Diagram(s)

sequenceDiagram
  participant Client
  participant WorkbenchServer
  participant AgentRuntime
  participant ThreadStateStore

  Client->>WorkbenchServer: POST /execute (agentId, input)
  rect rgb(245,250,255)
    Note over WorkbenchServer: prepare request\n(assign ctx.current = agent.metadata)
    WorkbenchServer->>AgentRuntime: invoke agent(handler, ctx, input)
  end
  rect rgb(245,255,245)
    Note over AgentRuntime,ThreadStateStore: persist inputs/outputs\npush "messages_<agentId>" (max 50)
    AgentRuntime->>ThreadStateStore: push "messages_<agentId>" (input)
    AgentRuntime->>ThreadStateStore: read/write keys like "<agentId>_*"
    AgentRuntime->>ThreadStateStore: push "messages_<agentId>" (output) if output != null
  end
  ThreadStateStore-->>AgentRuntime: ack
  AgentRuntime-->>WorkbenchServer: result
  WorkbenchServer-->>Client: response
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Pre-merge checks and finishing touches

✅ Passed checks (1 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch fix/workbench-config-and-agent-registry

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
packages/runtime/src/workbench.ts (1)

107-111: Clarify intent for non-fatal thread state persistence failure in execution route.

The execution route logs a warning when thread save fails but returns { success: true } (lines 106-111, 116). The clear state route returns a 500 error for the same failure (lines 171-176). This inconsistency means:

  1. An agent execution can succeed and return results even if state is not persisted
  2. Users see success: true without indication that state persistence failed

Confirm whether this lenient behavior is intentional (execution results are primary, state persistence is best-effort), or whether both routes should consistently fail on persistence errors. If intentional, document this design decision and consider whether the response should indicate partial success.

🧹 Nitpick comments (1)
packages/runtime/src/workbench.ts (1)

93-103: Consider extracting maxMessages to module level.

The maxMessages constant is currently scoped to the handler function. If other handlers or functions in this module need the same limit, extracting it as a module-level constant would improve maintainability and consistency.

🔎 Optional refactor to module-level constant
+const MAX_AGENT_MESSAGES = 50;
+
 export const createWorkbenchExecutionRoute = (): Handler => {
 	const authHeader = process.env.AGENTUITY_WORKBENCH_APIKEY
 		? `Bearer ${process.env.AGENTUITY_WORKBENCH_APIKEY}`
 		if (ctx.var.thread) {
 			const agentMessagesKey = `messages_${agentId}`;
-			const maxMessages = 50;
 
-			await ctx.var.thread.state.push(agentMessagesKey, { type: 'input', data: input }, maxMessages);
+			await ctx.var.thread.state.push(agentMessagesKey, { type: 'input', data: input }, MAX_AGENT_MESSAGES);
 
 			if (result !== undefined && result !== null) {
 				await ctx.var.thread.state.push(
 					agentMessagesKey,
 					{ type: 'output', data: result },
-					maxMessages
+					MAX_AGENT_MESSAGES
 				);
 			}
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 99a5ab5 and beb47d4.

⛔ Files ignored due to path filters (5)
  • apps/testing/cloud-deployment/src/generated/app.ts is excluded by !**/generated/**
  • apps/testing/e2e-web/src/generated/app.ts is excluded by !**/generated/**
  • apps/testing/e2e-web/src/generated/registry.ts is excluded by !**/generated/**
  • apps/testing/integration-suite/src/generated/app.ts is excluded by !**/generated/**
  • bun.lock is excluded by !**/*.lock
📒 Files selected for processing (6)
  • apps/testing/e2e-web/src/agent/counter/agent.ts
  • e2e/workbench.pw.ts
  • package.json
  • packages/cli/src/cmd/build/entry-generator.ts
  • packages/cli/src/cmd/build/vite/vite-builder.ts
  • packages/runtime/src/workbench.ts
🧰 Additional context used
📓 Path-based instructions (6)
**/*.{ts,tsx,js,jsx,json}

📄 CodeRabbit inference engine (AGENTS.md)

Use Prettier for code formatting with tabs (width 3), single quotes, and semicolons

Files:

  • apps/testing/e2e-web/src/agent/counter/agent.ts
  • packages/runtime/src/workbench.ts
  • package.json
  • packages/cli/src/cmd/build/vite/vite-builder.ts
  • packages/cli/src/cmd/build/entry-generator.ts
  • e2e/workbench.pw.ts
**/*.{ts,tsx}

📄 CodeRabbit inference engine (AGENTS.md)

Use TypeScript strict mode with ESNext target and bundler moduleResolution

Files:

  • apps/testing/e2e-web/src/agent/counter/agent.ts
  • packages/runtime/src/workbench.ts
  • packages/cli/src/cmd/build/vite/vite-builder.ts
  • packages/cli/src/cmd/build/entry-generator.ts
  • e2e/workbench.pw.ts
**/*.{ts,tsx,js,jsx}

📄 CodeRabbit inference engine (AGENTS.md)

**/*.{ts,tsx,js,jsx}: Use StructuredError from @agentuity/core for error handling
Ensure all errors AND warnings must be zero before committing

Files:

  • apps/testing/e2e-web/src/agent/counter/agent.ts
  • packages/runtime/src/workbench.ts
  • packages/cli/src/cmd/build/vite/vite-builder.ts
  • packages/cli/src/cmd/build/entry-generator.ts
  • e2e/workbench.pw.ts
packages/runtime/src/**/*.{ts,tsx}

📄 CodeRabbit inference engine (packages/runtime/AGENTS.md)

packages/runtime/src/**/*.{ts,tsx}: Use ctx.logger for logging instead of console.log
Use agent.validator() for automatic input validation in route handlers
Do NOT add type annotations to handler parameters - let TypeScript infer them from schemas
Every handler must receive AgentContext with logger, tracer, and storage utilities

Files:

  • packages/runtime/src/workbench.ts
packages/cli/src/cmd/**/*.ts

📄 CodeRabbit inference engine (packages/cli/AGENTS.md)

packages/cli/src/cmd/**/*.ts: Use tui.* helpers for formatted output instead of raw console logs
Use ctx.logger for logging; call logger.fatal() to log and exit with code 1

Files:

  • packages/cli/src/cmd/build/vite/vite-builder.ts
  • packages/cli/src/cmd/build/entry-generator.ts
packages/cli/src/**/*.ts

📄 CodeRabbit inference engine (packages/cli/AGENTS.md)

Use Bun.file(f).exists() instead of existsSync(f) for file existence checks

Files:

  • packages/cli/src/cmd/build/vite/vite-builder.ts
  • packages/cli/src/cmd/build/entry-generator.ts
🧠 Learnings (2)
📚 Learning: 2025-12-21T00:31:41.858Z
Learnt from: jhaynie
Repo: agentuity/sdk PR: 274
File: packages/cli/src/cmd/build/vite/server-bundler.ts:12-41
Timestamp: 2025-12-21T00:31:41.858Z
Learning: In Bun runtime, BuildMessage and ResolveMessage are global types and are not exported from the bun module. Do not import { BuildMessage } from 'bun' or similar; these types are available globally and should be used without import. This applies to all TypeScript files that target the Bun runtime within the repository.

Applied to files:

  • apps/testing/e2e-web/src/agent/counter/agent.ts
  • packages/runtime/src/workbench.ts
  • packages/cli/src/cmd/build/vite/vite-builder.ts
  • packages/cli/src/cmd/build/entry-generator.ts
  • e2e/workbench.pw.ts
📚 Learning: 2025-12-13T14:15:18.261Z
Learnt from: jhaynie
Repo: agentuity/sdk PR: 168
File: packages/runtime/src/session.ts:536-546
Timestamp: 2025-12-13T14:15:18.261Z
Learning: The agentuity/runtime package is Bun-only; during code reviews, do not replace Bun-native APIs (e.g., Bun.CryptoHasher, Bun.serve, and other Bun namespace APIs) with Node.js alternatives. Review changes with the assumption that runtime runs on Bun, and ensure any edits preserve Bun compatibility and do not introduce Node.js-specific fallbacks. Apply this guidance broadly to files under packages/runtime (e.g., packages/runtime/src/...); if there are conditional environment checks, document why Bun is required and avoid dereferencing Bun-only APIs in non-Bun contexts.

Applied to files:

  • packages/runtime/src/workbench.ts
🧬 Code graph analysis (3)
apps/testing/e2e-web/src/agent/counter/agent.ts (1)
packages/schema/src/index.ts (1)
  • s (105-151)
packages/cli/src/cmd/build/vite/vite-builder.ts (2)
packages/cli/src/cmd/build/vite/config-loader.ts (1)
  • loadAgentuityConfig (13-39)
packages/core/src/index.ts (1)
  • getWorkbenchConfig (115-115)
e2e/workbench.pw.ts (2)
apps/testing/integration-suite/src/test/suite.ts (1)
  • test (218-220)
packages/cli/src/tui.ts (1)
  • json (1728-1740)
🔇 Additional comments (10)
packages/cli/src/cmd/build/entry-generator.ts (2)

240-242: LGTM! Workbench route blocking logic correctly inverted.

The logic now properly blocks /workbench/* routes only when the workbench is NOT enabled (!hasWorkbench), which allows workbench routes to be accessible when configured. This fixes the issue mentioned in the PR objectives where workbench routes were incorrectly blocked when a workbench config existed.

Also applies to: 275-277


460-462: LGTM! Registry import ensures agents are available for workbench metadata.

This import correctly ensures all agents are registered at startup, which is necessary for the workbench metadata endpoint to return JSON schemas needed by the Monaco editor. The placement between user app loading (Step 4) and provider initialization (Step 5) is appropriate.

packages/cli/src/cmd/build/vite/vite-builder.ts (1)

89-102: LGTM! Workbench config now correctly passed to entry generator.

This change fixes the issue where the workbench config parameter was omitted in the generateEntryFile call for server mode. The workbench config is now loaded and conditionally passed when enabled, ensuring the generated entry file has the correct hasWorkbench value. This aligns with the PR objective to fix Problem 1.

apps/testing/e2e-web/src/agent/counter/agent.ts (1)

1-44: LGTM! Clean implementation for E2E testing.

The counter agent correctly uses the async thread state API and handles all specified actions appropriately. The use of nullish coalescing for the default value and the default case in the switch statement ensures robust behavior.

package.json (1)

11-11: LGTM! E2E web package added to build chain.

The addition of apps/testing/e2e-web to the build script is appropriate and follows the existing pattern for other testing packages. This ensures the new counter agent is built and available for the workbench E2E tests.

e2e/workbench.pw.ts (2)

46-49: LGTM! Agent selector updated for robustness.

Adding "counter" to the agent selector ensures the test handles cases where the counter agent was previously selected. This defensive pattern makes the test more resilient to execution order variations.


84-154: LGTM! Comprehensive test for thread state persistence.

The test thoroughly validates that thread state persists across multiple agent executions by incrementing a counter three times and verifying the count increases correctly. The use of Monaco editor helpers, proper waits, and .first() for handling multiple matches demonstrates good testing practices.

packages/runtime/src/workbench.ts (3)

93-103: Async thread state operations correctly implemented.

The conversion to async push operations with proper await correctly implements the new thread state API. The conditional push for output (only when not null/undefined) appropriately prevents storing empty results.


168-168: Async delete operation correctly implemented.

The conversion to async delete with proper await correctly implements the new thread state API.


214-214: Async get operation correctly implemented.

The conversion to async get with proper await correctly implements the new thread state API. The fallback to an empty array (line 219) appropriately handles cases where no messages exist for the agent.

@blacksmith-sh

This comment has been minimized.

…gentId prefix

- Add 'current' property to AgentContext exposing agent metadata (id, agentId, name, etc.)
- Export AgentMetadata type from @agentuity/runtime
- Fix workbench clear state to iterate keys with agentId prefix convention
- Update counter agent to use ctx.current.agentId for state key prefix
- Update e2e test locators for clear thread functionality
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
packages/runtime/src/agent.ts (1)

350-350: AgentMetadata now publicly exported.

The export aligns with the PR objectives to expose agent metadata from @agentuity/runtime. Note that similar AgentMetadata types exist in packages/workbench/src/hooks/useAgentSchemas.ts and packages/cli/src/cmd/build/vite/agent-discovery.ts. Consider consolidating these definitions in a future refactor to maintain a single source of truth.

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between beb47d4 and 47b17df.

⛔ Files ignored due to path filters (1)
  • apps/testing/e2e-web/src/generated/app.ts is excluded by !**/generated/**
📒 Files selected for processing (6)
  • apps/testing/e2e-web/src/agent/counter/agent.ts
  • e2e/workbench.pw.ts
  • packages/runtime/src/_context.ts
  • packages/runtime/src/_standalone.ts
  • packages/runtime/src/agent.ts
  • packages/runtime/src/workbench.ts
🚧 Files skipped from review as they are similar to previous changes (2)
  • apps/testing/e2e-web/src/agent/counter/agent.ts
  • e2e/workbench.pw.ts
🧰 Additional context used
📓 Path-based instructions (4)
packages/runtime/src/**/*.{ts,tsx}

📄 CodeRabbit inference engine (packages/runtime/AGENTS.md)

packages/runtime/src/**/*.{ts,tsx}: Use ctx.logger for logging instead of console.log
Use agent.validator() for automatic input validation in route handlers
Do NOT add type annotations to handler parameters - let TypeScript infer them from schemas
Every handler must receive AgentContext with logger, tracer, and storage utilities

Files:

  • packages/runtime/src/_standalone.ts
  • packages/runtime/src/agent.ts
  • packages/runtime/src/_context.ts
  • packages/runtime/src/workbench.ts
**/*.{ts,tsx,js,jsx,json}

📄 CodeRabbit inference engine (AGENTS.md)

Use Prettier for code formatting with tabs (width 3), single quotes, and semicolons

Files:

  • packages/runtime/src/_standalone.ts
  • packages/runtime/src/agent.ts
  • packages/runtime/src/_context.ts
  • packages/runtime/src/workbench.ts
**/*.{ts,tsx}

📄 CodeRabbit inference engine (AGENTS.md)

Use TypeScript strict mode with ESNext target and bundler moduleResolution

Files:

  • packages/runtime/src/_standalone.ts
  • packages/runtime/src/agent.ts
  • packages/runtime/src/_context.ts
  • packages/runtime/src/workbench.ts
**/*.{ts,tsx,js,jsx}

📄 CodeRabbit inference engine (AGENTS.md)

**/*.{ts,tsx,js,jsx}: Use StructuredError from @agentuity/core for error handling
Ensure all errors AND warnings must be zero before committing

Files:

  • packages/runtime/src/_standalone.ts
  • packages/runtime/src/agent.ts
  • packages/runtime/src/_context.ts
  • packages/runtime/src/workbench.ts
🧠 Learnings (2)
📚 Learning: 2025-12-13T14:15:18.261Z
Learnt from: jhaynie
Repo: agentuity/sdk PR: 168
File: packages/runtime/src/session.ts:536-546
Timestamp: 2025-12-13T14:15:18.261Z
Learning: The agentuity/runtime package is Bun-only; during code reviews, do not replace Bun-native APIs (e.g., Bun.CryptoHasher, Bun.serve, and other Bun namespace APIs) with Node.js alternatives. Review changes with the assumption that runtime runs on Bun, and ensure any edits preserve Bun compatibility and do not introduce Node.js-specific fallbacks. Apply this guidance broadly to files under packages/runtime (e.g., packages/runtime/src/...); if there are conditional environment checks, document why Bun is required and avoid dereferencing Bun-only APIs in non-Bun contexts.

Applied to files:

  • packages/runtime/src/_standalone.ts
  • packages/runtime/src/agent.ts
  • packages/runtime/src/_context.ts
  • packages/runtime/src/workbench.ts
📚 Learning: 2025-12-21T00:31:41.858Z
Learnt from: jhaynie
Repo: agentuity/sdk PR: 274
File: packages/cli/src/cmd/build/vite/server-bundler.ts:12-41
Timestamp: 2025-12-21T00:31:41.858Z
Learning: In Bun runtime, BuildMessage and ResolveMessage are global types and are not exported from the bun module. Do not import { BuildMessage } from 'bun' or similar; these types are available globally and should be used without import. This applies to all TypeScript files that target the Bun runtime within the repository.

Applied to files:

  • packages/runtime/src/_standalone.ts
  • packages/runtime/src/agent.ts
  • packages/runtime/src/_context.ts
  • packages/runtime/src/workbench.ts
🧬 Code graph analysis (3)
packages/runtime/src/_standalone.ts (1)
packages/runtime/src/agent.ts (1)
  • AgentMetadata (350-350)
packages/runtime/src/agent.ts (2)
packages/workbench/src/hooks/useAgentSchemas.ts (1)
  • AgentMetadata (16-24)
packages/cli/src/cmd/build/vite/agent-discovery.ts (1)
  • AgentMetadata (54-64)
packages/runtime/src/_context.ts (1)
packages/runtime/src/agent.ts (1)
  • AgentMetadata (350-350)
🔇 Additional comments (7)
packages/runtime/src/agent.ts (2)

290-307: Well-documented addition of agent metadata exposure.

The new current field provides agents with convenient access to their own metadata during execution. The documentation includes practical examples for namespaced state keys and logging, making the API easy to use.


1577-1578: Current agent metadata properly populated.

The assignment ensures that ctx.current is available throughout the agent handler's execution. The cast to any is acceptable for this dynamic property assignment.

packages/runtime/src/_standalone.ts (1)

9-9: Consistent addition of current metadata field.

The current field is properly declared with the definite assignment assertion (!) and will be populated by the agent handler during execution (via line 1578 in agent.ts). This maintains consistency with RequestAgentContext.

Also applies to: 108-108

packages/runtime/src/_context.ts (1)

11-11: Consistent addition of current metadata field.

The current field is properly declared with the definite assignment assertion (!) and follows the same pattern as StandaloneAgentContext. The field will be populated during agent handler execution.

Also applies to: 54-54

packages/runtime/src/workbench.ts (3)

93-103: LGTM! Async thread state API correctly implemented.

The changes properly use await for async state operations with a capped message history. The conditional output storage (only when result is not null/undefined) is correct.


165-177: LGTM! Agent-specific state clearing correctly implemented.

The async iteration through all keys with conditional deletion properly clears both the message history (messages_${agentId}) and agent-specific state keys (${agentId}_ prefix). The await operations are correctly placed.


222-222: LGTM! State retrieval correctly updated to async.

The await is properly added for the async state.get() operation.

Change Meta+a to ControlOrMeta+A for select-all in Monaco editor,
making the test work on Windows/Linux and macOS.
Add delay after agent selection to allow metadata to load,
and increase Monaco visibility timeout from 10s to 20s for CI.
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
e2e/workbench.pw.ts (1)

107-107: Consider replacing hard-coded timeouts with condition-based waits.

Lines 107, 113, and 166 use waitForTimeout() with hard-coded delays. While these are acceptable for CI stability (as noted in comments), consider waiting for specific UI state changes or network requests where possible. For example, line 107 could wait for the Monaco editor to be enabled or for a specific metadata-loaded indicator rather than using a fixed 2-second delay.

That said, the current approach with generous timeouts is pragmatic for CI environments and adequately documented.

Also applies to: 113-113, 166-166

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 47b17df and 6313bee.

📒 Files selected for processing (1)
  • e2e/workbench.pw.ts
🧰 Additional context used
📓 Path-based instructions (3)
**/*.{ts,tsx,js,jsx,json}

📄 CodeRabbit inference engine (AGENTS.md)

Use Prettier for code formatting with tabs (width 3), single quotes, and semicolons

Files:

  • e2e/workbench.pw.ts
**/*.{ts,tsx}

📄 CodeRabbit inference engine (AGENTS.md)

Use TypeScript strict mode with ESNext target and bundler moduleResolution

Files:

  • e2e/workbench.pw.ts
**/*.{ts,tsx,js,jsx}

📄 CodeRabbit inference engine (AGENTS.md)

**/*.{ts,tsx,js,jsx}: Use StructuredError from @agentuity/core for error handling
Ensure all errors AND warnings must be zero before committing

Files:

  • e2e/workbench.pw.ts
🧠 Learnings (1)
📚 Learning: 2025-12-21T00:31:41.858Z
Learnt from: jhaynie
Repo: agentuity/sdk PR: 274
File: packages/cli/src/cmd/build/vite/server-bundler.ts:12-41
Timestamp: 2025-12-21T00:31:41.858Z
Learning: In Bun runtime, BuildMessage and ResolveMessage are global types and are not exported from the bun module. Do not import { BuildMessage } from 'bun' or similar; these types are available globally and should be used without import. This applies to all TypeScript files that target the Bun runtime within the repository.

Applied to files:

  • e2e/workbench.pw.ts
🔇 Additional comments (3)
e2e/workbench.pw.ts (3)

46-49: LGTM! Agent selector locator updated correctly.

The updated locator properly includes the new "counter" agent option and handles all possible button states (no selection, "hello", or "counter" selected).


122-130: LGTM! Monaco editor helper is well-implemented.

The typeInMonaco helper correctly uses ControlOrMeta+A for cross-platform select-all functionality (as addressed in the previous review). The interaction pattern (click to focus, select all, delete, type) is appropriate for Monaco editor manipulation.


132-158: LGTM! State persistence validation is thorough and well-structured.

The test correctly validates incremental counter state (1 → 2 → 3) with consistent assertion patterns. Good use of .first() to handle text appearing in multiple UI locations and appropriate timeouts for CI environments.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (1)
e2e/workbench.pw.ts (1)

184-186: Use consistent assertion pattern for better maintainability.

Line 186 uses page.waitForSelector('text="count": 1', ...) which differs from the pattern used on lines 153, 161, and 169. For consistency and clearer test output, use the same assertion style:

-		// After clearing, the count should reset to 1
-		// Wait for any visible element with "count": 1
-		await page.waitForSelector('text="count": 1', { state: 'visible', timeout: 15000 });
+		// After clearing, the count should reset to 1
+		await expect(page.getByText('"count": 1').first()).toBeVisible({ timeout: 15000 });

This also provides better assertion failure messages through Playwright's expect API.

🧹 Nitpick comments (2)
e2e/workbench.pw.ts (2)

105-118: Consider replacing fixed wait with condition-based waiting.

The waitForTimeout(2000) on line 107 is an arbitrary delay that can cause flakiness—tests may fail if Monaco takes longer in CI, or waste time if it loads faster. Consider polling for a ready state instead.

That said, the subsequent expect(monacoEditor).toBeVisible({ timeout: 20000 }) on line 130 does provide a proper wait, so this fixed delay may be acceptable as a stabilization buffer before the check.

🔎 Suggested improvement
-		// Wait for the workbench to fetch agent metadata and render the input UI
-		// Monaco editor needs schema data to render, which may take time in CI
-		await page.waitForTimeout(2000);
+		// Wait for Monaco editor to be ready (it needs schema data from agent metadata)
+		await page.waitForFunction(() => {
+			return document.querySelector('.monaco-editor') !== null;
+		}, { timeout: 20000 });

This waits only as long as needed rather than a fixed 2 seconds.


176-178: Consider waiting for a specific condition after clearing.

The waitForTimeout(1000) is another arbitrary delay. If the Clear Thread action produces an observable UI change (e.g., a toast notification, button state change, or cleared output), waiting for that condition would be more reliable.

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6313bee and 66205cd.

📒 Files selected for processing (1)
  • e2e/workbench.pw.ts
🧰 Additional context used
📓 Path-based instructions (3)
**/*.{ts,tsx,js,jsx,json}

📄 CodeRabbit inference engine (AGENTS.md)

Use Prettier for code formatting with tabs (width 3), single quotes, and semicolons

Files:

  • e2e/workbench.pw.ts
**/*.{ts,tsx}

📄 CodeRabbit inference engine (AGENTS.md)

Use TypeScript strict mode with ESNext target and bundler moduleResolution

Files:

  • e2e/workbench.pw.ts
**/*.{ts,tsx,js,jsx}

📄 CodeRabbit inference engine (AGENTS.md)

**/*.{ts,tsx,js,jsx}: Use StructuredError from @agentuity/core for error handling
Ensure all errors AND warnings must be zero before committing

Files:

  • e2e/workbench.pw.ts
🧠 Learnings (1)
📚 Learning: 2025-12-21T00:31:41.858Z
Learnt from: jhaynie
Repo: agentuity/sdk PR: 274
File: packages/cli/src/cmd/build/vite/server-bundler.ts:12-41
Timestamp: 2025-12-21T00:31:41.858Z
Learning: In Bun runtime, BuildMessage and ResolveMessage are global types and are not exported from the bun module. Do not import { BuildMessage } from 'bun' or similar; these types are available globally and should be used without import. This applies to all TypeScript files that target the Bun runtime within the repository.

Applied to files:

  • e2e/workbench.pw.ts
🔇 Additional comments (2)
e2e/workbench.pw.ts (2)

46-50: LGTM!

The agent selector locator correctly includes the new "counter" option alongside existing options, maintaining consistency with the test patterns.


132-141: LGTM!

The typeInMonaco helper is well-structured with proper cross-platform keyboard handling (ControlOrMeta+A). The approach of click → select all → delete → type is a reliable pattern for Monaco editor interaction.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 66205cd and c0a48ef.

📒 Files selected for processing (1)
  • packages/runtime/src/workbench.ts
🧰 Additional context used
📓 Path-based instructions (4)
packages/runtime/src/**/*.{ts,tsx}

📄 CodeRabbit inference engine (packages/runtime/AGENTS.md)

packages/runtime/src/**/*.{ts,tsx}: Use ctx.logger for logging instead of console.log
Use agent.validator() for automatic input validation in route handlers
Do NOT add type annotations to handler parameters - let TypeScript infer them from schemas
Every handler must receive AgentContext with logger, tracer, and storage utilities

Files:

  • packages/runtime/src/workbench.ts
**/*.{ts,tsx,js,jsx,json}

📄 CodeRabbit inference engine (AGENTS.md)

Use Prettier for code formatting with tabs (width 3), single quotes, and semicolons

Files:

  • packages/runtime/src/workbench.ts
**/*.{ts,tsx}

📄 CodeRabbit inference engine (AGENTS.md)

Use TypeScript strict mode with ESNext target and bundler moduleResolution

Files:

  • packages/runtime/src/workbench.ts
**/*.{ts,tsx,js,jsx}

📄 CodeRabbit inference engine (AGENTS.md)

**/*.{ts,tsx,js,jsx}: Use StructuredError from @agentuity/core for error handling
Ensure all errors AND warnings must be zero before committing

Files:

  • packages/runtime/src/workbench.ts
🧠 Learnings (2)
📚 Learning: 2025-12-13T14:15:18.261Z
Learnt from: jhaynie
Repo: agentuity/sdk PR: 168
File: packages/runtime/src/session.ts:536-546
Timestamp: 2025-12-13T14:15:18.261Z
Learning: The agentuity/runtime package is Bun-only; during code reviews, do not replace Bun-native APIs (e.g., Bun.CryptoHasher, Bun.serve, and other Bun namespace APIs) with Node.js alternatives. Review changes with the assumption that runtime runs on Bun, and ensure any edits preserve Bun compatibility and do not introduce Node.js-specific fallbacks. Apply this guidance broadly to files under packages/runtime (e.g., packages/runtime/src/...); if there are conditional environment checks, document why Bun is required and avoid dereferencing Bun-only APIs in non-Bun contexts.

Applied to files:

  • packages/runtime/src/workbench.ts
📚 Learning: 2025-12-21T00:31:41.858Z
Learnt from: jhaynie
Repo: agentuity/sdk PR: 274
File: packages/cli/src/cmd/build/vite/server-bundler.ts:12-41
Timestamp: 2025-12-21T00:31:41.858Z
Learning: In Bun runtime, BuildMessage and ResolveMessage are global types and are not exported from the bun module. Do not import { BuildMessage } from 'bun' or similar; these types are available globally and should be used without import. This applies to all TypeScript files that target the Bun runtime within the repository.

Applied to files:

  • packages/runtime/src/workbench.ts
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Cloud Deployment Tests
  • GitHub Check: Playwright E2E Smoke Test
🔇 Additional comments (3)
packages/runtime/src/workbench.ts (3)

93-103: LGTM! Async thread state with capped history.

The shift to asynchronous push operations with a 50-message cap aligns with the PR objectives for persistent thread state. The logic correctly awaits each operation and only stores output when result is non-null.


165-176: LGTM! Comprehensive agent-specific state cleanup.

The dynamic purge correctly clears both workbench message history (messages_${agentId}) and all agent-specific state keys (${agentId}_*). This is more thorough than the previous single-key deletion and properly uses the async state API.


222-222: LGTM! Correct async state retrieval.

Properly uses await for the async thread state API.

Comment on lines 464 to 467
// Debug: Log registered agents
console.log(
`[workbench] Registered agents: [${Array.from(agents.keys()).join(', ')}], metadata agents: [${(metadata.agents || []).map((a) => a.name).join(', ')}]`
);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Replace console.log with ctx.var.logger for diagnostic logging.

While the diagnostic logging is valuable for troubleshooting, it should follow the coding guideline to use ctx.var.logger instead of console.log. This ensures logs are properly captured by the logging infrastructure.

As per coding guidelines: Use ctx.logger for logging instead of console.log in packages/runtime/src/**/*.{ts,tsx}.

🔎 Proposed fix
-			// Debug: Log registered agents
-			console.log(
-				`[workbench] Registered agents: [${Array.from(agents.keys()).join(', ')}], metadata agents: [${(metadata.agents || []).map((a) => a.name).join(', ')}]`
-			);
-
+			// Debug: Log registered agents
+			ctx.var.logger?.debug(
+				`[workbench] Registered agents: [${Array.from(agents.keys()).join(', ')}], metadata agents: [${(metadata.agents || []).map((a) => a.name).join(', ')}]`
+			);
+

Based on coding guidelines.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
// Debug: Log registered agents
console.log(
`[workbench] Registered agents: [${Array.from(agents.keys()).join(', ')}], metadata agents: [${(metadata.agents || []).map((a) => a.name).join(', ')}]`
);
// Debug: Log registered agents
ctx.var.logger?.debug(
`[workbench] Registered agents: [${Array.from(agents.keys()).join(', ')}], metadata agents: [${(metadata.agents || []).map((a) => a.name).join(', ')}]`
);
🤖 Prompt for AI Agents
In packages/runtime/src/workbench.ts around lines 464 to 467, replace the
diagnostic console.log call with the runtime logger: use ctx.var.logger (or
ctx.logger per surrounding code) to emit the same message and data instead of
console.log so logs integrate with the logging infrastructure; construct the log
text or pass structured fields (agents list and metadata agents names) to
ctx.var.logger at an appropriate level (debug/info) and remove the console.log
call.

jhaynie and others added 2 commits January 1, 2026 00:17
The toJSONSchema function was using instanceof checks which fail
when code is bundled or when multiple copies of the package exist.

This caused the workbench to show 'This agent has no input schema'
because the JSON schema was returning {} for all schema types.

Changed to use constructor.name string comparisons which work
across module boundaries in bundled code. Also added support for:
- CoerceStringSchema, CoerceNumberSchema, CoerceBooleanSchema
- CoerceDateSchema (represented as string with date-time format)
- RecordSchema (object with additionalProperties)
- UnknownSchema and AnySchema (empty schema - accepts any value)

Amp-Thread-ID: https://ampcode.com/threads/T-019b7839-038e-712a-a4bb-14d15cfa04a7
Co-authored-by: Amp <amp@ampcode.com>
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
packages/schema/src/json-schema.ts (2)

26-37: Extend JSONSchema interface to avoid type assertions.

The interface is missing additionalProperties and format properties that are used at lines 170 and 184 with type assertions. Extending the interface improves type safety and eliminates the need for as any casts.

🔎 Proposed extension to JSONSchema interface
 export interface JSONSchema {
 	type?: 'string' | 'number' | 'integer' | 'boolean' | 'object' | 'array' | 'null';
 	description?: string;
 	const?: string | number | boolean;
 	enum?: Array<string | number | boolean>;
 	properties?: Record<string, JSONSchema>;
 	required?: string[];
 	items?: JSONSchema;
 	anyOf?: JSONSchema[];
 	oneOf?: JSONSchema[];
 	allOf?: JSONSchema[];
+	additionalProperties?: JSONSchema | boolean;
+	format?: string;
 }

Then remove the type assertions at lines 170 and 184:

-		(result as any).additionalProperties = toJSONSchema(valueSchema);
+		result.additionalProperties = toJSONSchema(valueSchema);
-		(result as any).format = 'date-time';
+		result.format = 'date-time';

40-56: Consider updating documentation to reflect new schema types.

The JSDoc comment could mention the newly supported schema types (Record, Unknown, Any, and CoerceDate) for completeness.

📝 Suggested documentation update
 /**
  * Convert a schema to JSON Schema format.
- * Supports primitives, objects, arrays, unions, literals, optional, and nullable types.
+ * Supports primitives (including coerce variants), objects, arrays, records, unions, literals, 
+ * optional, nullable, unknown, any, and date types.
  *
  * @param schema - The schema to convert
  * @returns JSON Schema object
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 077ec17 and 945f067.

📒 Files selected for processing (1)
  • packages/schema/src/json-schema.ts
🧰 Additional context used
📓 Path-based instructions (5)
packages/schema/src/**/*.ts

📄 CodeRabbit inference engine (packages/schema/AGENTS.md)

packages/schema/src/**/*.ts: All schemas must implement StandardSchemaV1 from @agentuity/core
Use '~standard' property for StandardSchema interface
Error messages should be clear and actionable
Support type inference via Infer<T> utility type
No side effects - pure validation functions only
Runtime must be Node/Bun/Browser compatible with no runtime-specific code
Use fluent API with chainable methods where appropriate for schema builders

Files:

  • packages/schema/src/json-schema.ts
packages/schema/src/json-schema.ts

📄 CodeRabbit inference engine (packages/schema/AGENTS.md)

Provide JSON Schema conversion utilities via json-schema.ts

Files:

  • packages/schema/src/json-schema.ts
**/*.{ts,tsx,js,jsx,json}

📄 CodeRabbit inference engine (AGENTS.md)

Use Prettier for code formatting with tabs (width 3), single quotes, and semicolons

Files:

  • packages/schema/src/json-schema.ts
**/*.{ts,tsx}

📄 CodeRabbit inference engine (AGENTS.md)

Use TypeScript strict mode with ESNext target and bundler moduleResolution

Files:

  • packages/schema/src/json-schema.ts
**/*.{ts,tsx,js,jsx}

📄 CodeRabbit inference engine (AGENTS.md)

**/*.{ts,tsx,js,jsx}: Use StructuredError from @agentuity/core for error handling
Ensure all errors AND warnings must be zero before committing

Files:

  • packages/schema/src/json-schema.ts
🧠 Learnings (1)
📚 Learning: 2025-12-21T00:31:41.858Z
Learnt from: jhaynie
Repo: agentuity/sdk PR: 274
File: packages/cli/src/cmd/build/vite/server-bundler.ts:12-41
Timestamp: 2025-12-21T00:31:41.858Z
Learning: In Bun runtime, BuildMessage and ResolveMessage are global types and are not exported from the bun module. Do not import { BuildMessage } from 'bun' or similar; these types are available globally and should be used without import. This applies to all TypeScript files that target the Bun runtime within the repository.

Applied to files:

  • packages/schema/src/json-schema.ts
🧬 Code graph analysis (1)
packages/schema/src/json-schema.ts (1)
packages/schema/src/base.ts (1)
  • Schema (92-106)
🔇 Additional comments (4)
packages/schema/src/json-schema.ts (4)

66-90: LGTM! Primitive type handling is correct.

The consistent use of isSchemaType for both base and coerce variants properly handles schema type detection. The special handling of UndefinedSchema (returning empty object) is appropriate since JSON Schema doesn't have a direct undefined type.


93-162: LGTM! Complex type conversions are well-implemented.

The handling of literal, object, array, optional, nullable, and union types correctly uses the isSchemaType helper and follows JSON Schema conventions. The nullable pattern using anyOf with null is the standard approach.


164-186: Good additions for extended schema type support.

The new handlers for RecordSchema, UnknownSchema/AnySchema, and CoerceDateSchema correctly map to their JSON Schema equivalents:

  • RecordSchema → object with additionalProperties
  • Unknown/Any → empty (permissive) schema
  • CoerceDate → string with date-time format

Note: The type assertions at lines 170 and 184 should be eliminated by extending the JSONSchema interface (addressed in a previous comment).


14-20: The constructor.name approach in the isSchemaType helper is safe and does not require changes. The project's build pipeline uses pure TypeScript compilation (bunx tsc --build) with no minifiers configured, so constructor names will be preserved in production builds.

Likely an incorrect or invalid review comment.

The 'hello agent' test was flaky because clicking on dropdown options
was failing due to animations making the element unstable. Using
keyboard navigation (type to filter + Enter) is more reliable.
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (1)
e2e/workbench.pw.ts (1)

196-198: Use consistent assertion pattern.

Line 198 uses waitForSelector while the rest of the test uses expect(page.getByText(...).first()).toBeVisible(). For consistency and to handle multiple matches:

-		// Wait for any visible element with "count": 1
-		await page.waitForSelector('text="count": 1', { state: 'visible', timeout: 15000 });
+		// After clearing, the count should reset to 1
+		await expect(page.getByText('"count": 1').first()).toBeVisible({ timeout: 15000 });
🧹 Nitpick comments (1)
e2e/workbench.pw.ts (1)

104-106: Consider using keyboard navigation for consistency.

The "hello agent" test (lines 58-62) uses keyboard navigation to select the option because it's more reliable with animated dropdowns. This test uses a direct click instead, which may be less stable.

🔎 Proposed fix for consistent selection
 		const counterAgentOption = page.locator('[role="option"]:has-text("counter")');
 		await expect(counterAgentOption).toBeVisible({ timeout: 5000 });
-		await counterAgentOption.click();
+		await page.keyboard.type('counter', { delay: 50 });
+		await page.waitForTimeout(200);
+		await page.keyboard.press('Enter');
📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 945f067 and d8b8539.

📒 Files selected for processing (1)
  • e2e/workbench.pw.ts
🧰 Additional context used
📓 Path-based instructions (3)
**/*.{ts,tsx,js,jsx,json}

📄 CodeRabbit inference engine (AGENTS.md)

Use Prettier for code formatting with tabs (width 3), single quotes, and semicolons

Files:

  • e2e/workbench.pw.ts
**/*.{ts,tsx}

📄 CodeRabbit inference engine (AGENTS.md)

Use TypeScript strict mode with ESNext target and bundler moduleResolution

Files:

  • e2e/workbench.pw.ts
**/*.{ts,tsx,js,jsx}

📄 CodeRabbit inference engine (AGENTS.md)

**/*.{ts,tsx,js,jsx}: Use StructuredError from @agentuity/core for error handling
Ensure all errors AND warnings must be zero before committing

Files:

  • e2e/workbench.pw.ts
🧠 Learnings (1)
📚 Learning: 2025-12-21T00:31:41.858Z
Learnt from: jhaynie
Repo: agentuity/sdk PR: 274
File: packages/cli/src/cmd/build/vite/server-bundler.ts:12-41
Timestamp: 2025-12-21T00:31:41.858Z
Learning: In Bun runtime, BuildMessage and ResolveMessage are global types and are not exported from the bun module. Do not import { BuildMessage } from 'bun' or similar; these types are available globally and should be used without import. This applies to all TypeScript files that target the Bun runtime within the repository.

Applied to files:

  • e2e/workbench.pw.ts
🔇 Additional comments (3)
e2e/workbench.pw.ts (3)

46-50: LGTM!

The agent selector update correctly handles the new counter agent option alongside hello.


54-64: LGTM!

Keyboard navigation is indeed more reliable for animated dropdowns. The approach of typing to filter and pressing Enter is a solid pattern for Playwright tests.


144-165: LGTM!

The typeInMonaco helper is well-structured with cross-platform keyboard handling. The assertions correctly use .first() to handle cases where text appears in multiple places.

Wait for cmdk-list to appear and add a delay for animations before
using force: true click to bypass stability checks. This should make
the dropdown selection more reliable in CI.
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (1)
e2e/workbench.pw.ts (1)

196-198: Use consistent assertion pattern.

Line 198 uses waitForSelector while the rest of the test (lines 165, 173, 181) uses the expect().toBeVisible() pattern. Apply the same pattern for consistency.

🔎 Proposed fix
 		// After clearing, the count should reset to 1
-		// Wait for any visible element with "count": 1
-		await page.waitForSelector('text="count": 1', { state: 'visible', timeout: 15000 });
+		await expect(page.getByText('"count": 1').first()).toBeVisible({ timeout: 15000 });
🧹 Nitpick comments (2)
e2e/workbench.pw.ts (2)

102-106: Missing dropdown stabilization wait.

The agent selector test (lines 54-62) waits for [cmdk-list] and adds an animation delay before clicking, but this test clicks the option immediately. Apply the same stabilization pattern for consistency and CI reliability.

🔎 Proposed fix
 		await expect(agentSelector).toBeVisible({ timeout: 10000 });
 		await agentSelector.click();
 
+		// Wait for the dropdown to be fully opened - cmdk uses data-state attribute
+		await page.waitForSelector('[cmdk-list]', { timeout: 5000 });
+		// Give animation time to settle
+		await page.waitForTimeout(300);
+
 		const counterAgentOption = page.locator('[role="option"]:has-text("counter")');
 		await expect(counterAgentOption).toBeVisible({ timeout: 5000 });
-		await counterAgentOption.click();
+		await counterAgentOption.click({ force: true });

188-189: Consider replacing hardcoded wait with condition-based assertion.

Using waitForTimeout(1000) can lead to flakiness. Consider waiting for a visible UI change that indicates the clear completed, or at minimum, this timeout may need adjustment based on CI performance.

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d8b8539 and 1d33f98.

📒 Files selected for processing (1)
  • e2e/workbench.pw.ts
🧰 Additional context used
📓 Path-based instructions (3)
**/*.{ts,tsx,js,jsx,json}

📄 CodeRabbit inference engine (AGENTS.md)

Use Prettier for code formatting with tabs (width 3), single quotes, and semicolons

Files:

  • e2e/workbench.pw.ts
**/*.{ts,tsx}

📄 CodeRabbit inference engine (AGENTS.md)

Use TypeScript strict mode with ESNext target and bundler moduleResolution

Files:

  • e2e/workbench.pw.ts
**/*.{ts,tsx,js,jsx}

📄 CodeRabbit inference engine (AGENTS.md)

**/*.{ts,tsx,js,jsx}: Use StructuredError from @agentuity/core for error handling
Ensure all errors AND warnings must be zero before committing

Files:

  • e2e/workbench.pw.ts
🧠 Learnings (1)
📚 Learning: 2025-12-21T00:31:41.858Z
Learnt from: jhaynie
Repo: agentuity/sdk PR: 274
File: packages/cli/src/cmd/build/vite/server-bundler.ts:12-41
Timestamp: 2025-12-21T00:31:41.858Z
Learning: In Bun runtime, BuildMessage and ResolveMessage are global types and are not exported from the bun module. Do not import { BuildMessage } from 'bun' or similar; these types are available globally and should be used without import. This applies to all TypeScript files that target the Bun runtime within the repository.

Applied to files:

  • e2e/workbench.pw.ts
🔇 Additional comments (2)
e2e/workbench.pw.ts (2)

54-62: Good stabilization for animated dropdown.

The approach of waiting for the cmdk-list to appear, adding an animation delay, and using force: true is a pragmatic solution for CI reliability with animated dropdowns. This pattern effectively addresses timing issues without excessive complexity.


144-153: Well-structured helper for Monaco editor interaction.

The typeInMonaco helper encapsulates editor interaction cleanly. Using ControlOrMeta+A correctly handles cross-platform compatibility for the select-all shortcut.

@jhaynie jhaynie force-pushed the fix/workbench-config-and-agent-registry branch from e4a6b40 to e8f07dc Compare January 1, 2026 06:43
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (1)
e2e/workbench.pw.ts (1)

176-176: Use consistent assertion pattern for better maintainability.

This line uses waitForSelector('text="count": 1', ...) which differs from the pattern established at lines 143, 151, and 159. The inconsistency makes the test harder to maintain.

Use the same expect(page.getByText(...).first()).toBeVisible() pattern for consistency.

Per the existing review comment on this line, this should be:

await expect(page.getByText('"count": 1').first()).toBeVisible({ timeout: 15000 });
🧹 Nitpick comments (4)
e2e/workbench.pw.ts (4)

54-62: Consider replacing fixed animation timeout with condition-based wait.

Line 57 uses a hardcoded 300ms timeout to wait for dropdown animation. Fixed timeouts make tests slower and less reliable—they either wait too long (wasting CI time) or too short (flaky failures).

Consider waiting for a specific animation completion state instead, such as checking for the data-state="open" attribute on the dropdown element or verifying the option is stable using Playwright's built-in actionability checks (which handle animations automatically).

🔎 Alternative approach
 // Wait for the dropdown to be fully opened - cmdk uses data-state attribute
 await page.waitForSelector('[cmdk-list]', { timeout: 5000 });
-// Give animation time to settle
-await page.waitForTimeout(300);
 
-// Click with force to bypass animation stability checks
 const helloAgentOption = page.locator('[role="option"]:has-text("hello")');
 await expect(helloAgentOption).toBeVisible({ timeout: 5000 });
-await helloAgentOption.click({ force: true });
+await helloAgentOption.click();

If the click still fails without force: true, investigate whether cmdk sets a data-state attribute that indicates animation completion and wait for that instead.


110-115: Consider more specific error handling for optional clear step.

Line 112 uses .catch(() => false) to make the clear step optional, which suppresses all errors—not just timeouts. If isVisible() fails due to page navigation errors or other issues, these will be silently ignored, potentially hiding real problems.

For optional cleanup in tests, consider explicitly checking element existence first or using a try/catch block to log unexpected errors.

🔎 More explicit pattern
 // Clear any existing thread state first
 const clearButton = page.locator('button:has-text("Clear Thread")');
-if (await clearButton.isVisible({ timeout: 2000 }).catch(() => false)) {
+try {
+	if (await clearButton.isVisible({ timeout: 2000 })) {
+		await clearButton.click();
+		await page.waitForTimeout(500);
+	}
+} catch (err) {
+	// Clear button not found - OK, starting fresh
+	console.log('Clear button not visible, skipping clear:', err);
-	await clearButton.click();
-	await page.waitForTimeout(500);
 }

122-131: Add verification that JSON was successfully typed into Monaco editor.

The typeInMonaco helper types content via keyboard but doesn't verify the text was actually entered. In CI environments or with slow Monaco initialization, keyboard events might not be reliably processed, leading to silent test failures where the wrong content (or no content) is submitted.

Add a verification step after typing to ensure the editor contains the expected JSON before the caller proceeds to submit.

🔎 Add content verification
 // Helper function to type JSON into Monaco editor
 const typeInMonaco = async (json: string) => {
 	// Click into the Monaco editor to focus it
 	await monacoEditor.click();
 	// Select all and delete existing content (use ControlOrMeta for cross-platform)
 	await page.keyboard.press('ControlOrMeta+A');
 	await page.keyboard.press('Backspace');
 	// Type the new content
 	await page.keyboard.type(json);
+	// Verify the content was entered
+	await expect(page.getByText(json).first()).toBeVisible({ timeout: 5000 });
 };

Note: If the exact text isn't visible (Monaco may format it), consider using page.evaluate() to read the editor's model content directly via Monaco's API.


166-167: Replace fixed timeout with condition-based wait after Clear Thread.

Line 167 uses a hardcoded 1000ms timeout to wait for the clear operation to complete. Fixed timeouts are unreliable—they may be too short on slow CI runners or waste time on fast machines.

Wait for a specific condition that indicates the clear completed, such as the output section being empty or a loading indicator disappearing.

🔎 Wait for clear completion signal
 // Click the Clear Thread button to reset the agent's thread state
 await expect(clearButton).toBeVisible({ timeout: 5000 });
 await clearButton.click();
 
-// Wait for the clear to complete and UI to update
-await page.waitForTimeout(1000);
+// Wait for the output section to clear (previous count text should disappear)
+await expect(page.getByText('"count": 3')).not.toBeVisible({ timeout: 5000 });

If there's no reliable DOM change, at minimum document why the fixed timeout is necessary.

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1d33f98 and e4a6b40.

📒 Files selected for processing (2)
  • e2e/workbench.pw.ts
  • packages/runtime/src/workbench.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • packages/runtime/src/workbench.ts
🧰 Additional context used
📓 Path-based instructions (3)
**/*.{ts,tsx,js,jsx,json}

📄 CodeRabbit inference engine (AGENTS.md)

Use Prettier for code formatting with tabs (width 3), single quotes, and semicolons

Files:

  • e2e/workbench.pw.ts
**/*.{ts,tsx}

📄 CodeRabbit inference engine (AGENTS.md)

Use TypeScript strict mode with ESNext target and bundler moduleResolution

Files:

  • e2e/workbench.pw.ts
**/*.{ts,tsx,js,jsx}

📄 CodeRabbit inference engine (AGENTS.md)

**/*.{ts,tsx,js,jsx}: Use StructuredError from @agentuity/core for error handling
Ensure all errors AND warnings must be zero before committing

Files:

  • e2e/workbench.pw.ts
🧠 Learnings (1)
📚 Learning: 2025-12-21T00:31:41.858Z
Learnt from: jhaynie
Repo: agentuity/sdk PR: 274
File: packages/cli/src/cmd/build/vite/server-bundler.ts:12-41
Timestamp: 2025-12-21T00:31:41.858Z
Learning: In Bun runtime, BuildMessage and ResolveMessage are global types and are not exported from the bun module. Do not import { BuildMessage } from 'bun' or similar; these types are available globally and should be used without import. This applies to all TypeScript files that target the Bun runtime within the repository.

Applied to files:

  • e2e/workbench.pw.ts

@jhaynie jhaynie merged commit f6b2780 into main Jan 1, 2026
8 of 9 checks passed
@jhaynie jhaynie deleted the fix/workbench-config-and-agent-registry branch January 1, 2026 06:47
Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
e2e/workbench.pw.ts (1)

54-62: Consider verifying if force: true is still necessary.

The code waits for the dropdown animation to settle before clicking (lines 54-58), then uses force: true to bypass Playwright's actionability checks (line 62). While this improves test stability, forcing clicks can mask real interaction issues that users might encounter.

If the animation wait is sufficient, try removing force: true to ensure the element is genuinely clickable. If force is still needed, consider adding a comment explaining why.

🔎 Optional refactor to validate necessity of force click
 	// Wait for the dropdown to be fully opened - cmdk uses data-state attribute
 	await page.waitForSelector('[cmdk-list]', { timeout: 5000 });
 	// Give animation time to settle
 	await page.waitForTimeout(300);
 
-	// Click with force to bypass animation stability checks
 	const helloAgentOption = page.locator('[role="option"]:has-text("hello")');
 	await expect(helloAgentOption).toBeVisible({ timeout: 5000 });
-	await helloAgentOption.click({ force: true });
+	await helloAgentOption.click();

If tests remain stable without force: true, the simpler version is preferable. If force is still required, document why with a comment.

📜 Review details

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e4a6b40 and e8f07dc.

📒 Files selected for processing (2)
  • e2e/workbench.pw.ts
  • packages/runtime/src/workbench.ts
🧰 Additional context used
📓 Path-based instructions (4)
packages/runtime/src/**/*.{ts,tsx}

📄 CodeRabbit inference engine (packages/runtime/AGENTS.md)

packages/runtime/src/**/*.{ts,tsx}: Use ctx.logger for logging instead of console.log
Use agent.validator() for automatic input validation in route handlers
Do NOT add type annotations to handler parameters - let TypeScript infer them from schemas
Every handler must receive AgentContext with logger, tracer, and storage utilities

Files:

  • packages/runtime/src/workbench.ts
**/*.{ts,tsx,js,jsx,json}

📄 CodeRabbit inference engine (AGENTS.md)

Use Prettier for code formatting with tabs (width 3), single quotes, and semicolons

Files:

  • packages/runtime/src/workbench.ts
  • e2e/workbench.pw.ts
**/*.{ts,tsx}

📄 CodeRabbit inference engine (AGENTS.md)

Use TypeScript strict mode with ESNext target and bundler moduleResolution

Files:

  • packages/runtime/src/workbench.ts
  • e2e/workbench.pw.ts
**/*.{ts,tsx,js,jsx}

📄 CodeRabbit inference engine (AGENTS.md)

**/*.{ts,tsx,js,jsx}: Use StructuredError from @agentuity/core for error handling
Ensure all errors AND warnings must be zero before committing

Files:

  • packages/runtime/src/workbench.ts
  • e2e/workbench.pw.ts
🧠 Learnings (2)
📚 Learning: 2025-12-13T14:15:18.261Z
Learnt from: jhaynie
Repo: agentuity/sdk PR: 168
File: packages/runtime/src/session.ts:536-546
Timestamp: 2025-12-13T14:15:18.261Z
Learning: The agentuity/runtime package is Bun-only; during code reviews, do not replace Bun-native APIs (e.g., Bun.CryptoHasher, Bun.serve, and other Bun namespace APIs) with Node.js alternatives. Review changes with the assumption that runtime runs on Bun, and ensure any edits preserve Bun compatibility and do not introduce Node.js-specific fallbacks. Apply this guidance broadly to files under packages/runtime (e.g., packages/runtime/src/...); if there are conditional environment checks, document why Bun is required and avoid dereferencing Bun-only APIs in non-Bun contexts.

Applied to files:

  • packages/runtime/src/workbench.ts
📚 Learning: 2025-12-21T00:31:41.858Z
Learnt from: jhaynie
Repo: agentuity/sdk PR: 274
File: packages/cli/src/cmd/build/vite/server-bundler.ts:12-41
Timestamp: 2025-12-21T00:31:41.858Z
Learning: In Bun runtime, BuildMessage and ResolveMessage are global types and are not exported from the bun module. Do not import { BuildMessage } from 'bun' or similar; these types are available globally and should be used without import. This applies to all TypeScript files that target the Bun runtime within the repository.

Applied to files:

  • packages/runtime/src/workbench.ts
  • e2e/workbench.pw.ts
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Playwright E2E Smoke Test
  • GitHub Check: Cloud Deployment Tests
🔇 Additional comments (5)
e2e/workbench.pw.ts (3)

110-115: Good practice: Idempotent test setup with conditional state clearing.

The optional clear pattern ensures tests start from a clean state regardless of previous runs. The isVisible().catch(() => false) idiom safely handles cases where the Clear Thread button doesn't exist yet.


122-131: Well done: Cross-platform keyboard shortcut implemented correctly.

The helper function uses ControlOrMeta+A for select-all (line 127), which correctly resolves to Command on macOS and Control on Windows/Linux. This addresses the previous review concern about platform compatibility. The helper also eliminates code duplication for repeated Monaco editor interactions.


133-175: Excellent: Comprehensive state persistence testing with consistent assertions.

The test systematically verifies thread state persistence through:

  1. Three successive increments (count: 1 → 2 → 3) demonstrating state retention
  2. Clear Thread functionality to reset state
  3. Post-clear increment verifying fresh start (count: 1 again)

All assertions use the consistent pattern page.getByText(...).first().toBeVisible() (lines 143, 151, 159, 175), which handles multiple text occurrences and matches the established convention. This resolves the previous review comment about assertion consistency.

packages/runtime/src/workbench.ts (2)

222-222: LGTM! Correct async state retrieval.

The conversion to async get() is straightforward and correct. The result is properly awaited and safely handled with the Array.isArray check before returning.


93-103: The thread state push API signature is correctly used.

The async conversion and capped message history (50 messages) align well with the PR objectives. The push() method signature is push<T>(key: string, value: T, maxRecords?: number): Promise<void>, and the implementation correctly applies the maxRecords limit as a sliding window—keeping only the most recent N records by splicing from the beginning of the array.

The code correctly passes the maxMessages value as the maxRecords parameter, which maintains separate per-agent message histories capped at 50 messages each. The input and output are properly pushed with await, and the thread state is saved after each execution.

Comment on lines +165 to 177
// Clear state associated with this specific agent:
// 1. messages_${agentId} - workbench message history
// 2. Any keys starting with ${agentId}_ - agent-specific state
const allKeys = await ctx.var.thread.state.keys();
const agentPrefix = `${agentId}_`;
const messagesKey = `messages_${agentId}`;

// Remove the messages for this agent
ctx.var.thread.state.delete(agentMessagesKey);
for (const key of allKeys) {
if (key === messagesKey || key.startsWith(agentPrefix)) {
await ctx.var.thread.state.delete(key);
}
}

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Find and read the workbench.ts file to see the actual code and context
find . -name "workbench.ts" -path "*/packages/runtime/src/*"

Repository: agentuity/sdk

Length of output: 92


🏁 Script executed:

#!/bin/bash
# Check file size first to understand scope
wc -l packages/runtime/src/workbench.ts

Repository: agentuity/sdk

Length of output: 94


🏁 Script executed:

#!/bin/bash
# Read the specific section and surrounding context
sed -n '155,185p' packages/runtime/src/workbench.ts

Repository: agentuity/sdk

Length of output: 1018


🏁 Script executed:

#!/bin/bash
# Search for the keys() method definition in the codebase to understand its return type
rg -n "keys\s*\(\s*\)" -A 3 --type=ts | head -50

Repository: agentuity/sdk

Length of output: 3396


🏁 Script executed:

#!/bin/bash
# Look for thread state interface/class definition
rg -n "thread\.state" -B 2 -A 2 --type=ts | head -50

Repository: agentuity/sdk

Length of output: 3940


🏁 Script executed:

#!/bin/bash
# Check if StructuredError is imported in workbench.ts
head -30 packages/runtime/src/workbench.ts | grep -i "structured\|import"

Repository: agentuity/sdk

Length of output: 558


🏁 Script executed:

#!/bin/bash
# Search for StructuredError usage patterns in runtime files
rg "StructuredError" packages/runtime/src/ -A 2 | head -40

Repository: agentuity/sdk

Length of output: 2141


🏁 Script executed:

#!/bin/bash
# Check error handling patterns in other route handlers
rg "catch\s*\(" packages/runtime/src/workbench.ts -B 2 -A 3

Repository: agentuity/sdk

Length of output: 1029


🏁 Script executed:

#!/bin/bash
# Check logging patterns in this file
rg "ctx\.logger|console\." packages/runtime/src/workbench.ts

Repository: agentuity/sdk

Length of output: 39


Error handling must use StructuredError per coding guidelines.

The ctx.var.thread.state.keys() returns a snapshot array rather than a live iterator, making the dynamic enumeration and deletion approach safe and correct.

However, the catch block (line 179) violates the coding guideline requiring StructuredError from @agentuity/core for all error handling. Replace the plain error object with a properly typed StructuredError to match patterns used elsewhere in the codebase (see agent.ts, _context.ts).

🤖 Prompt for AI Agents
In packages/runtime/src/workbench.ts around lines 165 to 177 (catch at ~179),
replace the plain error handling in the catch block with a StructuredError from
@agentuity/core: import StructuredError, then construct and throw (or log) a
StructuredError that includes a clear message (e.g., "Failed clearing agent
state"), a machine-readable code (e.g., "WORKBENCH_CLEAR_STATE_FAILED"), the
original error as the cause, and contextual metadata such as agentId and the
keys attempted to delete; ensure the original error is preserved via the cause
field and match the project pattern used in agent.ts and _context.ts.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants