@miguelg719 miguelg719 commented Jan 25, 2026

Add network capture commands that write HTTP requests/responses to the filesystem, enabling agents to use standard file tools for inspection.

Features:

  • network on/off - Enable/disable capture to temp directory
  • network path - Get capture directory path
  • network clear - Clear captured requests

Captured requests are saved as directories with separate request.json and response.json files, making them easily inspectable with grep, jq, cat, and other standard tools.
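As a rough illustration of what "inspectable with standard tools" can look like from a script, the sketch below walks a capture directory and summarizes each `response.json`. It assumes the layout described above (one directory per request) and the `status` and `duration` fields quoted in the review snippets further down this thread; `summarizeCaptures` and the interface names are hypothetical and not part of the PR.

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

// Assumed subset of response.json, based on the fields quoted in this thread.
interface CapturedResponse {
  id: string;
  status: number;
  mimeType: string;
  duration: number;
}

interface CaptureSummary {
  dir: string;
  status: number;
  duration: number;
}

// Walk the capture root and return one summary per request directory that
// already has a response.json. Directories without a response are skipped.
function summarizeCaptures(captureRoot: string): CaptureSummary[] {
  const summaries: CaptureSummary[] = [];
  for (const entry of fs.readdirSync(captureRoot, { withFileTypes: true })) {
    if (!entry.isDirectory()) continue;
    const responsePath = path.join(captureRoot, entry.name, "response.json");
    if (!fs.existsSync(responsePath)) continue;
    const response = JSON.parse(fs.readFileSync(responsePath, "utf8")) as CapturedResponse;
    summaries.push({ dir: entry.name, status: response.status, duration: response.duration });
  }
  // Directory names start with a zero-padded counter, so a plain string
  // comparison keeps the summaries in capture order.
  return summaries.sort((a, b) => (a.dir < b.dir ? -1 : a.dir > b.dir ? 1 : 0));
}
```

An agent could call `summarizeCaptures` with the path printed by `network path` and then filter for `status >= 400` to find failing requests.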


Summary by cubic

Adds network capture to the CLI, writing HTTP requests and responses to a temp directory so agents can inspect traffic with standard file tools.

  • New Features
    • stagehand network on/off/path/clear
    • Saves each request in its own directory with request.json and response.json
    • Records method, URL, headers, body, status, mime type, duration, and errors
    • Directory naming: 001-<METHOD>-<domain>-<path>
    • Capture root: /tmp/stagehand-<session>-network; path returned by network on/path

Written for commit 5803ba7. Summary will update on new commits.

Add network capture commands that write HTTP requests/responses to the
filesystem, enabling agents to use standard file tools for inspection.

Features:
- network on/off - Enable/disable capture to temp directory
- network path - Get capture directory path
- network clear - Clear captured requests

Captured requests are saved as directories with separate request.json
and response.json files, making them easily inspectable with grep,
jq, cat, and other standard tools.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
changeset-bot bot commented Jan 25, 2026

⚠️ No Changeset found

Latest commit: 5803ba7

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types


@miguelg719 miguelg719 marked this pull request as ready for review January 25, 2026 02:43
@miguelg719 miguelg719 merged commit 512739e into cli Jan 25, 2026
5 of 6 checks passed
greptile-apps bot commented Jan 25, 2026

Greptile Overview

Greptile Summary

Adds network traffic capture functionality to the CLI, writing HTTP request/response data to filesystem directories for agent inspection using standard tools.

Key Changes:

  • Implemented network on/off/path/clear commands that enable CDP Network domain and capture traffic to /tmp/stagehand-<session>-network/
  • Each request saved as a directory (001-GET-domain-path/) with separate request.json and response.json files
  • Added event listeners for Network.requestWillBeSent, Network.responseReceived, and Network.loadingFinished CDP events
  • Documentation and TODO updates reflect the new feature
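To make the `001-GET-domain-path/` naming concrete, here is a minimal sketch of how such a directory name could be derived. The actual `getRequestDirName` in `packages/cli/src/index.ts` is not shown in this thread, so the function name `sketchRequestDirName`, the sanitization rules, and the 40-character path limit are all assumptions.

```typescript
// Hypothetical reimplementation of the naming scheme; the real helper may
// sanitize or truncate differently.
function sketchRequestDirName(counter: number, method: string, url: string): string {
  const index = String(counter).padStart(3, "0");
  let domain = "unknown";
  let pathPart = "";
  try {
    const parsed = new URL(url);
    domain = parsed.hostname || "unknown";
    pathPart = parsed.pathname;
  } catch {
    // Unparseable URLs keep the "unknown" placeholder and an empty path.
  }
  // Collapse characters that are unsafe in a directory name into "-",
  // trim leading/trailing dashes, and cap the length (assumed limit).
  const safePath = pathPart
    .replace(/[^a-zA-Z0-9._-]+/g, "-")
    .replace(/^-+|-+$/g, "")
    .slice(0, 40);
  const parts = [index, method.toUpperCase(), domain];
  if (safePath) parts.push(safePath);
  return parts.join("-");
}
```

Zero-padding the counter keeps the directories in capture order under a plain `ls`, which is presumably why the names lead with it.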

Critical Issues Found:

  • Response data loss bug: The Network.loadingFinished handler creates response objects with default values (status: 0, empty headers) instead of using the actual response data cached from Network.responseReceived, causing all saved response files to be missing critical HTTP response information
  • Missing cleanup: Network capture directories are not removed during daemon shutdown, leading to temp directory accumulation over time

Confidence Score: 2/5

  • This PR has critical bugs that will break core functionality
  • The response data loss bug means the feature won't work as intended - saved response files will be missing HTTP status codes, headers, and MIME types. The cleanup issue causes resource leaks. Both issues need to be fixed before merging.
  • Pay close attention to packages/cli/src/index.ts, specifically the Network.loadingFinished handler (lines 244-283) and the cleanupStaleFiles function (lines 66-71)

Important Files Changed

| Filename | Overview |
| --- | --- |
| packages/cli/README.md | Documentation added for network capture feature with clear examples and agent workflow |
| packages/cli/TODO.md | Marked network capture tasks as completed |
| packages/cli/src/index.ts | Network capture implementation with response data loss bug in loadingFinished handler and missing cleanup |

Sequence Diagram

sequenceDiagram
    participant User
    participant CLI
    participant Daemon
    participant CDP
    participant Browser
    participant FS as Filesystem

    User->>CLI: network on
    CLI->>Daemon: network_enable command
    Daemon->>FS: mkdir network capture dir
    Daemon->>CDP: Network.enable
    Daemon->>Daemon: Setup event listeners
    Daemon-->>CLI: {enabled: true, path: "/tmp/..."}
    CLI-->>User: Network capture enabled

    User->>CLI: open https://example.com
    CLI->>Daemon: open command
    Daemon->>Browser: navigate to URL
    
    Browser->>CDP: Network.requestWillBeSent event
    CDP->>Daemon: Request details
    Daemon->>Daemon: Store in pendingRequests
    Daemon->>FS: Write request.json

    Browser->>CDP: Network.responseReceived event
    CDP->>Daemon: Response headers/status
    Daemon->>Daemon: Store response info (unused)

    Browser->>CDP: Network.loadingFinished event
    CDP->>Daemon: Request completed
    Daemon->>CDP: Network.getResponseBody
    CDP-->>Daemon: Response body
    Daemon->>FS: Write response.json
    Daemon->>Daemon: Cleanup maps

    User->>CLI: network path
    CLI->>Daemon: network_path command
    Daemon-->>CLI: {path: "/tmp/...", enabled: true}
    CLI-->>User: Capture directory path

    User->>FS: Read captured request/response files
    FS-->>User: JSON data for inspection

    User->>CLI: network clear
    CLI->>Daemon: network_clear command
    Daemon->>FS: Remove all request directories
    Daemon->>Daemon: Reset counter & pending requests
    Daemon-->>CLI: {cleared: true}
    CLI-->>User: Requests cleared

    User->>CLI: network off
    CLI->>Daemon: network_disable command
    Daemon->>CDP: Network.disable
    Daemon-->>CLI: {enabled: false}
    CLI-->>User: Network capture disabled

@greptile-apps greptile-apps bot left a comment

3 files reviewed, 3 comments

Comment on lines +267 to +275
const responseData = {
  id: params.requestId,
  status: 0, // Will be filled from cached data if available
  statusText: "",
  headers: {} as Record<string, string>,
  mimeType: "",
  body,
  duration,
};
Response status and headers from Network.responseReceived are being lost. The responseData object is initialized with default values (status: 0, empty headers) but never populated with the actual response info that was stored in the Network.responseReceived handler.

The responseReceived handler stores response data in params._responseInfo (line 240), but this loadingFinished handler doesn't retrieve it. This means all saved response.json files will have status: 0 and empty headers/mimeType instead of the actual HTTP response data.

Suggested change
- const responseData = {
-   id: params.requestId,
-   status: 0, // Will be filled from cached data if available
-   statusText: "",
-   headers: {} as Record<string, string>,
-   mimeType: "",
-   body,
-   duration,
- };
+ const startTime = requestStartTimes.get(params.requestId) || Date.now();
+ const duration = Date.now() - startTime;
+ // Try to get response body
+ let responseData = {
+   id: params.requestId,
+   status: 0,
+   statusText: "",
+   headers: {} as Record<string, string>,
+   mimeType: "",
+   body: null as string | null,
+   duration,
+ };

async function writeRequestToFs(request: PendingRequest): Promise<string | null> {
  if (!networkDir) return null;

  const dirName = getRequestDirName(networkCounter++, request.method, request.url);
networkCounter is incremented on every request, which means directory names will continue incrementing even after network clear is called. While this doesn't break functionality, it may be confusing - consider whether counter should persist or reset on clear.
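If the team decides the counter should restart after `network clear`, one option is to keep the counter and the pending-request map in a single object so that both are reset together. The sketch below is only an illustration of that idea; `CaptureState` and its members are hypothetical names, and whether the real counter is 0-based or 1-based is not visible in this thread.

```typescript
// Hypothetical container for capture bookkeeping; not taken from the PR.
class CaptureState {
  private counter = 0;
  private pending = new Map<string, { method: string; url: string }>();

  // Returns the next 1-based index to use in a directory name.
  nextIndex(): number {
    this.counter += 1;
    return this.counter;
  }

  track(requestId: string, request: { method: string; url: string }): void {
    this.pending.set(requestId, request);
  }

  get pendingCount(): number {
    return this.pending.size;
  }

  // Called by the clear command: forget in-flight requests and restart
  // numbering so the next capture is 001 again.
  clear(): void {
    this.counter = 0;
    this.pending.clear();
  }
}
```

The trade-off is that a reset counter reuses names such as `001-...` across clears, which is harmless once the old directories are gone but makes names non-unique over the lifetime of a session.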

Path: packages/cli/src/index.ts
Line: 448:448


greptile-apps bot commented Jan 25, 2026

Additional Comments (1)

packages/cli/src/index.ts
Network capture directory is not cleaned up on daemon shutdown. The cleanupStaleFiles function removes socket, PID, and WebSocket files but doesn't remove the network capture directory created by getNetworkDir(). This could accumulate temp directories over time.

async function cleanupStaleFiles(session: string): Promise<void> {
  try { await fs.unlink(getSocketPath(session)); } catch {}
  try { await fs.unlink(getPidPath(session)); } catch {}
  try { await fs.unlink(getWsPath(session)); } catch {}
  try { await fs.unlink(getChromePidPath(session)); } catch {}
  try { await fs.rm(getNetworkDir(session), { recursive: true }); } catch {}
}

@cubic-dev-ai cubic-dev-ai bot left a comment

1 issue found across 3 files

Confidence score: 3/5

  • Potential user-impacting gap: packages/cli/src/index.ts captures response metadata in responseReceived but never uses it when writing response.json, which could omit expected status/headers information.
  • Severity is moderate (7/10) with high confidence, so there is some regression risk if consumers rely on complete response metadata.
  • Pay close attention to packages/cli/src/index.ts - response metadata captured but not written into response output.
Architecture diagram
sequenceDiagram
    participant CLI as Agent/CLI
    participant Daemon as Stagehand Daemon
    participant CDP as Chrome (CDP)
    participant FS as FileSystem

    Note over CLI,FS: NEW: Network Capture Workflow

    %% 1. Enable Phase
    CLI->>Daemon: execute("network_enable")
    Daemon->>FS: NEW: mkdir /tmp/stagehand-{session}-network
    Daemon->>CDP: Network.enable
    Note right of Daemon: Registers listeners:<br/>requestWillBeSent<br/>responseReceived<br/>loadingFinished
    Daemon-->>CLI: { enabled: true, path: "..." }

    %% 2. Capture Phase
    par Traffic Capture Loop
        %% Request Start
        CDP->>Daemon: Network.requestWillBeSent
        Daemon->>Daemon: Generate Dir: 001-{METHOD}-{domain}-{path}
        Daemon->>Daemon: Cache request start time & ID
        Daemon->>FS: NEW: write request.json
        
        %% Response Headers
        CDP->>Daemon: Network.responseReceived
        Daemon->>Daemon: Cache response meta (status, headers, mime)
        
        %% Completion
        alt Request Finished Successfully
            CDP->>Daemon: Network.loadingFinished
            Daemon->>CDP: Network.getResponseBody
            CDP-->>Daemon: { body, base64Encoded }
            Daemon->>Daemon: Combine meta + body + duration
            Daemon->>FS: NEW: write response.json
        else Request Failed
            CDP->>Daemon: Network.loadingFailed
            Daemon->>FS: NEW: write response.json (with error)
        end
    end

    %% 3. Retrieval (Implicit)
    Note over CLI,FS: Agent reads files directly from FS path
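
The diagram shows `Network.getResponseBody` returning `{ body, base64Encoded }`. How the daemon handles the `base64Encoded` flag is not shown in this thread; the sketch below is one possible approach, assuming text-like payloads should be stored decoded and binary payloads left as base64. The function name `decodeResponseBody` and the MIME-type heuristic are assumptions.

```typescript
// Shape of the CDP Network.getResponseBody result.
interface ResponseBodyResult {
  body: string;
  base64Encoded: boolean;
}

// Returns a string suitable for writing into response.json.
// Text-like bodies are decoded to UTF-8; anything else stays base64.
function decodeResponseBody(result: ResponseBodyResult, mimeType: string): string {
  if (!result.base64Encoded) return result.body;
  // Assumed heuristic for "text-like" content types.
  const isText = /^text\/|json|xml|javascript/.test(mimeType);
  if (!isText) return result.body;
  // atob yields one character per byte; map back to bytes, then decode as UTF-8.
  const bytes = Uint8Array.from(atob(result.body), (ch) => ch.charCodeAt(0));
  return new TextDecoder("utf-8").decode(bytes);
}
```

Storing decoded text keeps `grep` and `jq` useful on the captured files, while leaving binary bodies encoded avoids corrupting them inside JSON.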



const responseData = {
id: params.requestId,
status: 0, // Will be filled from cached data if available
@cubic-dev-ai cubic-dev-ai bot Jan 25, 2026

P1: Response metadata (status, statusText, headers, mimeType) is captured in responseReceived but never used when writing response.json in loadingFinished. The data stored on (params as any)._responseInfo is inaccessible because each handler receives a different params object. Store response info in a Map keyed by requestId (similar to requestStartTimes) to preserve it across handlers.
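
A minimal sketch of the Map-based approach described in this comment, under the assumption that the handlers only need `status`, `statusText`, `headers`, and `mimeType`. The names `responseInfoById`, `onResponseReceived`, and `buildResponseData` are hypothetical; only the CDP field names come from the protocol.

```typescript
// Subset of the CDP event payloads that this sketch relies on.
interface ResponseReceivedParams {
  requestId: string;
  response: {
    status: number;
    statusText: string;
    headers: Record<string, string>;
    mimeType: string;
  };
}

interface LoadingFinishedParams {
  requestId: string;
}

interface ResponseInfo {
  status: number;
  statusText: string;
  headers: Record<string, string>;
  mimeType: string;
}

// Hypothetical cache, keyed by requestId so it survives across handlers
// (each CDP event delivers a fresh params object).
const responseInfoById = new Map<string, ResponseInfo>();

function onResponseReceived(params: ResponseReceivedParams): void {
  const { status, statusText, headers, mimeType } = params.response;
  responseInfoById.set(params.requestId, { status, statusText, headers, mimeType });
}

// Called from the loadingFinished handler once body and duration are known.
function buildResponseData(params: LoadingFinishedParams, body: string | null, duration: number) {
  const info = responseInfoById.get(params.requestId);
  responseInfoById.delete(params.requestId); // avoid unbounded growth
  return {
    id: params.requestId,
    status: info?.status ?? 0,
    statusText: info?.statusText ?? "",
    headers: info?.headers ?? {},
    mimeType: info?.mimeType ?? "",
    body,
    duration,
  };
}
```

Falling back to the zero values when no cached entry exists keeps the output shape stable for requests whose `responseReceived` event was missed, for example when capture is enabled mid-flight.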

