Feat: session recording agent's browser sessions #1731

malhotra5 · 2026-01-14T22:22:36Z

Summary

Adds support for injecting custom JavaScript into browser sessions via CDP's Page.addScriptToEvaluateOnNewDocument. This enables session recording tools like rrweb to capture agent browser interactions.

Changes

Added inject_scripts parameter to BrowserToolExecutor constructor
Added set_inject_scripts() and _inject_scripts_to_session() methods to CustomBrowserUseServer
Scripts are injected after browser session initialization and run before page scripts on every new document
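
For readers unfamiliar with the CDP call named above, here is a minimal sketch of the command messages involved. It only builds the JSON payloads; `build_inject_commands` is a hypothetical helper for illustration and is not the code added in this PR, which goes through the browser session's own CDP client.

```python
import json
from itertools import count


def build_inject_commands(scripts: list[str], start_id: int = 1) -> list[str]:
    """Build one CDP command per script (hypothetical helper, not from the PR).

    Page.addScriptToEvaluateOnNewDocument takes the script text in the
    `source` parameter and runs it in every new document before page scripts.
    """
    ids = count(start_id)
    return [
        json.dumps(
            {
                "id": next(ids),  # CDP commands carry a caller-chosen id
                "method": "Page.addScriptToEvaluateOnNewDocument",
                "params": {"source": script},
            }
        )
        for script in scripts
    ]


if __name__ == "__main__":
    for message in build_inject_commands(["console.log('injected');"]):
        print(message)
```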

Usage

from openhands.tools.browser_use import BrowserToolExecutor

RRWEB_SCRIPT = """
(function() {
    var s = document.createElement('script');
    s.src = 'https://cdn.jsdelivr.net/npm/@rrweb/record@latest/dist/record.umd.min.cjs';
    document.head.appendChild(s);
})();
"""

executor = BrowserToolExecutor(
    inject_scripts=[RRWEB_SCRIPT]
)

Closes #1724


Agent Server images for this PR

• GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server

Variants & Base Images

Variant	Architectures	Base Image	Docs / Tags
java	amd64, arm64	`eclipse-temurin:17-jdk`	Link
python	amd64, arm64	`nikolaik/python-nodejs:python3.12-nodejs22`	Link
golang	amd64, arm64	`golang:1.21-bookworm`	Link

Pull (multi-arch manifest)

# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:4ae620e-python

Run

docker run -it --rm \
  -p 8000:8000 \
  --name agent-server-4ae620e-python \
  ghcr.io/openhands/agent-server:4ae620e-python

All tags pushed for this build

ghcr.io/openhands/agent-server:4ae620e-golang-amd64
ghcr.io/openhands/agent-server:4ae620e-golang_tag_1.21-bookworm-amd64
ghcr.io/openhands/agent-server:4ae620e-golang-arm64
ghcr.io/openhands/agent-server:4ae620e-golang_tag_1.21-bookworm-arm64
ghcr.io/openhands/agent-server:4ae620e-java-amd64
ghcr.io/openhands/agent-server:4ae620e-eclipse-temurin_tag_17-jdk-amd64
ghcr.io/openhands/agent-server:4ae620e-java-arm64
ghcr.io/openhands/agent-server:4ae620e-eclipse-temurin_tag_17-jdk-arm64
ghcr.io/openhands/agent-server:4ae620e-python-amd64
ghcr.io/openhands/agent-server:4ae620e-nikolaik_s_python-nodejs_tag_python3.12-nodejs22-amd64
ghcr.io/openhands/agent-server:4ae620e-python-arm64
ghcr.io/openhands/agent-server:4ae620e-nikolaik_s_python-nodejs_tag_python3.12-nodejs22-arm64
ghcr.io/openhands/agent-server:4ae620e-golang
ghcr.io/openhands/agent-server:4ae620e-java
ghcr.io/openhands/agent-server:4ae620e-python

About Multi-Architecture Support

Each variant tag (e.g., 4ae620e-python) is a multi-arch manifest supporting both amd64 and arm64
Docker automatically pulls the correct architecture for your platform
Individual architecture tags (e.g., 4ae620e-python-amd64) are also available if needed

Add inject_scripts parameter to BrowserToolExecutor to allow injecting custom JavaScript into every new document via CDP's Page.addScriptToEvaluateOnNewDocument. This enables session recording tools like rrweb to be injected into browser sessions for recording agent interactions.

Co-authored-by: openhands <openhands@all-hands.dev>

- Always inject rrweb loader script on browser session init
- Add start_recording() method that calls rrweb.record()
- Add stop_recording() method that stops recording and returns events as JSON
- Add BrowserStartRecordingAction/Tool and BrowserStopRecordingAction/Tool
- Recording uses CDP Runtime.evaluate to execute JS in page context

Co-authored-by: openhands <openhands@all-hands.dev>

- Add unit tests for start_recording and stop_recording action routing
- Add E2E tests for recording functionality:
  - test_start_recording: verify recording can be started
  - test_recording_captures_events: verify events are captured
  - test_recording_save_to_file: verify recording JSON can be saved
- Update test_browser_toolset.py to expect 14 tools (including recording tools)
- Fix rrweb loader script to use correct CDN URL and add fallback stub
- Fix rrweb.record reference (UMD exports to window.rrweb not rrwebRecord)

Co-authored-by: openhands <openhands@all-hands.dev>

Add example script demonstrating how to use the browser session recording feature with rrweb:
- Shows how to start/stop recording using browser_start_recording and browser_stop_recording tools
- Demonstrates browsing multiple sites while recording
- Saves recording to JSON file for later replay
- Includes instructions for replaying with rrweb-player

Co-authored-by: openhands <openhands@all-hands.dev>

Recording improvements:
- Add automatic retry (10 attempts, 500ms delay) when rrweb isn't loaded
- Improve fallback stub to capture actual DOM content:
  - Full DOM serialization in FullSnapshot event
  - MutationObserver for incremental snapshots
  - Scroll and mouse event listeners
- Add event_types summary in stop_recording response
- Add using_stub flag to indicate if fallback was used
- Improved logging for recording start/stop

Test improvements:
- Simplified tests since retry is now built-in
- Added event_types verification in tests
- Added stub status reporting

Co-authored-by: openhands <openhands@all-hands.dev>
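
The retry behavior described in this commit (10 attempts, 500ms apart) could look roughly like the sketch below. The names `start_with_retry` and `try_start` are hypothetical and not taken from the PR; `try_start` stands in for whatever call evaluates rrweb.record() in the page and reports whether rrweb was available.

```python
import asyncio
from typing import Awaitable, Callable

MAX_ATTEMPTS = 10          # attempt count from the commit message
RETRY_DELAY_SECONDS = 0.5  # 500 ms delay from the commit message


async def start_with_retry(
    try_start: Callable[[], Awaitable[bool]],
    max_attempts: int = MAX_ATTEMPTS,
    delay: float = RETRY_DELAY_SECONDS,
) -> bool:
    """Call try_start until it returns True or the attempts run out."""
    for attempt in range(1, max_attempts + 1):
        if await try_start():
            return True
        # Do not sleep after the final failed attempt.
        if attempt < max_attempts:
            await asyncio.sleep(delay)
    return False
```

Keeping the retry in one place is what lets the tests drop their own wait loops, as the "Test improvements" notes say.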

Root cause: jsdelivr CDN returns Content-Type: application/node for .cjs files, which browsers refuse to execute as JavaScript. The .min.js alternative from jsdelivr uses ES module format which doesn't create a global window.rrweb object.

Solution: Switch to unpkg CDN which returns Content-Type: text/javascript for .cjs files, allowing browsers to execute the UMD bundle correctly.

Co-authored-by: openhands <openhands@all-hands.dev>

Recording now continues across page navigations by:
1. Flushing events from browser to Python storage before navigation
2. Automatically restarting recording on the new page after navigation
3. Combining all events when stop_recording is called

Changes:
- Add _recording_events list on Python side to store events
- Add _flush_recording_events() to save browser events before navigation
- Add _restart_recording_on_new_page() to resume recording after navigation
- Update navigate(), go_back(), click() to flush before navigation
- Update _stop_recording() to combine events from all pages
- Add pages_recorded count to stop_recording response

Co-authored-by: openhands <openhands@all-hands.dev>
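
The flush-then-combine flow in this commit can be sketched as follows. `RecordingBuffer` is a hypothetical stand-in for the Python-side state (the PR uses a `_recording_events` list on the server class); the browser calls that fetch events and restart rrweb are left out, and counting one page per flush is an assumption of this sketch.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class RecordingBuffer:
    """Hypothetical Python-side store for events flushed from each page."""

    events: list[dict[str, Any]] = field(default_factory=list)
    pages_recorded: int = 0

    def flush(self, page_events: list[dict[str, Any]]) -> None:
        """Save one page's events before the browser navigates away."""
        self.events.extend(page_events)
        self.pages_recorded += 1  # assumption: one flush per page

    def stop(self, last_page_events: list[dict[str, Any]]) -> dict[str, Any]:
        """Flush the final page and return the combined recording."""
        self.flush(last_page_events)
        return {
            "events": list(self.events),
            "count": len(self.events),
            "pages_recorded": self.pages_recorded,
        }
```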

Changes:
- _stop_recording now saves events to a timestamped JSON file instead of returning the full events array to the agent
- Recording file saved to full_output_save_dir (e.g., browser_recording_20260115_001313.json)
- Returns concise message: 'Recording stopped. Captured X events from Y page(s). Saved to: path'
- File contains both events array and metadata (count, pages, event_types, etc.)
- Fixed bug in event type counting (was using type_num instead of type_name)

Co-authored-by: openhands <openhands@all-hands.dev>
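
A rough sketch of the save step described in this commit is below. `save_recording` and `EVENT_TYPE_NAMES` are hypothetical names; the numeric-to-name mapping is assumed to follow rrweb's EventType enum, and the exact metadata keys in the PR may differ. Counting by the readable name rather than the raw number mirrors the counting fix mentioned in the last bullet.

```python
import json
from collections import Counter
from datetime import datetime
from pathlib import Path
from typing import Any, Optional

# Assumed to match rrweb's EventType enum (numeric type -> readable name).
EVENT_TYPE_NAMES = {
    0: "DomContentLoaded",
    1: "Load",
    2: "FullSnapshot",
    3: "IncrementalSnapshot",
    4: "Meta",
    5: "Custom",
    6: "Plugin",
}


def save_recording(
    events: list[dict[str, Any]],
    pages: int,
    save_dir: str,
    now: Optional[datetime] = None,
) -> tuple[Path, str]:
    """Write events plus metadata to a timestamped file; return path and message."""
    now = now or datetime.now()
    # Count by readable name, not by the raw numeric type.
    event_types = Counter(
        EVENT_TYPE_NAMES.get(e.get("type"), "Unknown") for e in events
    )
    path = Path(save_dir) / f"browser_recording_{now:%Y%m%d_%H%M%S}.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    payload = {
        "events": events,
        "metadata": {
            "count": len(events),
            "pages": pages,
            "event_types": dict(event_types),
        },
    }
    path.write_text(json.dumps(payload))
    message = (
        f"Recording stopped. Captured {len(events)} events "
        f"from {pages} page(s). Saved to: {path}"
    )
    return path, message
```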

- Set persistence_dir on Conversation so recordings are saved
- Update prompt to reflect auto-save behavior (no need to manually save)
- Add RECORDING_DIR variable to show where recordings go

Co-authored-by: openhands <openhands@all-hands.dev>

…penHands/software-agent-sdk into feat/browser-session-recording

When rrweb fails to load from CDN, instead of using a minimal fallback stub that provides degraded functionality, now we:
1. Set a __rrweb_load_failed flag when CDN load fails
2. Check this flag when starting recording
3. Return a clear error message to the agent explaining that recording could not be started due to CDN load failure

This simplifies the code and makes failures explicit rather than silently degrading functionality.

Co-authored-by: openhands <openhands@all-hands.dev>

examples/01_standalone_sdk/33_browser_session_recording.py

+    - Online viewer: https://www.rrweb.io/demo/
+"""
+
+import glob


tests/tools/browser_use/test_browser_executor_e2e.py

+                if executor:
+                    try:
+                        executor.close()
+                    except Exception:


Changes:
- Flush events every 5 seconds (RECORDING_FLUSH_INTERVAL_SECONDS)
- Also flush when events exceed 1 MB (RECORDING_FLUSH_SIZE_MB)
- Save events to numbered JSON files (1.json, 2.json, etc.) instead of appending to a single file
- Move save_dir parameter from stop_recording to start_recording
- Add background task for periodic flushing
- Track total events and file count across the recording session

This improves performance by:
1. Avoiding memory buildup during long recording sessions
2. Writing smaller, incremental files instead of one large file
3. Spreading I/O across the recording duration

Co-authored-by: openhands <openhands@all-hands.dev>
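
The two flush triggers in this commit (elapsed time and buffered size) can be illustrated with a small predicate. The constant names and values come from the commit message; `should_flush` is a hypothetical helper, and measuring size as the length of the JSON-encoded events is an assumption of this sketch, not necessarily how the PR measures it.

```python
import json
from typing import Any

RECORDING_FLUSH_INTERVAL_SECONDS = 5  # from the commit message
RECORDING_FLUSH_SIZE_MB = 1           # from the commit message


def should_flush(
    pending: list[dict[str, Any]], seconds_since_last_flush: float
) -> bool:
    """Return True when buffered events should be written to a new file."""
    if not pending:
        return False  # nothing to write
    if seconds_since_last_flush >= RECORDING_FLUSH_INTERVAL_SECONDS:
        return True
    # Assumption: size is the byte length of the JSON-encoded buffer.
    size_bytes = len(json.dumps(pending).encode("utf-8"))
    return size_bytes > RECORDING_FLUSH_SIZE_MB * 1024 * 1024
```

A background task would call a check like this on a short timer and write the pending events to the next numbered file whenever it returns True.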

Move all inline JavaScript code to named constants at the top of server.py for better readability and maintainability:
- RRWEB_LOADER_JS: Script injected into every page to load rrweb from CDN
- FLUSH_EVENTS_JS: Collects and clears events from browser
- START_RECORDING_SIMPLE_JS: Start recording (used after navigation)
- START_RECORDING_JS: Start recording with load failure check
- STOP_RECORDING_JS: Stop recording and collect remaining events

Also reorganized the file with clear section headers for:
- Configuration Constants
- Injected JavaScript Code
- CustomBrowserUseServer Class

Co-authored-by: openhands <openhands@all-hands.dev>

When saving events to numbered JSON files, check if the file already exists and increment the counter until an unused filename is found. This handles cases where files already exist from previous recordings in the same directory.

Co-authored-by: openhands <openhands@all-hands.dev>
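
The collision handling described in this commit amounts to a short loop. `next_unused_path` is a hypothetical name for illustration; the PR's actual helper and its signature may differ.

```python
from pathlib import Path


def next_unused_path(save_dir: Path, counter: int = 1) -> tuple[Path, int]:
    """Return the first '<n>.json' path in save_dir that does not exist yet.

    Starts at `counter` and increments until a free filename is found, so
    files left over from earlier recordings are never overwritten.
    """
    path = save_dir / f"{counter}.json"
    while path.exists():
        counter += 1
        path = save_dir / f"{counter}.json"
    return path, counter
```

Returning the counter alongside the path lets the caller continue numbering from that point on the next flush.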

openhands-agent and others added 17 commits January 14, 2026 22:22

Update 34_browser_session_recording.py

8aec4f9

Update 34_browser_session_recording.py

0a7d3ee

Update 34_browser_session_recording.py

87c36a0

Update 34_browser_session_recording.py

de8481a

Merge branch 'feat/browser-session-recording' of https://github.com/O…

d227c50

…penHands/software-agent-sdk into feat/browser-session-recording

fix persistence path check

49e360a

Update 33_browser_session_recording.py

326251f

github-code-quality bot found potential problems Jan 17, 2026


openhands-agent added 3 commits January 17, 2026 08:54
