-
Notifications
You must be signed in to change notification settings - Fork 115
Fix MCP HTTP session persistence across tool calls #1740
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
This PR fixes the issue where MCP HTTP connections create new sessions per tool call, breaking session-based authentication. ## Changes ### Core Fix: Persistent MCP Connections - Modified MCPToolExecutor to maintain persistent connections instead of opening/closing for each tool call - Added connection reuse with reference counting in MCPClient - Added session ID tracking for potential session resumption ### Session Management Infrastructure - Added mcp_sessions field to ConversationState for session persistence - Created MCPSessionManager for centralized session tracking - Added cleanup logic in LocalConversation.close() to release MCP sessions ### Files Changed - openhands-sdk/openhands/sdk/mcp/client.py - Added session tracking, reentrant context manager - openhands-sdk/openhands/sdk/mcp/tool.py - Persistent connection in MCPToolExecutor - openhands-sdk/openhands/sdk/mcp/session_manager.py - New session manager - openhands-sdk/openhands/sdk/conversation/state.py - Added mcp_sessions field - openhands-sdk/openhands/sdk/conversation/impl/local_conversation.py - MCP cleanup ### Tests - Added comprehensive tests using live MCP servers (no mocks) - Tests verify connection reuse, session stability, and proper cleanup Fixes #1739 Co-authored-by: openhands <openhands@all-hands.dev>
- Use getattr for dynamic attribute access in MCPClient._capture_session_id - Add proper type narrowing with isinstance assertions in tests - Add Literal import for transport type annotation - Remove unused variable in test Co-authored-by: openhands <openhands@all-hands.dev>
Keep MCP connections open after listing tools instead of reconnecting for each tool call. This is the simplest fix for the session-based authentication issue. Changes: - utils.py: Keep connection open after _list_tools_and_keep_connected() - tool.py: Remove async with from executor since client is already connected - client.py: Revert to original simple implementation Key insight: The cleanest fix is proper object lifecycle - create ONE client per server, keep it open, close it when done. No reference counting, no session managers, no global state. Related issue: #1739 Co-authored-by: openhands <openhands@all-hands.dev>
Multiple tools share the same client, so close() may be called multiple times during LocalConversation.close() cleanup. Added _closed flag to ensure safe multiple calls. Co-authored-by: openhands <openhands@all-hands.dev>
This test uses a server that stores state keyed by MCP session ID. - Without fix: Each tool call creates new session, state is lost - With fix: Same session is reused, state is preserved Test confirms the exact issue reported by user (session-based auth breaking). Co-authored-by: openhands <openhands@all-hands.dev>
|
@OpenHands fix pre-commit |
|
I'm on it! neubig can track my progress at all-hands.dev |
- Fix line length issues (E501) by breaking long strings - Add assertions for executor not None to fix pyright type errors - Remove unused import MCPToolAction
|
I've fixed the pre-commit issues in PR #1740. The changes have been pushed to the Summary of ChangesFixed pre-commit failures in
All pre-commit checks now pass:
|
all-hands-bot
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The session persistence fix works (tests pass), but there are concerns about the approach that should be addressed. Key issues: context manager protocol violation and relying on del for cleanup. Details in inline comments.
|
@OpenHands try to reflect the comments above without making the code overly complex. |
|
I'm on it! neubig can track my progress at all-hands.dev |
…xt manager support - Add explicit connect()/disconnect() methods to MCPClient as alternative to calling __aenter__/__aexit__ directly (addresses protocol violation concern) - Add context manager support (__enter__/__exit__) to MCPToolExecutor for deterministic cleanup - Improve exception handling in utils.py to not mask original exceptions when cleanup fails - Improve error message for disconnected client with more context Co-authored-by: openhands <openhands@all-hands.dev>
|
I've addressed the review comments on PR #1740 without making the code overly complex. The changes have been pushed to the Summary of ChangesAddressed all 4 review comments from
Files Changed
All 61 MCP tests pass and all pre-commit checks pass. |
|
@OpenHands double-check the comment by jpshackleford and fix it |
|
I'm on it! neubig can track my progress at all-hands.dev |
Address jpshackelford's review comment: move self._closed = True to after the cleanup operations so that if cleanup fails, future calls to sync_close() can retry. This is safe because both underlying close methods are idempotent. Co-authored-by: openhands <openhands@all-hands.dev>
|
I've addressed jpshackelford's review comment on PR #1740. The change has been pushed to the Summary of ChangesFixed the
Verification
The fix is minimal (4 lines added, 1 line removed) and directly addresses the reviewer's concern about cleanup reliability. |
|
@OpenHands please clean up the remaining unresolved comments and resolve them |
|
I'm on it! neubig can track my progress at all-hands.dev |
- utils.py: Log warning when MCP client cleanup fails during error handling - test_stateful_mcp.py: Log exception when server thread encounters an error This addresses the CodeQL 'empty except' warnings while preserving the original exception handling behavior. Co-authored-by: openhands <openhands@all-hands.dev>
|
I've addressed the remaining unresolved comments on PR #1740. The changes have been pushed to the Summary of ChangesFixed the two "empty except" CodeQL warnings from
Verification
The fix is minimal (6 lines added, 3 lines removed) and directly addresses the CodeQL warnings while preserving the original exception handling behavior. |
|
@OpenHands Do a /codereview-roasted on this PR. |
|
I'm on it! enyst can track my progress at all-hands.dev |
|
You asked for a /codereview-roasted of PR #1740 (“Fix MCP HTTP session persistence across tool calls”). I did a read-through of the diff against
Checklist against your request:
No further action taken since this was a review request only. |
|
(OpenHands-GPT-5.2 speaking) 🔴 Taste rating: Needs improvement You fixed the symptom (session persistence) by… changing who’s responsible for not leaking connections. Congrats, now every caller gets to play “guess when to close the socket”. This PR is close to good taste (“one client, keep it open”), but the lifecycle/ownership story is still sloppy and the tests are doing a lot of cargo-cult threading to prove something that should be structurally obvious. [CRITICAL ISSUES] (must fix)Ownership is unclear → leaks are inevitableFile: You create one That’s not an API, that’s a dare. Why it’s bad
Concrete improvement
Executor exposes
|
|
|
@OpenHands consider this criticism, if the alternative seems reasonable, implement it and make sure that tests still pass and that you can still demonstrate that the playwright MCP based live testing works. |
|
I'm on it! neubig can track my progress at all-hands.dev |
Address review criticism about unclear ownership and foot-gun API: 1. Create MCPToolset class that owns the MCP client connection - Provides .tools property to access the list of tools - Provides .client property for advanced use cases - Provides .close() method for explicit cleanup - Supports context manager protocol for automatic cleanup 2. Update create_mcp_tools() to return MCPToolset instead of list - Clear ownership: the toolset owns the connection - Recommended usage: with create_mcp_tools(config) as toolset: - Manual usage: toolset = create_mcp_tools(config); try/finally 3. Remove close() from MCPToolExecutor - Eliminates the foot-gun that could close shared client - Cleanup is now only through MCPToolset.close() 4. Update all tests to use new MCPToolset API - Use context manager for automatic cleanup - All 61 MCP tests pass This addresses the key criticism: 'You solved persistence by keeping the connection open, but you didn't solve lifecycle; you just pushed the burden onto the user via "close any executor when done".' Now the API is clear: MCPToolset owns the connection, use it as a context manager or call close() when done. Co-authored-by: openhands <openhands@all-hands.dev>
SummaryI've implemented the review criticism by introducing an Changes Made1. Created
2. Updated
3. Removed
4. Updated all tests to use the new API with context managers Verification
Key Insight AddressedThe review criticism was: "You solved persistence by keeping the connection open, but you didn't solve lifecycle; you just pushed the burden onto the user via 'close any executor when done'." Now the API is clear:
The PR is updated at: #1740 |
Remove MCPToolset class and consolidate to one connection management approach: 1. create_mcp_tools() returns list[MCPToolDefinition] 2. Tools share a persistent MCP client connection for session state 3. Connection is cleaned up via executor.close() which is: - Called by Conversation.close() on all tools automatically - Idempotent (safe to call multiple times from shared tools) - Also cleaned up via __del__ as fallback Internal helper _create_stateful_toolset() establishes the persistent connection and creates tools that share it. All 61 MCP tests pass. Playwright MCP live testing verified session persistence. Co-authored-by: openhands <openhands@all-hands.dev>
5101547 to
5e50b4b
Compare
- Add MCPToolset class that owns the MCP client connection
- Support context manager for automatic cleanup
- Make MCPToolset iterable (acts like a list)
- Remove close() from MCPToolExecutor (MCPToolset handles cleanup)
- Update all tests to use context manager pattern
Usage:
with create_mcp_tools(config) as toolset:
for tool in toolset:
# use tool
# Connection automatically closed
Co-authored-by: openhands <openhands@all-hands.dev>
- Remove unused disconnect() method from MCPClient - Fix broad exception handling: use BaseException instead of Exception - Remove redundant is_connected() check after connect() - Add __del__ to MCPToolset for GC safety - Restore explanatory comment about MCPToolAction - Extract MockMCPClient to shared conftest.py - Consolidate duplicate test servers into reusable MCPTestServer Reduces code duplication across test files (-182 lines). Co-authored-by: openhands <openhands@all-hands.dev>
Instead of a separate MCPToolset wrapper class, the MCPClient now:
- Has a 'tools' property containing the list of tools
- Is a sync context manager (with client:)
- Is iterable/indexable like a list
Usage:
with create_mcp_tools(config) as client:
for tool in client: # or client.tools
...
# Connection automatically closed
This eliminates the MCPToolset class entirely (-60 lines).
Co-authored-by: openhands <openhands@all-hands.dev>
The core fix: keep the MCP connection open across tool calls instead of
reconnecting each time (which lost session state like auth tokens).
Changes to client.py:
- Add _tools list and tools property
- Add connect() method
- Add _closed flag for idempotent sync_close()
- Add sync context manager (__enter__/__exit__)
- Add iteration support (__iter__/__len__/__getitem__)
Changes to tool.py:
- Remove 'async with self.client:' - use already-connected client
- Check is_connected() and raise RuntimeError if not
Changes to utils.py:
- New _connect_and_list_tools() that keeps connection open
- create_mcp_tools() returns MCPClient (iterable, so backward compatible)
- Add cleanup on error (sync_close in except blocks)
Usage:
with create_mcp_tools(config) as client:
for tool in client: # client is iterable
...
# Connection closed automatically
Co-authored-by: openhands <openhands@all-hands.dev>
|
Looks like there are a few issues preventing this PR from being merged!
If you'd like me to help, just leave a comment, like Feel free to include any additional details that might help me get this PR into a better state. You can manage your notification settings |
Co-authored-by: openhands <openhands@all-hands.dev>
Co-authored-by: openhands <openhands@all-hands.dev>
Consistent with other MCP tests that define their server fixtures inline. Co-authored-by: openhands <openhands@all-hands.dev>
all-hands-bot
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR successfully implements MCP session persistence, but there are several issues that need attention, particularly around resource management and API design. See inline comments for details.
- Update MCPClient docstring to clarify lifecycle management - Return copy of tools list from tools property to prevent modification - Add logging when sync_close fails during error cleanup Co-authored-by: openhands <openhands@all-hands.dev>
Summary
Simplified fix for MCP session persistence. Keeps MCP connections open after listing tools instead of reconnecting for each tool call.
Changes
utils.py: Keep connection open after_list_tools_and_keep_connected()tool.py: Removeasync withfrom executor since client is already connectedclient.py: Reverted to original simple implementationKey Insight
The cleanest fix is proper object lifecycle:
No reference counting, no session managers, no global state. Just ~100 lines changed instead of 900+.
Test Results
All 58 MCP tests pass.
Live Testing with Playwright MCP
Verified session persistence with the Playwright MCP server (a stateful browser automation server):
https://www.openhands.dev/blogusingbrowser_navigatebrowser_snapshotto verify it was still on the same pageResult: ✅ SUCCESS - The browser state was maintained across multiple tool calls. The agent confirmed it was still on
https://www.openhands.dev/blogwith the correct page content.This proves the MCP session persistence fix works correctly - without the fix, the browser context would have been lost when the MCP connection was closed and reopened between tool calls.
Fixes: #1739
Agent Server images for this PR
• GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server
Variants & Base Images
eclipse-temurin:17-jdknikolaik/python-nodejs:python3.12-nodejs22golang:1.21-bookwormPull (multi-arch manifest)
# Each variant is a multi-arch manifest supporting both amd64 and arm64 docker pull ghcr.io/openhands/agent-server:f762432-pythonRun
All tags pushed for this build
About Multi-Architecture Support
f762432-python) is a multi-arch manifest supporting both amd64 and arm64f762432-python-amd64) are also available if needed