Encountered a problem "Unclosed client session" #471

@wolf-yang

Question:
When I run the RAG training task, the following error appears frequently:

client_session: <aiohttp.client.ClientSession object at 0x7ff7e5760130>
ERROR:2026-01-24 05:17:38,366:Unclosed client session

This leaks resources, and later the MCP server starts timing out:
ERROR:2026-01-24 08:17:42,169:MCP server error during rollout: Timed out while waiting for response to ClientRequest. Waited 5.0 seconds.
In the end, the MCP timeouts combined with the resource leak cause many rollouts to fail, so no triplets are produced.

Solution:
I found that the unclosed aiohttp.client.ClientSession is created and cached by LiteLLM when it sends the HTTP requests issued through LitellmModel, and it is never closed.
The workaround I adopted is to switch LiteLLM from aiohttp to httpx:
export DISABLE_AIOHTTP_TRANSPORT=True
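
If the training entry point is a Python script, the same switch can probably be set in-process instead of in the shell. The sketch below is a minimal, unverified variant: the helper name disable_aiohttp_transport is hypothetical, and it assumes LiteLLM reads the variable no earlier than import time, so it sets it before litellm is imported.

```python
import os


def disable_aiohttp_transport() -> None:
    """Set the same variable as the shell export above (helper name is hypothetical)."""
    # Assumption: LiteLLM checks this variable when it builds its HTTP client,
    # so it has to be in the environment before that happens.
    os.environ["DISABLE_AIOHTTP_TRANSPORT"] = "True"


disable_aiohttp_transport()
# import litellm  # import LiteLLM only after the variable is set
```

Child processes started by the training script inherit the variable, which matters if rollouts run in separate workers.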
However, training becomes slower:

Before the change: Training Progress: 0%|          | 319/125000 [4:52:35<1796:12:54, 51.86s/it]
After the change: Training Progress: 0%|          | 319/125000 [10:27:08<2410:59:31, 69.61s/it]
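
To put a number on the slowdown, the two progress lines can be compared directly. The sketch below uses only the figures printed above. The helper hms_to_seconds is introduced here for illustration, and the interpretation assumes the bar is tqdm output, where the displayed s/it is a smoothed recent rate rather than the overall average.

```python
def hms_to_seconds(text: str) -> int:
    """Convert an H:MM:SS string from the progress bar into seconds."""
    hours, minutes, seconds = (int(part) for part in text.split(":"))
    return hours * 3600 + minutes * 60 + seconds


STEPS = 319  # both runs were sampled at the same step count

before_elapsed = hms_to_seconds("4:52:35")
after_elapsed = hms_to_seconds("10:27:08")

# Ratio of the per-iteration rates shown at the end of each bar.
rate_ratio = 69.61 / 51.86
# Ratio of the wall-clock time needed to reach step 319.
elapsed_ratio = after_elapsed / before_elapsed

print(f"reported rate:  {rate_ratio:.2f}x slower")     # 1.34x
print(f"elapsed time:   {elapsed_ratio:.2f}x slower")  # 2.14x
print(f"avg s/it before: {before_elapsed / STEPS:.2f}")
print(f"avg s/it after:  {after_elapsed / STEPS:.2f}")
```

The reported rate suggests roughly a 34% slowdown, while the elapsed time to the same step more than doubled. The gap may come from startup time or stalls earlier in the second run, so a longer comparison would be needed before attributing all of it to httpx.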

I previously tried adding the code below to the async def training_rollout_async function in agent-lightning/examples/rag/rag_agent.py, but it did not work. I will try again later if needed.

    finally:
        # Cancel the runner task if it is still running and wait briefly for it to exit.
        if runner_task and not runner_task.done():
            try:
                runner_task.cancel()
                await asyncio.wait_for(runner_task, timeout=2.0)
            except (asyncio.TimeoutError, asyncio.CancelledError, Exception):
                pass
        # Close the async HTTP clients that LiteLLM has cached.
        try:
            from litellm.llms.custom_httpx.async_client_cleanup import (
                close_litellm_async_clients,
            )
            await close_litellm_async_clients()
        except Exception:
            pass
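
The cancel-then-clean-up ordering in that finally block can be checked in isolation with a small, self-contained sketch. Everything here is a stand-in: close_clients is a hypothetical placeholder for close_litellm_async_clients, and the sleeping task plays the role of runner_task. It does not reproduce the leak; it only shows that the cleanup call runs after the task has been cancelled.

```python
import asyncio


async def close_clients(closed: list) -> None:
    """Hypothetical stand-in for close_litellm_async_clients()."""
    closed.append("clients closed")


async def rollout(closed: list) -> str:
    # Stand-in for the long-running runner task.
    runner_task = asyncio.create_task(asyncio.sleep(60))
    try:
        await asyncio.sleep(0)  # pretend the rollout did some work
        return "done"
    finally:
        if not runner_task.done():
            runner_task.cancel()
            try:
                # Give the cancelled task up to 2 seconds to finish.
                await asyncio.wait_for(runner_task, timeout=2.0)
            except (asyncio.TimeoutError, asyncio.CancelledError):
                pass
        # Cleanup runs whether or not the task had to be cancelled.
        await close_clients(closed)


closed = []
result = asyncio.run(rollout(closed))
print(result, closed)
```

Unlike the snippet above, this version does not swallow every Exception, so a failure inside the cleanup call would surface instead of being hidden.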
