Conversation


@challenger71498 challenger71498 commented Dec 26, 2025

This PR fixes a failure in Session cleanup caused by the task cancellation from __aexit__ not being handled properly.

Motivation and Context

Session cleans up the receive streams in the finally clause of BaseSession::_receive_loop.

  • When the session is closed, or the async context is exited (which calls __aexit__), the TaskGroup is cancelled.
  • When a TaskGroup is cancelled, it cancels every child task.
  • The parent of the task we use in the finally clause (such as MemoryObjectStream::send) is its base coroutine _receive_loop, which belongs to the very TaskGroup being closed.
  • Therefore the cleanup process itself is cancelled during shutdown, leaving the receive streams unclosed and leaking them.

I fixed this problem by wrapping the finally clause in a CancelScope with shield=True, which ensures the cleanup logic runs to completion even when cancellation of the TaskGroup has been requested.
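As a minimal sketch (assuming anyio memory streams; the class and attribute names follow the description above rather than being copied from the SDK), the pattern looks like this:

```python
import anyio


class BaseSession:
    async def _receive_loop(self) -> None:
        try:
            async for message in self._read_stream:
                ...  # route the message to the matching response stream
        finally:
            # Shield the cleanup from the cancellation that is tearing down
            # the enclosing TaskGroup, so every pending response stream is
            # closed instead of the cleanup itself being cancelled.
            with anyio.CancelScope(shield=True):
                for stream in self._response_streams.values():
                    await stream.aclose()
                self._response_streams.clear()
```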

Also, cleaning up the receive streams iterates over the stream dictionary, so the dictionary must not change during that loop.

  • Mutating a dictionary while iterating over it raises a RuntimeError, which should never happen in production.
  • BaseSession as-is does not protect the dictionary from changes during cleanup.
  • To handle this minor issue, I added a simple flag that signals the session is cleaning up its resources (see the sketch after this list).
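A hypothetical sketch of that flag; the attribute and helper names here are illustrative, not necessarily the ones used in the actual change:

```python
class BaseSession:
    def __init__(self) -> None:
        self._response_streams: dict[int, object] = {}
        self._closing = False  # set to True once cleanup starts

    def _remove_response_stream(self, request_id: int) -> None:
        # While the finally clause iterates over _response_streams, other
        # coroutines must not mutate the dict, or the loop would raise
        # "RuntimeError: dictionary changed size during iteration".
        if self._closing:
            return
        self._response_streams.pop(request_id, None)
```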

How Has This Been Tested?

I found this issue while debugging a problem where using ADK AgentTool with MCP causes an abstruse error:

RuntimeError: Attempted to exit cancel scope in a different task than it was entered in

After a week of debugging, I realised that the complete fix for this problem is quite large and needs changes in both the MCP SDK and Google ADK; this PR is one of them.

I added a simple test that triggers this issue.

  • In this test:
    • The server never responds.
    • A ClientSession is created and 2 ping requests are sent to the server asynchronously.
    • While the 2 tasks are still waiting (since the server never responds), the async with block is exited.
    • The ClientSession is closed.
    • Assert that the total count of response streams equals 0, not 2.
    • Assert that the closed ClientSession fails to send a ping.
  • BaseSession as-is fails this test: the response stream count is still 2, because the cleanup logic gets cancelled. (A hedged sketch of the test flow appears after this list.)
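A hedged sketch of that test flow; the stream wiring, buffer sizes, and pytest/anyio plumbing here are assumptions, while send_ping and _response_streams come from the SDK and the description above:

```python
import anyio
import pytest
from mcp import ClientSession


async def _ping_ignoring_errors(session: ClientSession) -> None:
    # The ping is expected to fail once the session is torn down; the test
    # only cares that its response stream gets cleaned up.
    try:
        await session.send_ping()
    except Exception:
        pass


@pytest.mark.anyio
async def test_response_streams_closed_on_exit() -> None:
    # The "server" side never reads or replies, so both pings wait forever.
    client_to_server_send, client_to_server_recv = anyio.create_memory_object_stream(10)
    server_to_client_send, server_to_client_recv = anyio.create_memory_object_stream(10)

    async with anyio.create_task_group() as tg:
        async with ClientSession(server_to_client_recv, client_to_server_send) as session:
            tg.start_soon(_ping_ignoring_errors, session)
            tg.start_soon(_ping_ignoring_errors, session)
            await anyio.sleep(0.1)
        # Exiting the async with block must close both pending response streams.
        assert len(session._response_streams) == 0

    # A closed session must refuse to send further requests.
    with pytest.raises(Exception):
        await session.send_ping()
```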

Breaking Changes

  • No breaking changes at all since this is just a minor bugfix.
  • There might be a side effect: since the finally block now runs until every stream is closed, __aexit__ blocks until the finally block finishes executing.
    • If the finally block runs for too long, users might think the session is hanging.
    • But I don't think this will really be a problem: the streams are in-memory rather than over an actual network, and the number of streams won't be huge (roughly ~1000 at most), so it should be fine.
    • If it DOES become a problem, we could simply close the streams in parallel, as sketched below. (I don't want to make that large a change in this PR though.)
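For reference, a purely illustrative sketch of that parallel variant; the helper name and signature are hypothetical and this is not part of the PR:

```python
import anyio
from anyio.streams.memory import MemoryObjectSendStream


async def close_streams_in_parallel(streams: dict[int, MemoryObjectSendStream]) -> None:
    # Close every pending response stream concurrently, still shielded from
    # the cancellation that is tearing down the parent TaskGroup.
    with anyio.CancelScope(shield=True):
        async with anyio.create_task_group() as tg:
            for stream in streams.values():
                tg.start_soon(stream.aclose)
        streams.clear()
```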

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation update

Checklist

  • I have read the MCP Documentation
  • My code follows the repository's style guidelines
  • New and existing tests pass locally
  • I have added appropriate error handling
  • I have added or updated documentation as needed (no docs needed to be updated)

Additional context

This PR MAY resolve the issues below:

I think this problem is related to a task leak (or indefinite waiting) in Google ADK, since in ADK the tool call awaits forever until the receive stream responds.

The session must close _response_streams properly when exiting the async with scope. The session as-is fails the test because the cleanup logic is forcefully cancelled.
Cleanup logic should not be cancellable via a CancelScope, which is why the shield=True parameter is required.
Since we use a for-loop while cleaning up the _response_streams dict, the dict must not change during cleanup. This is handled easily with a simple flag.
