Skip to content

Conversation

@enyst
Copy link
Collaborator

@enyst enyst commented Jan 20, 2026

Follow-up to #1682.

  • Subscription mode requires for Responses; LiteLLM returns a streaming iterator.
  • Drain the iterator and use the completed response so Agent/Conversation can proceed.
  • Emit a one-time info log when subscription mode is active (helps confirm runtime path).

Subscription auth requires Responses streaming; LiteLLM returns a streaming iterator instead of a ResponsesAPIResponse. Drain the iterator and use the completed response, and log once when subscription mode is active.
@enyst enyst closed this Jan 20, 2026
Copy link
Collaborator

@all-hands-bot all-hands-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR successfully implements subscription mode streaming support for the Responses API. However, there are several issues that should be addressed, particularly around error handling, type validation, and exception usage. See inline comments for details.

Comment on lines +808 to +810
assert isinstance(completed_resp, ResponsesAPIResponse), (
f"Expected ResponsesAPIResponse, got {type(completed_resp)}"
)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔴 Critical: Using assert for runtime type validation is problematic. Assertions can be disabled with Python's -O (optimize) flag, which would skip this check in production and could lead to silent failures.

Replace with an explicit check and proper exception:

Suggested change
assert isinstance(completed_resp, ResponsesAPIResponse), (
f"Expected ResponsesAPIResponse, got {type(completed_resp)}"
)
if not isinstance(completed_resp, ResponsesAPIResponse):
raise TypeError(
f"Expected ResponsesAPIResponse, got {type(completed_resp)}"
)

Comment on lines +787 to +790
if not isinstance(ret, SyncResponsesAPIStreamingIterator):
raise AssertionError(
f"Expected Responses stream iterator, got {type(ret)}"
)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟠 Important: Raising AssertionError for validation failures is unconventional. AssertionError should typically be reserved for programmer errors and invariants, not runtime type validation. Consider using TypeError or a custom exception like LLMResponseError instead.

Suggested change
if not isinstance(ret, SyncResponsesAPIStreamingIterator):
raise AssertionError(
f"Expected Responses stream iterator, got {type(ret)}"
)
if not isinstance(ret, SyncResponsesAPIStreamingIterator):
raise TypeError(
f"Expected Responses stream iterator, got {type(ret)}"
)

Comment on lines +792 to +794
# Drain stream until a completed response is captured.
for _ in ret:
pass
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟠 Important: The stream draining loop has no timeout or error handling. If the stream:

  • Hangs or yields indefinitely
  • Raises an exception during iteration
  • Takes an unusually long time

This could cause the entire request to hang or fail unexpectedly. Consider adding a timeout mechanism or wrapping in a try-except to provide better error messages if streaming fails.

Comment on lines +814 to 816
raise AssertionError(
f"Expected ResponsesAPIResponse, got {type(ret)}"
)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟠 Important: Same issue as above - using AssertionError for runtime validation. This should be a TypeError or similar exception.

Suggested change
raise AssertionError(
f"Expected ResponsesAPIResponse, got {type(ret)}"
)
raise TypeError(
f"Expected ResponsesAPIResponse, got {type(ret)}"
)

Comment on lines +783 to +785
from litellm.responses.streaming_iterator import (
SyncResponsesAPIStreamingIterator,
)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 Suggestion: This import is performed inside the execution path. If the import fails (e.g., due to litellm version mismatch or API changes), the error message might not be immediately clear to developers.

Consider either:

  1. Importing at module level if this is a stable API
  2. Wrapping in try-except with a more descriptive error:
try:
    from litellm.responses.streaming_iterator import (
        SyncResponsesAPIStreamingIterator,
    )
except ImportError as e:
    raise ImportError(
        f"Failed to import SyncResponsesAPIStreamingIterator. "
        f"This may indicate an incompatible litellm version: {e}"
    ) from e

"Streaming callbacks are not supported for Responses API yet"
)

streaming_requested = bool(call_kwargs.get("stream", False) or self.stream)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟢 Nit: Consider adding parentheses for clarity, since the or operation happens before the bool() conversion:

Suggested change
streaming_requested = bool(call_kwargs.get("stream", False) or self.stream)
streaming_requested = bool(call_kwargs.get("stream", False)) or self.stream

Actually, self.stream is already a bool, so this could be simplified to:

streaming_requested = call_kwargs.get("stream", False) or self.stream

@enyst enyst reopened this Jan 20, 2026
@enyst enyst closed this Jan 20, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants