fix(models): surface empty LiteLlm streaming completions as error event#6195

Open
kevin-hs-sohn wants to merge 1 commit into
google:mainfrom
kevin-hs-sohn:fix/lite-llm-empty-stream-surface-error

@kevin-hs-sohn

Why

Streaming completions where the provider returns a finish_reason but neither text nor tool calls currently yield zero LlmResponse events: aggregated_llm_response is only set when (text or reasoning_parts) is truthy, and aggregated_llm_response_with_tool_call needs a function_call. With neither, the loop exits and the downstream Runner observes a silent, seemingly successful empty stream.
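
The failure mode can be reproduced with a small stand-in for the aggregation loop. This is a sketch, not the ADK code: `Chunk`, `aggregate`, and the dict-shaped responses are hypothetical names that only mirror the two conditions described above.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Optional


@dataclass
class Chunk:
    """Hypothetical stand-in for one streamed provider chunk."""
    text: str = ""
    function_call: Optional[str] = None
    finish_reason: Optional[str] = None


def aggregate(chunks: Iterable[Chunk]) -> Iterator[dict]:
    """Mimics the pre-fix behaviour: yield only for text or a tool call."""
    text = ""
    function_call = None
    for chunk in chunks:
        text += chunk.text
        function_call = chunk.function_call or function_call
        # chunk.finish_reason is visible here but never recorded.
    if text:
        yield {"text": text}
    if function_call:
        yield {"function_call": function_call}


# A stream that ends with a finish_reason but no content yields nothing.
empty_stop = list(aggregate([Chunk(finish_reason="stop")]))
normal = list(aggregate([Chunk(text="hi"), Chunk(finish_reason="stop")]))
print(empty_stop)  # []
print(normal)      # [{'text': 'hi'}]
```

The caller cannot tell the first stream apart from an iterator that never produced a chunk, which is the silent case this PR targets.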

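This pattern is reported across multiple stalled fix attempts:

  • #5394: AnthropicLlm never populates finish_reason on LlmResponse
  • #5006: retry with resume message when model returns empty response
  • #5636: surface error when model returns STOP with empty content
  • #3618 / #3699: Handle empty message in LiteLLM response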

It hits providers under real conditions: Anthropic content_filter, Gemini 2.5-flash-lite STOP-with-empty after tool calls, 0-token completions under safety, and model_not_found responses normalized to stop. From the user's perspective the agent "successfully" ends a turn with no visible output, and downstream agent frameworks get no actionable signal to retry, surface, or escalate.

Change

  • Track last_finish_reason + last_model_version across the stream.
  • After both aggregated_llm_response and aggregated_llm_response_with_tool_call checks, if BOTH are None AND a finish_reason was observed, yield ONE LlmResponse with error_code set to the mapped finish_reason, error_message describing the failure mode, and model_version preserved. usage_metadata + grounding_metadata attach to that response.
  • Minimum-surface: the guard only fires when the stream produced no aggregated response AND a finish_reason was observed. Streams that genuinely yield nothing (test doubles, empty iterators) stay byte-identical.
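
The three bullets above can be sketched roughly as follows. This is an illustration under stated assumptions, not the diff itself: `Chunk`, `Response`, and `aggregate_with_guard` are hypothetical stand-ins, only the `content_filter` → `SAFETY` and `stop` → `STOP` mappings come from the test list in this PR, the upper-casing fallback is assumed, and `usage_metadata` / `grounding_metadata` are left out for brevity.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Optional

# Only these two mappings are taken from the PR's test list.
FINISH_REASON_TO_ERROR_CODE = {"content_filter": "SAFETY", "stop": "STOP"}


@dataclass
class Chunk:
    """Hypothetical stand-in for one streamed provider chunk."""
    text: str = ""
    function_call: Optional[str] = None
    finish_reason: Optional[str] = None
    model: Optional[str] = None


@dataclass
class Response:
    """Hypothetical stand-in for an LlmResponse."""
    text: Optional[str] = None
    function_call: Optional[str] = None
    error_code: Optional[str] = None
    error_message: Optional[str] = None
    model_version: Optional[str] = None


def aggregate_with_guard(chunks: Iterable[Chunk]) -> Iterator[Response]:
    text = ""
    function_call = None
    last_finish_reason = None
    last_model_version = None
    for chunk in chunks:
        text += chunk.text
        function_call = chunk.function_call or function_call
        # Remember the most recent finish_reason and model across the stream.
        last_finish_reason = chunk.finish_reason or last_finish_reason
        last_model_version = chunk.model or last_model_version

    text_response = (
        Response(text=text, model_version=last_model_version) if text else None
    )
    tool_response = (
        Response(function_call=function_call, model_version=last_model_version)
        if function_call
        else None
    )
    if text_response is not None:
        yield text_response
    if tool_response is not None:
        yield tool_response

    # Guard: nothing was aggregated, but the provider did report a finish_reason.
    if (
        text_response is None
        and tool_response is None
        and last_finish_reason is not None
    ):
        yield Response(
            # Assumed fallback: upper-case any finish_reason not in the table.
            error_code=FINISH_REASON_TO_ERROR_CODE.get(
                last_finish_reason, last_finish_reason.upper()
            ),
            error_message=(
                "Stream ended with finish_reason="
                f"{last_finish_reason!r} but no text or tool calls."
            ),
            model_version=last_model_version,
        )


blocked = list(
    aggregate_with_guard([Chunk(finish_reason="content_filter", model="provider-model-1")])
)
untouched = list(aggregate_with_guard([]))
```

Here `blocked` holds a single error response, while `untouched` stays an empty list because no finish_reason was ever observed.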

Tests

tests/unittests/models/test_litellm.py (4 new):

  • content_filter-empty → surfaces with SAFETY error_code
  • stop-empty → surfaces with STOP finish_reason + error_message
  • Normal text stream → empty-guard does NOT fire (regression guard)
  • Literally-empty stream (no chunks, no finish_reason) → byte-identical zero responses

281 lite_llm tests pass + 1 skip; 0 regressions.
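
For readers without the diff open, the four cases have roughly this shape. The sketch uses `unittest` against a hypothetical `run_stream` helper that reduces the guarded loop to tuples; it is not the code in tests/unittests/models/test_litellm.py, and the tuple encoding is an assumption made for brevity.

```python
import unittest
from typing import Iterable, List, Optional, Tuple

# Each chunk is (text, finish_reason); each result is (kind, value).
Chunk = Tuple[str, Optional[str]]
ERROR_CODES = {"content_filter": "SAFETY", "stop": "STOP"}


def run_stream(chunks: Iterable[Chunk]) -> List[Tuple[str, str]]:
    """Hypothetical stand-in for the guarded streaming loop."""
    text, finish_reason = "", None
    for chunk_text, chunk_finish in chunks:
        text += chunk_text
        finish_reason = chunk_finish or finish_reason
    if text:
        return [("text", text)]
    if finish_reason is not None:
        return [("error", ERROR_CODES.get(finish_reason, finish_reason.upper()))]
    return []


class EmptyStreamGuardTest(unittest.TestCase):

    def test_content_filter_empty_surfaces_safety(self):
        self.assertEqual(run_stream([("", "content_filter")]), [("error", "SAFETY")])

    def test_stop_empty_surfaces_stop(self):
        self.assertEqual(run_stream([("", "stop")]), [("error", "STOP")])

    def test_normal_text_does_not_trigger_guard(self):
        self.assertEqual(
            run_stream([("hello", None), ("", "stop")]), [("text", "hello")]
        )

    def test_literally_empty_stream_yields_nothing(self):
        self.assertEqual(run_stream([]), [])


# Run the cases programmatically so the result can be inspected.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(EmptyStreamGuardTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```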

Context

Filed after several months of stalled PRs in this area (#5512 closed without merge, #5636 closed without merge, #5006 / #3699 open). Submitting a fresh attempt that addresses the same class of bugs with minimum-surface logic + thorough regression coverage. Happy to revise based on review.

Co-Authored-By: Kevin Sohn kevin@openmagi.ai

@google-cla

google-cla Bot commented Jun 23, 2026

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

Streaming completions where the provider returns a finish_reason but
no text + no tool calls currently produce ZERO yielded LlmResponse
events: ``aggregated_llm_response`` only gets set when
``(text or reasoning_parts)`` is truthy, and ``aggregated_llm_response_with_tool_call``
needs a function_call. With neither, the loop simply exits and the
downstream Runner observes a silent successful empty stream.

This pattern is reported across multiple stalled fix attempts:

  * google#5394 — AnthropicLlm never populates finish_reason on LlmResponse
  * google#5006 — retry with resume message when model returns empty response
  * google#5636 — surface error when model returns STOP with empty content
  * google#3618 / google#3699 — Handle empty message in LiteLLM response

It hits providers under several real conditions: anthropic
content_filter, gemini 2.5-flash-lite STOP-with-empty after tool calls,
0-token completions under safety, model_not_found responses normalized
to stop, etc. From the user's perspective the agent "successfully" ends
a turn with no visible output.

Fix
- Track ``last_finish_reason`` + ``last_model_version`` across the
  stream so we can attribute the empty response.
- After both ``aggregated_llm_response`` and
  ``aggregated_llm_response_with_tool_call`` checks, if BOTH are None
  AND a finish_reason was observed, yield ONE LlmResponse with
  ``error_code`` set to the mapped finish_reason, ``error_message``
  describing the failure mode, and the provider's ``model_version``
  preserved. ``usage_metadata`` + ``grounding_metadata`` (if any)
  attach to that response so callers do not lose them.
- Minimum-surface change: the guard only fires when the stream
  produced no aggregated response AND a finish_reason was observed.
  Streams that genuinely yield nothing (test doubles, empty
  iterators) stay byte-identical.

Tests
- tests/unittests/models/test_litellm.py adds 4 cases:
  * content_filter-empty → surfaces with SAFETY error_code
  * stop-empty → surfaces with STOP finish_reason + error_message
  * normal text stream → empty-guard does NOT fire (regression)
  * literally-empty stream (no chunks, no finish_reason) →
    byte-identical zero responses

281 lite_llm tests pass + 1 skip; 0 regressions.
@kevin-hs-sohn kevin-hs-sohn force-pushed the fix/lite-llm-empty-stream-surface-error branch from 0d97329 to 7240cdb Compare June 23, 2026 01:09
@kevin-hs-sohn

I signed the CLA. Re-pushed without the Co-Authored-By trailer so the check only sees my GitHub-account email.

@adk-bot adk-bot added the models [Component] Issues related to model support label Jun 23, 2026