fix(openai): Handles Bedrock-style context overflow errors for OpenAI-compatible endpoints #1529
Conversation
Adds logic to recognize and handle context window overflow errors that are returned in a Bedrock-style format by OpenAI-compatible endpoints that wrap Bedrock models. Also:

- Handles OpenAI rate limit errors consistently, ensuring they are treated as throttling exceptions across different code paths. This prevents rate limits from being misinterpreted as context window overflows.
- Fixes Bedrock-style overflow detection when rate limited.
Updates the model to recognize a wider range of context overflow error messages. This change expands error handling to include messages from various providers, ensuring more robust detection of context window overflows when using OpenAI-compatible endpoints.
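As a rough illustration of the behavior described above (not the PR's actual code), the detection could look something like the sketch below; the helper name and the exact message strings are assumptions:

```python
# Hypothetical sketch of the detection described above; the message strings,
# helper name, and import paths are assumptions, not the PR's exact code.
import openai

from strands.types.exceptions import (
    ContextWindowOverflowException,
    ModelThrottledException,
)

# Bedrock-style overflow messages observed from OpenAI-compatible wrappers.
OVERFLOW_MESSAGES = [
    "Input is too long for requested model",
    "input length and `max_tokens` exceed context limit",
    "too many total text bytes",
]


def _convert_api_error(error: openai.APIError) -> Exception:
    """Map a raw APIError onto the exception types the agent loop understands."""
    # Rate limits are checked first so that a throttling response mentioning
    # token counts is never misread as a context window overflow.
    if isinstance(error, openai.RateLimitError):
        return ModelThrottledException(str(error))
    if any(message in str(error) for message in OVERFLOW_MESSAGES):
        return ContextWindowOverflowException(str(error))
    return error  # unrelated APIErrors pass through unchanged
```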
New tests:

1. `test_stream_alternative_context_overflow_messages` (3 parametrized tests): verifies that the `stream()` method converts `APIError` with alternative overflow messages to `ContextWindowOverflowException`; parametrized over all 3 error message patterns.
2. `test_structured_output_alternative_context_overflow_messages` (3 parametrized tests): verifies that the `structured_output()` method converts `APIError` with alternative overflow messages; parametrized over all 3 error message patterns.
3. `test_stream_api_error_passthrough` (1 test): verifies that an `APIError` without an overflow message passes through unchanged, so unrelated API errors aren't accidentally caught.

Test coverage. The new tests verify:

- ✅ All 3 alternative overflow messages are detected
- ✅ Both `stream()` and `structured_output()` handle the errors
- ✅ Errors are properly converted to `ContextWindowOverflowException`
- ✅ The original exception is preserved as `__cause__`
- ✅ Non-overflow `APIError` exceptions pass through unchanged
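A condensed sketch of this style of test, written against the hypothetical `_convert_api_error` helper and `OVERFLOW_MESSAGES` list from the sketch above rather than the PR's real fixtures:

```python
# Condensed sketch of the parametrized coverage described above; names are
# illustrative and the helper under test comes from the earlier sketch.
import httpx
import openai
import pytest


def _api_error(message: str) -> openai.APIError:
    # Build a minimal APIError carrying only the message text.
    return openai.APIError(
        message, request=httpx.Request("POST", "https://example.invalid"), body=None
    )


@pytest.mark.parametrize("message", OVERFLOW_MESSAGES)
def test_alternative_context_overflow_messages(message):
    converted = _convert_api_error(_api_error(message))
    assert isinstance(converted, ContextWindowOverflowException)


def test_api_error_passthrough():
    # An unrelated APIError must not be swallowed by the overflow handling.
    error = _api_error("some unrelated failure")
    assert _convert_api_error(error) is error
```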
Codecov Report: ❌ Patch coverage check is failing.
The change seems good to me. Linters are failing, though; you can run the lint auto-fix commands to fix them.
These messages are often returned by OpenAI-compatible endpoints that wrap other providers.
@mkmeral Another review, please.
dbschmigelski left a comment
Blocking to discuss with the team the long-term impacts of this change and who should be contacted to correct their adherence to the interface.
dbschmigelski left a comment
@dinindunz is this not a case where Databricks is not properly adhering to the interface?
There already exists a mechanism for detecting context overflow, so could you shed some light on why you believe this is a Strands issue and not a Databricks one?
@dbschmigelski It's not just a Databricks issue; I think it's an API-shape and provider mismatch issue. There are multiple services that expose LLMs over the OpenAI API schema, and while inputs are well defined, error messages are not. So I think this is something the SDK should solve to make developers' lives easier.
Hi, we discussed offline. We are going to go ahead and merge this to unblock. But to address the leakiness, or rather the need for the leakiness, we are going to reach out internally to Bedrock. For this PR to be needed, it seems like they aren't doing two things.
From there we'll see if there are updates needed in Bedrock and/or OpenAI's specification, with the eventual goal of removing this Bedrock-specific code. Thanks
The code coverage is technically failing, but I personally think you have good coverage of the feature. That percentage value is less of an issue. I'll merge this now.
…-compatible endpoints (strands-agents#1529)
Description
Fixes context overflow detection for OpenAI-compatible endpoints that wrap Bedrock models (e.g., Databricks Model Serving).
Problem:
The OpenAI provider only caught BadRequestError with code "context_length_exceeded". When using Databricks endpoints serving Bedrock models, context overflow errors come as APIError with Bedrock-style messages like "Input is too long for requested model". These were not converted to ContextWindowOverflowException, preventing SummarizingConversationManager and other conversation managers from triggering.
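To make the mismatch concrete, here is a rough illustration of the two error shapes; the URLs and bodies are made-up examples, not captured responses:

```python
# Rough illustration of the two error shapes; URLs and bodies are made-up
# examples, not captured responses.
import httpx
import openai

request = httpx.Request("POST", "https://example.invalid/v1/chat/completions")

# The shape the provider already caught: an OpenAI-native overflow error
# carrying the well-known "context_length_exceeded" code.
native_overflow = openai.BadRequestError(
    "This model's maximum context length has been exceeded",
    response=httpx.Response(400, request=request),
    body={"code": "context_length_exceeded"},
)

# The shape a Bedrock-wrapping endpoint returns instead: a generic APIError
# whose only overflow signal is the message text itself.
bedrock_style = openai.APIError(
    "Input is too long for requested model",
    request=request,
    body=None,
)
```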
Solution:
Recognizes Bedrock-style overflow messages carried in APIError and converts them to ContextWindowOverflowException, checking for rate limit errors first so those continue to surface as throttling exceptions.
Impact:
Agents using OpenAI-compatible endpoints that wrap Bedrock models can now properly handle context overflow with conversation managers.
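For example, a setup along these lines benefits from the fix; the endpoint URL and model id below are placeholders, and the import paths follow the strands docs rather than anything in this PR:

```python
# Illustrative setup only: the endpoint URL and model id are placeholders, and
# the import paths follow the strands docs rather than anything in this PR.
from strands import Agent
from strands.agent.conversation_manager import SummarizingConversationManager
from strands.models.openai import OpenAIModel

# An OpenAI-compatible endpoint fronting a Bedrock model (e.g. Databricks
# Model Serving).
model = OpenAIModel(
    client_args={
        "base_url": "https://example.cloud.databricks.com/serving-endpoints",
        "api_key": "<token>",
    },
    model_id="my-bedrock-backed-endpoint",
)

# With this fix, a Bedrock-style "Input is too long ..." APIError surfaces as
# ContextWindowOverflowException, letting the manager summarize older turns
# and retry instead of failing the run.
agent = Agent(model=model, conversation_manager=SummarizingConversationManager())
```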
Related Issues
Closes #1528
Documentation PR
N/A
Type of Change
Bug fix
Testing
How have you tested the change?

- Verified that the changes do not break functionality or introduce warnings in consuming repositories: agents-docs, agents-tools, agents-cli
- `hatch run prepare`

Checklist
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.