GPT models 1M context#11838
Conversation
…arpdotdev/warp into dagm/allow-gpt-models-1M-context
|
I'm starting a first review of this pull request. You can view the conversation on Warp. I completed the review and no human review was requested for this pull request. Comment Powered by Oz |
| base: params.model.into(), | ||
| cli_agent: params.cli_agent_model.into(), | ||
| computer_use_agent: params.computer_use_model.into(), | ||
| base_model_context_window_limit: if FeatureFlag::ConfigurableContextWindow |
There was a problem hiding this comment.
@coolcom200 same question here.
Is it okay to remove this feature flag now?
There was a problem hiding this comment.
Yes should be safe to remove.
There was a problem hiding this comment.
Overview
This PR replaces the old configurable context window gate with GPT-specific OpenAI gating, relies on server-provided model metadata for context-window support, adds long-context pricing warnings in the relevant settings/editor surfaces, and covers the new gating/sanitization behavior with unit tests. No approved spec context was available, and the supplemental security pass did not find security concerns.
Concerns
- One non-blocking UI consistency issue: the warning path should use the same clamped effective limit as the request path so stale persisted values do not show an OpenAI pricing warning when the model's current max is below the warning threshold.
Verdict
Found: 0 critical, 0 important, 1 suggestions
Approve with nits
Comment /oz-review on this pull request to retrigger a review (up to 3 times on the same pull request).
Powered by Oz
| Some( | ||
| self.context_window_limit | ||
| .unwrap_or(llm.context_window.default_max), | ||
| ), |
There was a problem hiding this comment.
💡 [SUGGESTION] Clamp the warning input the same way context_window_limit_for_request does. A stale persisted override above the current model max would make this warning appear even though the outgoing request and slider are constrained to the lowered range.
| Some( | |
| self.context_window_limit | |
| .unwrap_or(llm.context_window.default_max), | |
| ), | |
| Some( | |
| self.context_window_limit | |
| .unwrap_or(llm.context_window.default_max) | |
| .clamp(llm.context_window.min, llm.context_window.max), | |
| ), |
Co-Authored-By: Oz <oz-agent@warp.dev>
|
I'm starting a first review of this pull request. You can view the conversation on Warp. I completed the review and no human review was requested for this pull request. Comment Powered by Oz |
There was a problem hiding this comment.
Overview
This PR replaces the old broad configurable-context-window flag with GPT-specific gating, routes request context-window limits through model metadata sanitization, and adds OpenAI long-context pricing warnings in the AI settings/profile UI. It also updates the reusable warning box to support formatted hyperlink titles and adds unit coverage for the new context-window helper behavior.
Concerns
- No blocking correctness, security, or spec-alignment concerns found in the annotated diff.
Verdict
Found: 0 critical, 0 important, 0 suggestions
Approve
Comment /oz-review on this pull request to retrigger a review (up to 3 times on the same pull request).
Powered by Oz
…arpdotdev/warp into dagm/allow-gpt-models-1M-context # Conflicts: # app/src/view_components/warning_box.rs
| /// Enables configurable expanded context windows for eligible GPT models. | ||
| GPTConfigurableContextWindow, |
There was a problem hiding this comment.
ooc do we need a client-side feature flag for this? the server already drives whether long context for GPT models is enabled -- it feels sufficient to let the server handle that decision
There was a problem hiding this comment.
I think the server should be the source of truth for which models are and are not configurable. We can add a flag on the client side just as a control for rollout and clean it up later though.
There was a problem hiding this comment.
we can already control the rollout via the server, and avoiding the client flag seems like it would make some of the client code simpler.
not a huge deal though, up to you!
Can you send me the conversation ID? I can take a look. It's working on my end with the server changes https://www.loom.com/share/491850b7038241519ff7a45612af5e5c |
| @@ -303,13 +303,7 @@ impl RequestParams { | |||
| .active_profile(terminal_view_id, app) | |||
| .data() | |||
| .clone(); | |||
There was a problem hiding this comment.
do we still need the clone here if we're going to call context_window_limit_for_request to get the limit directly? feels like we could simplify this part of the code now
| } | ||
| } | ||
|
|
||
| pub(crate) fn sanitize_context_window_limit_for_request( |
There was a problem hiding this comment.
this method seems like it's only used in context_window_limit_for_request? I have a slight preference for inlining the method, feels like it'd be a bit easier to read.
| pub fn with_formatted_title(mut self, formatted_title: FormattedTextInline) -> Self { | ||
| self.formatted_title = Some(formatted_title); | ||
| self | ||
| } |
There was a problem hiding this comment.
the API is a bit awkward here since to construct a WarningBoxConfig with a formatted title, you have to go through the new constructor and pass in a non-formatted title that will get overwritten.
we have a couple other UI components where the label can be either text or some custom element. a good example of this is RadioButtonItem. it's a bit cleaner since we can use an enum to encode the "text or custom element" constraint in code.
|
|
||
| selected_limit.map(|limit| limit.clamp(llm.context_window.min, llm.context_window.max)) | ||
| } | ||
| pub(crate) fn has_effective_configurable_context_window( |
There was a problem hiding this comment.
nit: what does it mean semantically to have an "effective" configurable context window. could we name the method has_effective_configurable_context_window?
| use crate::server::sync_queue::QueueItem; | ||
| use crate::settings::AISettings; | ||
| use crate::workspaces::user_workspaces::UserWorkspaces; | ||
| pub const LONG_CONTEXT_WARNING_THRESHOLD: u32 = 272_000; |
There was a problem hiding this comment.
we can do this in another PR, but I think we should pass the threshold value down from the server. we might need to modify the graphql/protos to pass that info alongside the rest of the context window values.
having this be server driven makes it easier to support new models with different thresholds in the future. imagine OpenAI releases a new model with a different threshold value. we want to avoid needing an out-of-band client release just to support the new model.
can we specify in a comment that this threshold is only relevant for GPT 5.4 + 5.5 for now, and wire the threshold value down from the server in a follow-up PR?
| /// Enables configurable expanded context windows for eligible GPT models. | ||
| GPTConfigurableContextWindow, |
There was a problem hiding this comment.
we can already control the rollout via the server, and avoiding the client flag seems like it would make some of the client code simpler.
not a huge deal though, up to you!
I tried again on your updated server branch, and it worked fine. Might've had something weird in my local setup, ignore that error! |

Description
gpt-5.4,gpt-5.5,grape)>272K(OpenAI threshold)GPTConfigurableContextWindowflagconfigurable_context_windowTesting
./script/runScreenshots / Videos
Default
Warning
Agent Mode