Skip to content

GPT models 1M context#11838

Open
dagmfactory wants to merge 26 commits into
masterfrom
dagm/allow-gpt-models-1M-context
Open

GPT models 1M context#11838
dagmfactory wants to merge 26 commits into
masterfrom
dagm/allow-gpt-models-1M-context

Conversation

@dagmfactory
Copy link
Copy Markdown
Contributor

@dagmfactory dagmfactory commented May 28, 2026

Description

  • Enabling 1M Context for supported GPT models (gpt-5.4, gpt-5.5, grape)
  • OpenAI specific long-context pricing warning when context window >272K (OpenAI threshold)
  • Server as source of truth for which models support configurable context windows
  • change is behind GPTConfigurableContextWindow flag
  • removing stale flag configurable_context_window

Testing

  • I have manually tested my changes locally with ./script/run

Screenshots / Videos

Default

Screenshot 2026-05-28 at 3 16 36 PM

Warning

Screenshot 2026-05-28 at 3 17 50 PM

Agent Mode

  • Warp Agent Mode - This PR was created via Warp's AI Agent Mode

@cla-bot cla-bot Bot added the cla-signed label May 28, 2026
@dagmfactory dagmfactory marked this pull request as ready for review May 28, 2026 20:05
@oz-for-oss
Copy link
Copy Markdown
Contributor

oz-for-oss Bot commented May 28, 2026

@dagmfactory

I'm starting a first review of this pull request.

You can view the conversation on Warp.

I completed the review and no human review was requested for this pull request.

Comment /oz-review on this pull request to retrigger a review (up to 3 times on the same pull request).

Powered by Oz

base: params.model.into(),
cli_agent: params.cli_agent_model.into(),
computer_use_agent: params.computer_use_model.into(),
base_model_context_window_limit: if FeatureFlag::ConfigurableContextWindow
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@coolcom200 same question here.
Is it okay to remove this feature flag now?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes should be safe to remove.

Copy link
Copy Markdown
Contributor

@oz-for-oss oz-for-oss Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overview

This PR replaces the old configurable context window gate with GPT-specific OpenAI gating, relies on server-provided model metadata for context-window support, adds long-context pricing warnings in the relevant settings/editor surfaces, and covers the new gating/sanitization behavior with unit tests. No approved spec context was available, and the supplemental security pass did not find security concerns.

Concerns

  • One non-blocking UI consistency issue: the warning path should use the same clamped effective limit as the request path so stale persisted values do not show an OpenAI pricing warning when the model's current max is below the warning threshold.

Verdict

Found: 0 critical, 0 important, 1 suggestions

Approve with nits

Comment /oz-review on this pull request to retrigger a review (up to 3 times on the same pull request).

Powered by Oz

Comment on lines +483 to +486
Some(
self.context_window_limit
.unwrap_or(llm.context_window.default_max),
),
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 [SUGGESTION] Clamp the warning input the same way context_window_limit_for_request does. A stale persisted override above the current model max would make this warning appear even though the outgoing request and slider are constrained to the lowered range.

Suggested change
Some(
self.context_window_limit
.unwrap_or(llm.context_window.default_max),
),
Some(
self.context_window_limit
.unwrap_or(llm.context_window.default_max)
.clamp(llm.context_window.min, llm.context_window.max),
),

@dagmfactory dagmfactory marked this pull request as draft May 28, 2026 22:28
@dagmfactory dagmfactory marked this pull request as ready for review June 2, 2026 17:38
@oz-for-oss
Copy link
Copy Markdown
Contributor

oz-for-oss Bot commented Jun 2, 2026

@dagmfactory

I'm starting a first review of this pull request.

You can view the conversation on Warp.

I completed the review and no human review was requested for this pull request.

Comment /oz-review on this pull request to retrigger a review (up to 3 times on the same pull request).

Powered by Oz

Copy link
Copy Markdown
Contributor

@oz-for-oss oz-for-oss Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overview

This PR replaces the old broad configurable-context-window flag with GPT-specific gating, routes request context-window limits through model metadata sanitization, and adds OpenAI long-context pricing warnings in the AI settings/profile UI. It also updates the reusable warning box to support formatted hyperlink titles and adds unit coverage for the new context-window helper behavior.

Concerns

  • No blocking correctness, security, or spec-alignment concerns found in the annotated diff.

Verdict

Found: 0 critical, 0 important, 0 suggestions

Approve

Comment /oz-review on this pull request to retrigger a review (up to 3 times on the same pull request).

Powered by Oz

@dagmfactory dagmfactory requested a review from danielpeng2 June 2, 2026 17:49
…arpdotdev/warp into dagm/allow-gpt-models-1M-context

# Conflicts:
#	app/src/view_components/warning_box.rs
Copy link
Copy Markdown
Member

@danielpeng2 danielpeng2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI the screenshots in the description aren't loading.

I tried using long context with your server change and I keep getting errors?
Image

Also a product nit: can we make the warning in the settings appear as you drag, not only after you release the slider?

Comment on lines +874 to +875
/// Enables configurable expanded context windows for eligible GPT models.
GPTConfigurableContextWindow,
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ooc do we need a client-side feature flag for this? the server already drives whether long context for GPT models is enabled -- it feels sufficient to let the server handle that decision

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the server should be the source of truth for which models are and are not configurable. We can add a flag on the client side just as a control for rollout and clean it up later though.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can already control the rollout via the server, and avoiding the client flag seems like it would make some of the client code simpler.

not a huge deal though, up to you!

@dagmfactory
Copy link
Copy Markdown
Contributor Author

dagmfactory commented Jun 2, 2026

can we make the warning in the settings appear as you drag, not only after you release the slider

Can you send me the conversation ID? I can take a look. It's working on my end with the server changes https://www.loom.com/share/491850b7038241519ff7a45612af5e5c

Comment thread app/src/ai/agent/api.rs
@@ -303,13 +303,7 @@ impl RequestParams {
.active_profile(terminal_view_id, app)
.data()
.clone();
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we still need the clone here if we're going to call context_window_limit_for_request to get the limit directly? feels like we could simplify this part of the code now

}
}

pub(crate) fn sanitize_context_window_limit_for_request(
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this method seems like it's only used in context_window_limit_for_request? I have a slight preference for inlining the method, feels like it'd be a bit easier to read.

Comment on lines +85 to +88
pub fn with_formatted_title(mut self, formatted_title: FormattedTextInline) -> Self {
self.formatted_title = Some(formatted_title);
self
}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the API is a bit awkward here since to construct a WarningBoxConfig with a formatted title, you have to go through the new constructor and pass in a non-formatted title that will get overwritten.

we have a couple other UI components where the label can be either text or some custom element. a good example of this is RadioButtonItem. it's a bit cleaner since we can use an enum to encode the "text or custom element" constraint in code.


selected_limit.map(|limit| limit.clamp(llm.context_window.min, llm.context_window.max))
}
pub(crate) fn has_effective_configurable_context_window(
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: what does it mean semantically to have an "effective" configurable context window. could we name the method has_effective_configurable_context_window?

use crate::server::sync_queue::QueueItem;
use crate::settings::AISettings;
use crate::workspaces::user_workspaces::UserWorkspaces;
pub const LONG_CONTEXT_WARNING_THRESHOLD: u32 = 272_000;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can do this in another PR, but I think we should pass the threshold value down from the server. we might need to modify the graphql/protos to pass that info alongside the rest of the context window values.

having this be server driven makes it easier to support new models with different thresholds in the future. imagine OpenAI releases a new model with a different threshold value. we want to avoid needing an out-of-band client release just to support the new model.

can we specify in a comment that this threshold is only relevant for GPT 5.4 + 5.5 for now, and wire the threshold value down from the server in a follow-up PR?

Comment on lines +874 to +875
/// Enables configurable expanded context windows for eligible GPT models.
GPTConfigurableContextWindow,
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we can already control the rollout via the server, and avoiding the client flag seems like it would make some of the client code simpler.

not a huge deal though, up to you!

@danielpeng2
Copy link
Copy Markdown
Member

can we make the warning in the settings appear as you drag, not only after you release the slider

Can you send me the conversation ID? I can take a look. It's working on my end with the server changes https://www.loom.com/share/491850b7038241519ff7a45612af5e5c

I tried again on your updated server branch, and it worked fine. Might've had something weird in my local setup, ignore that error!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants