Sentry: TAURI-RUST-HXF (project tauri-rust)
Events: 2232 · Users affected: 1 · First seen: 2026-06-27 · Last seen: 2026-07-02 (active)
Symptom
groq API error (413 Payload Too Large): {"error":{"message":"Request too large for model openai/gpt-oss-120b in organization ... service tier on_demand on tokens per minute (TPM): Limit 8000, Requested 42084 ..."}}
Root cause
The subconscious background agent ticks every 5-30 min. This user configured its provider as groq gpt-oss-120b on the on_demand (free) tier, whose tokens-per-minute (TPM) cap is 8000. A subconscious turn builds ~42k tokens of context (42084 requested in the event above), roughly 5x the per-minute cap, so groq rejects every call with a 413. This is a rate cap, not a context-window limit, so context-trimming cannot make the request fit.
Mis-classified as an unexpected crash: a raw direct-provider 413/TPM rejection reflects the user's account rate tier (no lever on our side), but it isn't matched by any user-state/transient classifier, so it pages as an unexpected error.
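To make the arithmetic concrete, here is a minimal sketch that pulls the two numbers out of the rejection message and checks whether a single request is already larger than the whole per-minute budget. The names `TpmRejection`, `number_after` and `parse_tpm_rejection` are hypothetical and are not taken from the codebase; the only assumption about the message is the `Limit N, Requested M` wording visible in the event above.

```rust
/// Token-per-minute figures pulled from a provider rejection message.
/// Hypothetical type, for illustration only.
#[derive(Debug, PartialEq)]
struct TpmRejection {
    limit: u64,
    requested: u64,
}

/// Read the first run of ASCII digits that follows `label` in `msg`.
fn number_after(msg: &str, label: &str) -> Option<u64> {
    let start = msg.find(label)? + label.len();
    let rest = msg[start..].trim_start();
    let digits: String = rest.chars().take_while(|c| c.is_ascii_digit()).collect();
    // An empty digit run fails to parse and yields None.
    digits.parse().ok()
}

/// Extract the cap and the request size, if both are present.
fn parse_tpm_rejection(msg: &str) -> Option<TpmRejection> {
    Some(TpmRejection {
        limit: number_after(msg, "Limit")?,
        requested: number_after(msg, "Requested")?,
    })
}

impl TpmRejection {
    /// True when one request alone is larger than the per-minute budget,
    /// so waiting for the rate window to reset cannot help.
    fn exceeds_cap_on_its_own(&self) -> bool {
        self.requested > self.limit
    }
}
```

With the event's figures (limit 8000, requested 42084) `exceeds_cap_on_its_own` is true, which is why retrying on the next tick with the same provider/model config is expected to fail again.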
Fix
Halt-on-first (RCA): subconscious loop recognizes a permanent provider-config rejection and stops re-running while that same provider/model config is set; auto-recovers on config change.
Classify + demote (defense-in-depth): a single typed matcher (is_provider_rate_cap_exceeded) recognizes a direct-provider "413 / tokens per minute (TPM)" rejection as user-config; demotes it from a paging crash to expected. Managed-backend 413 guard-leaks still page (unchanged).
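The two fixes can be sketched together as follows. Only the name `is_provider_rate_cap_exceeded` comes from this PR; its signature and body here are assumptions, as are `ProviderError`, its `via_managed_backend` flag, `ProviderConfig` and `HaltGate`. The sketch shows the intended shape, not the actual implementation: the matcher accepts only direct-provider 413/TPM rejections, and the gate skips ticks while the rejected provider/model config is still set.

```rust
/// Minimal stand-in for a provider error; the real error type is not shown in the PR.
struct ProviderError {
    status: u16,
    /// Assumed flag: true when the call went through the managed backend
    /// rather than directly to the user's own provider account.
    via_managed_backend: bool,
    message: String,
}

/// Sketch of the typed matcher named in the fix (signature and body assumed).
/// Managed-backend 413s are deliberately not matched, so they keep paging.
fn is_provider_rate_cap_exceeded(err: &ProviderError) -> bool {
    let msg = err.message.to_ascii_lowercase();
    err.status == 413
        && !err.via_managed_backend
        && (msg.contains("tokens per minute") || msg.contains("(tpm)"))
}

/// The provider/model pair a halt is tied to.
#[derive(Clone, Debug, PartialEq, Eq)]
struct ProviderConfig {
    provider: String,
    model: String,
}

/// Halt-on-first gate: remembers the config that was permanently rejected.
#[derive(Default)]
struct HaltGate {
    halted_for: Option<ProviderConfig>,
}

impl HaltGate {
    /// Returns true if the tick should run. A config change clears the halt.
    fn should_run(&mut self, current: &ProviderConfig) -> bool {
        if self.halted_for.as_ref() == Some(current) {
            return false;
        }
        self.halted_for = None;
        true
    }

    /// Record a failed tick; only a rate-cap rejection halts the loop.
    fn record_failure(&mut self, current: &ProviderConfig, err: &ProviderError) {
        if is_provider_rate_cap_exceeded(err) {
            self.halted_for = Some(current.clone());
        }
    }
}
```

A tick would call `should_run` before building context and `record_failure` when the provider call fails. Keying the halt on the config rather than on a timer is what gives auto-recovery: as soon as the user changes provider or model, the comparison fails and the loop resumes.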
Bug shape
Async-without-contracts / permanent-rejection re-reported per tick (cron-billing-flood family, #3913).
The rejection recurs on every tick, and the provider_chat boundary re-reports it to Sentry each time (2232 events) while also burning the user's groq quota. Same shape as the cron billing-loop flood fixed in "fix(cron): stop cron billing-state Sentry floods - 402 credits + 400 budget (TAURI-RUST-514 / -BMW)" #3913.
Tags
domain=agent, provider=subconscious, model=openai/gpt-oss-120b, operation=provider_chat, os=windows, release=openhuman@0.58.0
Reproduces on
gpt-oss-120b free/on_demand tier (8000 TPM)