fix(subconscious): halt + demote on permanent provider rate-cap 413 (groq TPM flood) #4404

Description

@oxoxDev

Source

Sentry: TAURI-RUST-HXF (project tauri-rust)
Events: 2232 · Users affected: 1 · First seen: 2026-06-27 · Last seen: 2026-07-02 (active)

Symptom

groq API error (413 Payload Too Large): {"error":{"message":"Request too large for model openai/gpt-oss-120b in organization ... service tier on_demand on tokens per minute (TPM): Limit 8000, Requested 42084 ..."}}

Tags: domain=agent, provider=subconscious, model=openai/gpt-oss-120b, operation=provider_chat, os=windows, release=openhuman@0.58.0.

Root cause

The subconscious background agent ticks every 5-30 min. This user configured its provider as groq gpt-oss-120b on the on_demand (free) tier, whose tokens-per-minute (TPM) cap is 8000. A subconscious turn builds ~42k tokens of context (42084 requested in the reported event), roughly 5x the per-minute rate cap, so groq rejects every call with a 413. This is a rate cap, not a context-window limit, so context-trimming cannot make the request fit.
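The arithmetic behind that claim can be read straight off the error text. The following is a minimal sketch, assuming the message keeps the `Limit N, Requested M` shape shown under Symptom; `parse_tpm` and `is_permanently_over_cap` are hypothetical helpers for illustration, not code from the repository.

```rust
/// Hypothetical helper: extract the (limit, requested) TPM figures from a
/// groq-style error message such as "... Limit 8000, Requested 42084 ...".
/// Returns None if either figure is missing or not a number.
fn parse_tpm(msg: &str) -> Option<(u64, u64)> {
    // Find `key`, skip whitespace after it, then read the run of digits.
    let number_after = |key: &str| -> Option<u64> {
        let start = msg.find(key)? + key.len();
        let digits: String = msg[start..]
            .chars()
            .skip_while(|c| c.is_whitespace())
            .take_while(|c| c.is_ascii_digit())
            .collect();
        digits.parse().ok()
    };
    Some((number_after("Limit")?, number_after("Requested")?))
}

/// A single request larger than the whole per-minute budget cannot be
/// admitted by waiting and retrying the same request unchanged.
fn is_permanently_over_cap(limit: u64, requested: u64) -> bool {
    requested > limit
}
```

For the reported event this yields a limit of 8000 and a request of 42084, so retrying the identical request on the next tick cannot succeed.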

Two defects fall out:

  1. Per-tick re-report flood: the tick loop re-fires the identical, permanently doomed request every interval, and the provider_chat boundary re-reports it to Sentry each time (2232 events) while also burning the user's groq quota. Same shape as the cron billing-loop flood fixed in #3913 ("fix(cron): stop cron billing-state Sentry floods — 402 credits + 400 budget (TAURI-RUST-514 / -BMW)").
  2. Mis-classified as an unexpected crash: a raw direct-provider 413/TPM rejection reflects the user's account rate tier (no lever on our side), but it is not matched by any user-state/transient classifier, so it pages as an unexpected error.

Fix

  • Halt-on-first (RCA): the subconscious loop recognizes a permanent provider-config rejection and stops re-running while that same provider/model config is set; it auto-recovers on config change.
  • Classify + demote (defense-in-depth): a single typed matcher (is_provider_rate_cap_exceeded) recognizes a direct-provider "413 / tokens per minute (TPM)" rejection as user-config and demotes it from a paging crash to an expected error. Managed-backend 413 guard-leaks still page (unchanged).
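The two parts of the fix could fit together roughly as follows. This is a sketch under stated assumptions, not the actual patch: only the name `is_provider_rate_cap_exceeded` comes from this issue, while its signature, the matching rule, and the `ProviderConfig` and `HaltGate` types are hypothetical. The direct-provider versus managed-backend distinction is not modeled here.

```rust
/// Provider/model pair the subconscious agent is configured with.
/// Hypothetical type, for illustration only.
#[derive(Clone, Debug, PartialEq, Eq)]
struct ProviderConfig {
    provider: String,
    model: String,
}

/// Assumed signature: the issue names this matcher but does not show it.
/// Matches a 413 whose body cites the tokens-per-minute cap.
fn is_provider_rate_cap_exceeded(status: u16, body: &str) -> bool {
    status == 413 && body.contains("tokens per minute (TPM)")
}

/// Remembers the config that was permanently rejected, so the tick loop
/// skips work until the user changes provider or model.
#[derive(Default)]
struct HaltGate {
    halted_for: Option<ProviderConfig>,
}

impl HaltGate {
    /// True when a tick should run under `current`.
    /// Any config other than the halted one clears the halt.
    fn should_run(&mut self, current: &ProviderConfig) -> bool {
        if self.halted_for.as_ref() == Some(current) {
            return false;
        }
        self.halted_for = None;
        true
    }

    /// Record a failed call; returns true if the loop is now halted
    /// for `current`. Other errors leave the gate untouched.
    fn on_error(&mut self, current: &ProviderConfig, status: u16, body: &str) -> bool {
        if is_provider_rate_cap_exceeded(status, body) {
            self.halted_for = Some(current.clone());
            return true;
        }
        false
    }
}
```

In this shape the tick loop would call `should_run` before building context and `on_error` after a failed provider call, so the doomed request is sent and reported at most once per provider/model config.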

Bug shape

Async-without-contracts / permanent-rejection re-reported per tick (cron-billing-flood family, #3913).

Reproduces on

  • Branch: upstream/main
  • Config: subconscious provider = groq gpt-oss-120b free/on_demand tier (8000 TPM)
