
fix(tra-974): pin preview backend LOG_LEVEL=info to stop debug log firehose#157

Merged
mikestankavich merged 1 commit into
mainfrom
fix/tra-974-preview-log-level
Jun 10, 2026


@mikestankavich


Problem

Preview detects as EnvDev, so the Go backend logger defaults to debug (logger/config.go) and emits a per-message "ingest message processed" line at roughly 3/sec, 24/7. That is a GCP Cloud Logging ingest cost with no steady-state value, and it drowns out other sessions troubleshooting on preview. Decision: preview should be quiet by default and escalate to debug only during an active debugging window.

Root cause (found while implementing; a naive config.logLevel pin would have been a no-op)

The chart's config.logLevel only ever emitted BACKEND_LOG_LEVEL, which the current backend does not read. Proven against the live cluster: preview's ConfigMap already had BACKEND_LOG_LEVEL=info, yet the pod logged at debug — and an explicit LOG_LEVEL=info env (the stopgap kubectl set env) is what actually silenced it. So pinning config.logLevel would have shipped a silent no-op.
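
To make the precedence concrete, here is a minimal Go sketch of a level resolver that behaves the way the observations above suggest. It is not the contents of logger/config.go: the function names, the environment strings ("dev", "prod"), and the fallback for unknown environments are all assumptions made for illustration. The only behaviors taken from this PR are that LOG_LEVEL wins when set, that BACKEND_LOG_LEVEL is never read, and that the defaults are debug for dev and warn for prod.

```go
package main

import (
	"fmt"
	"strings"
)

// defaultLevelFor returns the fallback level for a detected environment.
// dev -> debug and prod -> warn follow the PR description; the environment
// strings and the "info" fallback for anything else are assumptions.
func defaultLevelFor(appEnv string) string {
	switch appEnv {
	case "dev":
		return "debug"
	case "prod":
		return "warn"
	default:
		return "info" // assumption: not stated in the PR
	}
}

// resolveLevel picks the effective level. Only LOG_LEVEL is consulted;
// BACKEND_LOG_LEVEL is never looked up, so setting it changes nothing.
func resolveLevel(getenv func(string) string) string {
	if v := strings.ToLower(strings.TrimSpace(getenv("LOG_LEVEL"))); v != "" {
		return v
	}
	return defaultLevelFor(getenv("APP_ENV"))
}

// envFrom adapts a map to the getenv signature so the sketch does not
// depend on the real process environment.
func envFrom(m map[string]string) func(string) string {
	return func(k string) string { return m[k] }
}

func main() {
	// Preview before the fix: only the dead key is set.
	before := envFrom(map[string]string{"APP_ENV": "dev", "BACKEND_LOG_LEVEL": "info"})
	// Preview after the fix: the key the backend actually reads is set.
	after := envFrom(map[string]string{"APP_ENV": "dev", "BACKEND_LOG_LEVEL": "info", "LOG_LEVEL": "info"})

	fmt.Println(resolveLevel(before)) // debug
	fmt.Println(resolveLevel(after))  // info
}
```

Under these assumptions the first case reproduces what was seen on the live cluster: BACKEND_LOG_LEVEL=info is present, yet the effective level stays at debug until LOG_LEVEL is set.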

Fix — preview-only, prod untouched

  • chart (helm/trakrf-backend): new optional config.runtimeLogLevel that emits a real LOG_LEVEL ConfigMap key only when set. Empty default → envs that omit it (prod) render no LOG_LEVEL key and keep their APP_ENV→warn default — ConfigMap byte-identical to today, no prod pod bounce.
  • root app (argocd/root): per-env logLevel: info for preview, appended to the config block via the template's existing no-env-conditional values idiom (same pattern as jwtExpirationSeconds).

The legacy/dead BACKEND_LOG_LEVEL key is left in place and flagged for a separate cleanup. The backend-side detection fix (preview gets its own env defaulting to info) is the platform-repo option (b), out of scope here.
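
The "emit the key only when set" behavior can be sketched with Go's text/template, which is the engine Helm templates are built on. The template text below is an illustrative stand-in, not the chart's actual ConfigMap template, and the `quote` helper is a local approximation of the Sprig function Helm provides. The values passed in are hypothetical.

```go
package main

import (
	"bytes"
	"fmt"
	"strconv"
	"text/template"
)

// configMapData is a stand-in for the data section of the chart's ConfigMap.
// LOG_LEVEL is wrapped in an if-block, so an empty runtimeLogLevel renders
// no key at all rather than an empty value.
const configMapData = `data:
  BACKEND_LOG_LEVEL: {{ .Values.config.logLevel | quote }}
{{- if .Values.config.runtimeLogLevel }}
  LOG_LEVEL: {{ .Values.config.runtimeLogLevel | quote }}
{{- end }}
`

// render executes the template with the given (hypothetical) chart values.
func render(logLevel, runtimeLogLevel string) (string, error) {
	funcs := template.FuncMap{
		// Local approximation of Helm's quote function.
		"quote": func(v any) string { return strconv.Quote(fmt.Sprint(v)) },
	}
	tmpl, err := template.New("configmap").Funcs(funcs).Parse(configMapData)
	if err != nil {
		return "", err
	}
	values := map[string]any{
		"Values": map[string]any{
			"config": map[string]any{
				"logLevel":        logLevel,
				"runtimeLogLevel": runtimeLogLevel, // empty string is the chart default
			},
		},
	}
	var buf bytes.Buffer
	if err := tmpl.Execute(&buf, values); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	// First the default (empty) value, then the preview-style override.
	for _, level := range []string{"", "info"} {
		out, err := render("info", level)
		if err != nil {
			panic(err)
		}
		fmt.Printf("runtimeLogLevel=%q renders:\n%s\n", level, out)
	}
}
```

Because the empty string is falsy in a template `if`, environments that never set the value render exactly what they rendered before the key existed, which is what keeps the prod ConfigMap unchanged.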

Verification (helm template)

# chart default (runtimeLogLevel empty)        → BACKEND_LOG_LEVEL only, no LOG_LEVEL
# chart --set config.runtimeLogLevel=info       → LOG_LEVEL: "info"
# root render (gke): trakrf-backend-preview      → inlineValues config.runtimeLogLevel: "info"
#                    trakrf-backend-prod          → (absent)

Deploy note

Root-template change → does not auto-sync. Requires scripts/apply-root-app.sh gke post-merge to materialize (re-renders all env child apps; only preview's ConfigMap actually changes, so only the preview backend pod bounces). The interim kubectl set env LOG_LEVEL=info override on preview keeps it quiet until then.

Tracking: TRA-974.

🤖 Generated with Claude Code

…rehose

Preview detects as EnvDev, so the Go backend logger defaults to debug
(logger/config.go) and emits a per-message "ingest message processed" line
~3/sec, 24/7 — a Cloud Logging ingest cost with no debugging value at
steady state, and it drowns other sessions troubleshooting on preview.

Root cause found while implementing: the chart's `config.logLevel` only ever
emitted BACKEND_LOG_LEVEL, which the current backend does NOT read — proven
live (preview's ConfigMap already had BACKEND_LOG_LEVEL=info yet the pod
logged debug; an explicit LOG_LEVEL=info env is what silenced it). So a naive
`config.logLevel` pin would have been a no-op.

Fix (preview-only, prod untouched):
- chart: new optional `config.runtimeLogLevel` that emits a real LOG_LEVEL
  ConfigMap key ONLY when set. Empty default → envs that omit it (prod) render
  no LOG_LEVEL key and keep their APP_ENV→warn default, byte-identical to today
  (no prod ConfigMap change, no prod pod bounce).
- root app: per-env `logLevel: info` for preview, appended to the config block
  via the existing no-env-conditional values idiom (like jwtExpirationSeconds).

The legacy/dead BACKEND_LOG_LEVEL key is left in place and flagged for a
separate cleanup (TRA-974). The backend-side env-detection fix (preview gets
its own env defaulting to info) is the platform-repo option (b), out of scope.

Verified via helm template: default render emits no LOG_LEVEL; preview app
inlineValues carry runtimeLogLevel=info → chart emits LOG_LEVEL="info"; prod
emits neither.

NOTE: root-template change — requires `scripts/apply-root-app.sh gke`
post-merge to materialize (root/templates/* don't auto-sync).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@mikestankavich mikestankavich merged commit 2f1e000 into main Jun 10, 2026
19 checks passed
@mikestankavich mikestankavich deleted the fix/tra-974-preview-log-level branch June 10, 2026 21:32