feat: configurable ingest-time privacy controls for hook payloads (#148) #166
linkvapeluckyman wants to merge 3 commits into
…ok payloads

Implements the MVP for issue hoangsonww#148:

- New lib/privacy.js sanitizer applied at every event-insert site in routes/hooks.js BEFORE persistence and WebSocket broadcast: six built-in detectors (secret-named keys, Bearer tokens, common API-key formats, private-key blocks, opt-in email addresses and home-directory paths) plus up to 100 custom key/value regex rules with mask / hash / drop_field / drop_event_payload actions
- Conservative default policy: obvious secrets masked out of the box; hash action uses stable truncated SHA-256 so values stay correlatable
- Redacted events carry a _privacy counters block (rules applied, masked / hashed / dropped) without exposing originals; clean payloads are stored byte-identical so analytics, filters, and cost views are unaffected
- Fail-safe: sanitizer errors degrade to a metadata-only stub (never raw data) and never fail hook ingestion; invalid rules are rejected at save
- REST API: GET/PUT /api/privacy + POST /api/privacy/preview (non-persisting before/after, supports draft policies); documented in OpenAPI/Swagger
- Policy persisted in a new additive app_settings key/value table and included in GET /api/settings/export
- Settings panel: master toggle, per-detector toggles, rule CRUD, live sample preview, explicit import/reimport warning; localized en/zh/vi
- 14 server tests: policy validation, nested payload masking, key formats, summary sanitization on ingest + response, hash stability, drop actions, disabled passthrough, opt-in detectors, large payloads, preview isolation

Closes hoangsonww#148

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
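For readers who have not opened the diff, the key-rule actions described in this commit message can be sketched roughly as follows. This is an illustration only, not the code in lib/privacy.js: the function names (`hashValue`, `sanitize`), the rule shape (`name`, `keyPattern`, `action`), the 12-character truncation, and the `_privacy` field names are all assumptions made for the example.

```javascript
const crypto = require("node:crypto");

// Hypothetical helper: a stable, truncated SHA-256 token. Equal inputs always
// produce equal tokens, so hashed values stay correlatable across events.
function hashValue(value, length = 12) {
  const text = typeof value === "string" ? value : JSON.stringify(value);
  const digest = crypto.createHash("sha256").update(text).digest("hex");
  return `sha256:${digest.slice(0, length)}`;
}

// Hypothetical sanitizer for key-match rules. Assumes the payload is a plain
// JSON object and each rule looks like { name, keyPattern, action }.
function sanitize(payload, rules) {
  const counters = { rules_applied: [], masked: 0, hashed: 0, dropped: 0 };

  function walk(node) {
    if (Array.isArray(node)) return node.map(walk);
    if (node === null || typeof node !== "object") return node;
    const out = {};
    for (const [key, value] of Object.entries(node)) {
      const rule = rules.find((r) => r.keyPattern.test(key));
      if (!rule) {
        out[key] = walk(value); // no rule matched: recurse into nested data
        continue;
      }
      if (!counters.rules_applied.includes(rule.name)) {
        counters.rules_applied.push(rule.name);
      }
      if (rule.action === "mask") {
        out[key] = `[REDACTED:${rule.name}]`;
        counters.masked += 1;
      } else if (rule.action === "hash") {
        out[key] = hashValue(value);
        counters.hashed += 1;
      } else if (rule.action === "drop_field") {
        counters.dropped += 1; // key is omitted from the output entirely
      }
    }
    return out;
  }

  const cleaned = walk(payload);
  const touched = counters.masked + counters.hashed + counters.dropped > 0;
  // Clean payloads are returned untouched; only redacted ones get counters.
  return touched ? { ...cleaned, _privacy: counters } : payload;
}
```

Returning the original object when nothing matched mirrors the "clean payloads are stored byte-identical" property; the real implementation may achieve that differently.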
Code Review
This pull request introduces ingest-time privacy controls to redact, hash, or drop sensitive data from hook payloads and summaries before they are stored in SQLite or broadcast over WebSockets. It adds a new app_settings table, a server-side sanitizer with built-in and custom regex detectors, management and preview API endpoints, and a PrivacyControls settings UI. Feedback on these changes suggests slicing and sanitizing the prefix of oversized strings rather than skipping them entirely, compiling custom value rules with case-insensitivity (gi) to prevent under-redaction, and generating temporary client-side IDs for new rules to ensure stable React rendering keys.
fix: address PR #166 review feedback

- lib/privacy: oversized strings (beyond the scan cap) are now masked
  wholesale instead of skipped: a string must never bypass value scanning.
  The cap itself is raised to 2 MB, above the 1 MB express body limit, so
  every string a real hook payload can carry is fully scanned
- lib/privacy: custom value rules compile with gi so case-variants of a
  pattern (Password vs password) cannot slip past redaction
- client(PrivacyControls): new rules get a client-side id so unsaved rules
  have stable React keys across delete/reorder
- tests: new oversize-string masking case; drop_event_payload case now
  proves case-insensitive matching (15 privacy tests total)

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
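The two lib/privacy changes in this commit could look something like the sketch below. It is a guess at the shape of the fix, not the committed code: `compileValueRule`, `maskString`, the `[REDACTED:oversize]` token, and measuring the cap in string length rather than bytes are assumptions for illustration.

```javascript
// Assumed cap, taken from the commit message: 2 MB, above the 1 MB body limit.
// Measured here in UTF-16 code units for simplicity (an assumption).
const MAX_SCAN_CHARS = 2 * 1024 * 1024;

// Hypothetical: compile a user-supplied value rule. The "g" flag replaces every
// occurrence, and "i" makes Password / PASSWORD / password all match.
function compileValueRule(rule) {
  return { name: rule.name, regex: new RegExp(rule.pattern, "gi") };
}

// Hypothetical: mask one string value. Strings above the cap are never passed
// through unscanned; they are replaced wholesale instead.
function maskString(value, compiledRules, cap = MAX_SCAN_CHARS) {
  if (value.length > cap) return "[REDACTED:oversize]";
  let out = value;
  for (const { name, regex } of compiledRules) {
    // A function replacer avoids "$" in a rule name being read as a pattern.
    out = out.replace(regex, () => `[REDACTED:${name}]`);
  }
  return out;
}

const demoRules = [compileValueRule({ name: "pw", pattern: "password=\\S+" })];
console.log(maskString("Password=hunter2 and password=x", demoRules));
// prints "[REDACTED:pw] and [REDACTED:pw]"
```

One detail worth checking in the real code: a regex compiled with `g` keeps state in `lastIndex` when used with `.test()` or `.exec()`, so a shared compiled rule is safest when applied through `replace`, as above.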
…rols

# Conflicts:
#	ARCHITECTURE.md
Summary
Implements the MVP for #148 — a configurable privacy policy that redacts, hashes, or
drops sensitive data from hook payloads before they reach SQLite or WebSocket clients.
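As a rough picture of the ordering and fail-safe behaviour this PR describes (sanitize first, then persist and broadcast the same sanitized event, and fall back to a metadata-only stub if the sanitizer throws), consider the sketch below. Every name in it is hypothetical: `safeSanitize`, `ingest`, the `session_id` / `event_type` metadata fields, and the stub's `_privacy` contents are not taken from routes/hooks.js.

```javascript
// Hypothetical wrapper: never throws and never returns raw data on failure.
function safeSanitize(event, sanitizeFn) {
  try {
    return { ...event, payload: sanitizeFn(event.payload) };
  } catch (err) {
    // Degrade to a metadata-only stub; the original payload is discarded.
    return {
      session_id: event.session_id,
      event_type: event.event_type,
      payload: { _privacy: { error: "sanitizer_failed", metadata_only: true } },
    };
  }
}

// Hypothetical ingest step: the same sanitized object goes to both sinks, so
// neither the database nor WebSocket clients ever see the raw event.
function ingest(event, { sanitizeFn, persist, broadcast }) {
  const safe = safeSanitize(event, sanitizeFn);
  persist(safe);
  broadcast(safe);
  return safe;
}
```

The point of routing both sinks through one sanitized value is that a later code path cannot accidentally broadcast the unsanitized event.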
Issue #148 acceptance criteria
- … routes/hooks.js; summaries are sanitized before both persist and broadcast)
- … _privacy counters block is stamped only when something was redacted; clean payloads are stored byte-identical, so dashboards, filters, analytics, and cost views are unaffected
- … POST /api/privacy/preview, also accepts a draft policy so the UI can preview unsaved edits)

Built-in detectors (conservative default: secrets masked out of the box)
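- Secret-named keys (token/secret/password/api_key/auth/credential; same regex family as the Config Explorer redaction)
- Bearer tokens inside strings
- Common API-key formats (sk-ant-…, sk-…, ghp_…, github_pat_…, AKIA…, xox…, AIza…)
- Private-key blocks (-----BEGIN … PRIVATE KEY-----)
- Email addresses: opt-in (off by default)
- Home-directory paths (/Users/…, /home/…, C:\Users\…): opt-in (off by default)

Rule actions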
- mask → [REDACTED:<rule>]
- hash → stable truncated SHA-256 (sha256:abc…), so values stay correlatable across events
- drop_field → key removed entirely (key-match rules only)
- drop_event_payload → payload reduced to a metadata-only stub (covers the issue's preserve_metadata_only semantics)

Safety properties
- … app_settings key/value table; policy travels with GET /api/settings/export

Deferred (per issue non-goals / open questions)
Verification
- npm run test:server: 275 pass, 0 fail (14 new privacy tests)
- npm run test:client: 198 pass
- npm run build: tsc + vite clean; Prettier clean