Skip to content

fix(gateway): fix cache key calculation#4077

Closed
MasterPtato wants to merge 1 commit into01-28-fix_remove_graceful_requirement_for_metrics_aggregatorfrom
01-30-fix_gateway_fix_cache_key_calculation
Closed

fix(gateway): fix cache key calculation#4077
MasterPtato wants to merge 1 commit into01-28-fix_remove_graceful_requirement_for_metrics_aggregatorfrom
01-30-fix_gateway_fix_cache_key_calculation

Conversation

@MasterPtato
Copy link
Contributor

No description provided.

Copy link
Contributor Author

Warning

This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.
Learn more


How to use the Graphite Merge Queue

Add the label merge-queue to this PR to add it to the merge queue.

You must have a Graphite account in order to use the merge queue. Sign up using this link.

An organization admin has enabled the Graphite Merge Queue in this repository.

Please do not merge from GitHub as this will restart CI on PRs being processed by the merge queue.

This stack of pull requests is managed by Graphite. Learn more about stacking.

@pkg-pr-new
Copy link

pkg-pr-new bot commented Jan 30, 2026

More templates

@rivetkit/cloudflare-workers

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/cloudflare-workers@4077

@rivetkit/db

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/db@4077

@rivetkit/framework-base

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/framework-base@4077

@rivetkit/next-js

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/next-js@4077

@rivetkit/react

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/react@4077

rivetkit

pnpm add https://pkg.pr.new/rivet-dev/rivet/rivetkit@4077

@rivetkit/sql-loader

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/sql-loader@4077

@rivetkit/virtual-websocket

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/virtual-websocket@4077

@rivetkit/engine-runner

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/engine-runner@4077

@rivetkit/engine-runner-protocol

pnpm add https://pkg.pr.new/rivet-dev/rivet/@rivetkit/engine-runner-protocol@4077

commit: 66df968

@MasterPtato MasterPtato force-pushed the 01-28-fix_remove_graceful_requirement_for_metrics_aggregator branch from c6e49a4 to 40593e6 Compare January 30, 2026 21:13
@MasterPtato MasterPtato force-pushed the 01-30-fix_gateway_fix_cache_key_calculation branch from 340ee20 to 143c296 Compare January 30, 2026 21:13
@claude
Copy link

claude bot commented Jan 30, 2026

PR Review: fix(gateway): fix cache key calculation

Summary

This PR refactors the cache key calculation logic to properly handle WebSocket connections by moving the WebSocket detection earlier in the request lifecycle. This ensures consistent cache key generation across both routing and caching paths.


Positive Changes

1. Improved Architecture & Code Organization

  • Consolidation: Merged cache/actor.rs into cache/pegboard_gateway.rs creates better cohesion with routing logic
  • Single responsibility: Both path-based and target-based cache key generation now live in the same module
  • DRY principle: Eliminated duplicate WebSocket detection logic across the codebase

2. Bug Fix: WebSocket Detection Timing
The core fix moves WebSocket detection earlier in the request lifecycle:

  • Before: WebSocket detection happened after RequestContext creation in proxy_service.rs:411
  • After: Detection at proxy_service.rs:382 and stored in RequestContext
  • Impact: Cache keys can now correctly differentiate WebSocket vs HTTP requests

3. Enhanced Testability

  • req_ctx.is_websocket() method provides clean abstraction for testing
  • Centralized WebSocket state eliminates timing-dependent bugs

Code Quality Observations

1. Duplicate Code Pattern

The target extraction logic is duplicated between cache/mod.rs:36-55 and routing/mod.rs:75-94. Both blocks parse Sec-WebSocket-Protocol and X-Rivet-Target identically. Consider extracting to a shared helper function.

2. Silent Error Handling

In cache/mod.rs:27-31 and cache/mod.rs:59-61, errors from cache key generation are silently ignored. Add debug logging when cache key generation fails to help diagnose issues.

3. Inconsistent Path Hashing

  • Path-based cache: hashes stripped_path (cache/pegboard_gateway.rs:30)
  • Target-based cache: hashes full req_ctx.path() (cache/pegboard_gateway.rs:95)

This creates different cache keys for the same actor accessed via different routing methods. Example:

  • Path-based: /gateway/actor-123/api/users → hashes /api/users
  • Header-based to /api/users → hashes /api/users
  • Header-based to /gateway/actor-123/api/users → hashes /gateway/actor-123/api/users (different!)

4. TODO Comments

Both cache functions have identical TODO comments. Consider documenting the decision and removing them.


Security Considerations

All Good - URL encoding handled consistently, no injection risks, proper header validation.


Performance Considerations

1. Redundant Protocol Parsing
In cache/pegboard_gateway.rs:57 and routing/pegboard_gateway.rs:97, the code allocates a Vec to parse protocols. Use iterators directly to avoid allocation.

2. Hash Function Choice
Using DefaultHasher is secure but slower. Consider ahash for 2-3x speedup if profiling shows this is a bottleneck.


Test Coverage

Missing test scenarios:

  1. WebSocket cache key consistency
  2. Path normalization
  3. Query string handling
  4. Fallback behavior

Recommendations Summary

High Priority:

  1. Add debug logging when cache key generation fails
  2. Resolve stripped_path vs full path inconsistency (potential bug)
  3. Extract duplicate target extraction logic

Medium Priority:
4. Add test coverage for WebSocket scenarios
5. Document/resolve query string TODO
6. Optimize protocol parsing


Verdict

Recommendation: Approve with minor improvements

The core bug fix is sound and improves the architecture. The identified issues are polish items that do not block merging, though addressing the cache key consistency question would prevent potential future bugs.

Great work on identifying and fixing the WebSocket detection timing issue!

@claude
Copy link

claude bot commented Jan 30, 2026

PR Review: fix(gateway): fix cache key calculation

Summary

This PR fixes cache key calculation in the gateway by ensuring WebSocket detection happens before RequestContext is created, and refactors the cache key logic to handle both path-based and header/protocol-based routing consistently.


Positive Changes

  1. Critical Bug Fix: Moving is_websocket detection before RequestContext creation (proxy_service.rs:383) ensures the context accurately reflects the connection type throughout its lifecycle.

  2. Better Code Organization: Consolidating actor-specific cache logic into cache/pegboard_gateway.rs improves separation of concerns. The old cache/actor.rs only handled header-based routing; the new module handles both path-based and header/protocol-based routing.

  3. Consistent WebSocket Handling: Using req_ctx.is_websocket() throughout the codebase (instead of checking headers inline) improves consistency and reduces duplication.

  4. Cache Key Parity: The refactor ensures path-based and header-based actor routing produce equivalent cache keys for the same actor+path+method combination, which is important for cache hit rates.


Issues and Concerns

1. Silent Error Swallowing in Cache Key Function (cache/mod.rs:27-31, 59-61)

Cache key generation errors are silently ignored and fall through to the next routing method. This masks legitimate errors (invalid actor IDs, malformed headers) that should fail fast. This creates debugging difficulty when cache keys are incorrectly generated and potential cache misses for valid requests.

Recommendation: Add structured logging at minimum, or consider making errors explicit for certain failure modes (e.g., invalid actor ID format should probably fail the request rather than fall back).

2. Inconsistent TODO Comments (cache/pegboard_gateway.rs:29, 94)

The TODOs ask if query should be excluded, while path-based routing uses stripped_path which includes the query string (routing/mod.rs:229-232). Cache key behavior for requests differing only in query params is unclear and potentially inconsistent.

Recommendation: Decide whether query params should affect cache keys and update the code to match, then remove or clarify the TODOs.

3. Unnecessary String Allocation (cache/pegboard_gateway.rs:71, 85)

Both branches convert to String only to immediately parse as Id (line 88). The intermediate string allocation is unnecessary and could be optimized.

4. Duplicate Protocol Parsing Logic

The WebSocket protocol parsing for actor ID extraction is duplicated between cache key generation (cache/pegboard_gateway.rs:46-67) and routing logic (routing/pegboard_gateway.rs:84-116). Consider extracting to a shared helper function to reduce code duplication and ensure consistency.


Minor Observations

  • Code style follows CLAUDE.md guidelines (complete sentences, lowercase comments, structured logging)
  • Imports are properly organized at file tops
  • Error handling uses the custom error derive system correctly
  • Tracing uses structured logging rather than formatted strings

Testing Recommendations

Consider adding tests for:

  1. Cache key consistency between path-based and header-based routing for the same actor/path/method
  2. WebSocket vs HTTP requests generating appropriate cache keys
  3. URL-encoded actor IDs in both path and protocol headers
  4. Query string handling in cache keys (once TODO is resolved)

Priority Recommendations

Must Fix: Issue 1 (silent error swallowing) - add logging at minimum

Should Fix: Issues 2 and 4 (query string TODO and duplicate parsing)

Nice to Have: Issue 3 (string allocation optimization)


Final Assessment

The PR successfully fixes the WebSocket detection timing issue and improves code organization. The core logic is sound, but error handling could be more explicit and some code duplication should be addressed. The changes are functional and safe to merge with the suggested improvements applied either now or in a follow-up PR.

@MasterPtato MasterPtato force-pushed the 01-28-fix_remove_graceful_requirement_for_metrics_aggregator branch from 40593e6 to 963ce72 Compare January 30, 2026 23:05
@MasterPtato MasterPtato force-pushed the 01-30-fix_gateway_fix_cache_key_calculation branch from 143c296 to 66df968 Compare January 30, 2026 23:05
@graphite-app
Copy link
Contributor

graphite-app bot commented Jan 30, 2026

Merge activity

  • Jan 30, 11:21 PM UTC: MasterPtato added this pull request to the Graphite merge queue.
  • Jan 30, 11:22 PM UTC: CI is running for this pull request on a draft pull request (#4080) due to your merge queue CI optimization settings.
  • Jan 30, 11:23 PM UTC: Merged by the Graphite merge queue via draft PR: #4080.

graphite-app bot pushed a commit that referenced this pull request Jan 30, 2026
@graphite-app graphite-app bot closed this Jan 30, 2026
@graphite-app graphite-app bot deleted the 01-30-fix_gateway_fix_cache_key_calculation branch January 30, 2026 23:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants

Comments