feat: content-hash conversation identity (server-only hashing)#64
Merged
Conversation
basilebong
added a commit
that referenced
this pull request
May 21, 2026
Lifts the sequence diagram from PR #64 into the integration guide as a top-of-doc "Request flow at a glance" section. Also fixes the residual `conversation_id`/`conversation_version` references in architecture.md and integrate.md that survived the content-hash merge. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
6 tasks
Replaces dev-set conversation IDs with a SHA-256 over the canonical-JSON
encoding of the turn array (including sha256 of RecordedAudio WAV bytes).
The dev sets `name` as a free-form display label; identity is the hash.
Wire: POST /v1/replays now carries {name, turns, modality, run_config?}.
The server recomputes the hash, upserts the conversation row by hash
(last-write-wins on name), and creates the replay. POST /v1/conversations
is removed. GET routes use :hash instead of :id; the ?version query is
gone. Replays reference conversations via conversation_hash FK.
SDK: Conversation(name=, turns=). compute_hash is a lazy cached_property.
RecordedAudio bytes are sha256'd at first hash access; the cache keys on
(path, mtime_ns, size). The bind/baggage/JWT-attribute pipeline carries
conversation_hash in place of (conversation_id, conversation_version).
A canonical-JSON parity fixture (tests/fixtures/hash-parity.json) pins
the byte-for-byte contract between the Python SDK and the TS server.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…stale APIs
- replays.service: wrap `ensureConversation` and replay/meta inserts in a
single transaction so a failed insert can't leave an orphan conversation
row polluting `last_run_at` ordering. `ensureConversation` now takes
`StoreDbOrTx` so it composes from both contexts.
- Conversation: freeze the dataclass and drop `functools.cached_property`
so editing a WAV (mtime changes) or mutating `turns` in place is
reflected on the next `.hash` access. Disk-heavy work stays memoized in
`_AUDIO_SHA256_CACHE` keyed by `(path, mtime_ns, size)`.
- `_sha256_file`: wrap `OSError` in `AudioMissingError` so the SDK's
typed-errors contract holds at the call site.
- orchestrator: wrap `httpx.HTTPStatusError` on `POST /v1/replays` as a
new typed `XrayServerError` — the failure happens before the replay
row exists, so it can't flow through the `failure_reason` PATCH path.
- Delete dead `judge` plumbing from the orchestrator (`JudgePatchBody`,
`_judge_to_wire`, `judge` field on `RunResult`, the unreachable PATCH
branch). `Conversation.judge` is still accepted and ignored.
- Expand `tests/fixtures/hash-parity.json` from one shape to a 9-case
vector: ASCII, unicode + emoji surrogate pair, control chars,
U+2028/U+2029, U+007F DEL, empty text, recorded audio, TTS with/without
voice_id. Both Python and TS parity tests iterate it.
- Move `shortHash` / `HASH_PREFIX_LEN` from `src/client/lib/format.ts`
into the existing `src/client/format.ts`; delete the orphan `lib/`
helper; add co-located tests.
- Drop stale SDK examples folder and clean stale APIs from the SDK
README (`id=`, `title=`, `expect_agent_turn`, `xray.trace`,
`POST /v1/conversations`) — a working example will land in a
follow-up PR alongside a dev LiveKit instance.
- Fix `makeConversationTurn` so an `{role:"agent"}` override no longer
carries the user-turn default `text:"hello"`.
- Fix stale `(id, version)` PK comment on `ConversationRow`.
- New tests: `XrayServerError` on POST failure, audio cap fires locally.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…terminal status, numeric guard Cleans up the open review threads on the content-hash branch: - LiveKitDriver → LiveKitRuntime everywhere (README, docs, exports, tests, error messages). The class name in the quickstart now matches the implementation, so a copy-paste import resolves. - README's wiring section now describes JWT participant attributes instead of the stale "room metadata" mechanism, and the OpenAI TTS cache description matches the actual (text, voice, model) fingerprint layout. - Drop `path` from the RecordedAudio wire payload. The sha256 is the full identity; including the local filesystem path made the conversation hash machine-local (same checked-in spec produced different hashes on Alice's vs Bob's box). Mirrors in the TS schema, parity fixture, and the SDK's _audio_to_wire — and locked in by a new "same bytes, different path ⇒ same hash" test. - Replay terminal-status guard now covers `completed` too, not just `failed`. A "rescue" PATCH that flips a completed run back to running silently rewrote the outcome; now both raise ReplayStatusTransitionError. Sibling test mirrors the existing `failed` coverage. - Bound the orchestrator's XrayServerError message at e.response.text[:500] — matches the OTLP exporter's truncation and stops a 5xx HTML error page from dumping into the dev's stdout. - corrupt turns_json in the conversations store now logs a warn line with the conversation hash and the underlying error/issues before returning []. Two tests pin both the parse-failure and schema-failure branches. - Canonical encoder on both sides now rejects numeric values. JSON.stringify(1.0) is "1"; Python's json.dumps(1.0) is "1.0" — the hashes would silently diverge across languages the moment a numeric field landed in a turn. TS throws in canonicalStringify; the Python side scans the encoded output for unquoted numeric tokens (cheaper and less type-narrow-fighty than a typed tree walk). Booleans stay allowed (they roundtrip identically) and digits inside strings stay allowed (regression guard test for the scanner). - ruff format/check fixes on test_real_server.py, test_orchestrator.py, test_conversation.py and runtime/livekit.py — what tripped CI. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
POST /v1/replays becomes multipart/form-data: a `spec` JSON part
carries the Conversation (`name`, `turns`, `modality`, `run_config`),
and one named file part per `RecordedAudio` turn carries the raw WAV
bytes. The server reads each file part, sha256s the bytes, stores a
content-addressed copy under `<audioRoot>/recorded/<sha256>.wav`
(idempotent — same bytes ⇒ same file, written once via `flag: "wx"`),
substitutes the sha256 into the canonical turn JSON, then hashes the
canonical JSON to derive `conversation_hash`.
Why: the SDK shouldn't be a hash authority. Cross-language canonical
encoder parity (the hash-parity.json vector) was a brittle wire
contract that broke the moment either side's encoding drifted. With
the server as the sole hash authority, drift becomes impossible by
construction. RecordedAudio bytes are also now server-resident, so a
future PR can serve them back to the inspector for per-turn playback
without the dev needing to keep their local WAVs around.
SDK changes (sdk/python/):
- delete `Conversation.hash`, `_canonical_turns_json`,
`_hash_turns_wire`, `_reject_numeric_tokens`, `_sha256_file`,
`_AUDIO_SHA256_CACHE`
- rename `to_replay_create_payload` → `to_replay_spec_payload`;
add `recorded_audio_uploads()` yielding `(upload_key, path)` pairs
- orchestrator POSTs multipart via httpx `files=`, opening file
handles inside a `contextlib.ExitStack`
- `_ReplayCreateResponse` reads `conversation_hash` from the server
Server changes (src/server/):
- new request-form turn schemas with `{kind: "recorded", upload_key}`;
canonical/stored schemas keep `{kind: "recorded", sha256}`
- `materializeRequestTurns` walks request turns, hashes bytes,
substitutes sha256, validates no orphan or missing upload_keys
- one typed error `RecordedAudioUploadKeyError` with
`reason: "missing" | "unreferenced"`; 400 either way
- `saveRecordedConversationAudio` writes content-addressed copies in
parallel via `Promise.all`, swallowing `EEXIST` (same-bytes idempotency)
- replays router accepts multipart, caps spec at 256 KB and the whole
body at 512 MB; audio bytes per part still capped at MAX_AUDIO_BYTES
Tests:
- delete tests/fixtures/hash-parity.json — cross-language parity vector
has no consumers now
- new `createReplayForTest` helper supplies a session-wide temp audio
root (cleaned up on process exit) and an empty audio-parts map
- new server tests: audio-bytes sha256 substitution; content-addressed
on-disk write; 400 on missing / unreferenced upload_key
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
96707f2 to
d8d2123
Compare
Collaborator
Author
|
This change is part of the following stack: Change managed by git-spice. |
A `final=True` transcription that arrives after one agent turn's audio ends — e.g. a plugin's delayed `conversation_item_added` — would sit in the shared queue and satisfy the next agent turn's `final_seen` on entry, ending it before any new audio is captured. Drop pre-turn segments before installing the drainer. Regression test: test_runtime_drains_stale_transcripts_between_agent_turns
…l hashing - conversations.service: materializeOneTurn dispatch via ts-pattern; pre-hash audio bytes via Promise.all and return PendingAudioWrite[] so the router iterates pairs directly. ensureConversation switched to options object. listConversations projects explicit columns (drops turnsJson read). - conversations.errors: add MissingSpecPartError (extends MalformedConversationBodyError); reuse ConversationHashSchema for the canonical recorded sha256 field. - replays.service: ConversationHashNotFoundError dropped — throw the canonical ConversationNotFoundError from conversations.errors. enqueueAnalysis now throws ReplayNotFoundError (404) when the row vanished between claim and check, replacing the synthetic "unknown" lifecycle path. buildReplayDetail accepts a pre-fetched ReplayRow so create/update/get/compare drop one redundant SELECT each. - replays.errors + audio.errors: ReplayNotReadyForAnalysisError.currentState and ReplayUploadStateError.currentState typed as ReplayLifecycleState. - audio.types: compile-time exhaustiveness check on ALL_CONTENT_TYPES vs AudioContentType so a future content-type addition fails to compile without the corresponding picklist entry. - test-utils: replays.test-utils routes conversation seeding through the conversations slice's seedConversation; conversations.router.test reuses makeTempAudioRoot. - errors tests: add coverage for MissingSpecPartError, InvalidConversationRequestError, MalformedConversationBodyError, ConversationBodyTooLargeError per errors.md §5. - sdk/python orchestrator: collapse the two _read_*_response helpers into a generic _read_response[T]; extract _raise_for_status_typed for the two HTTPStatusError → XrayServerError wraps. Wire-visible changes: - POST /v1/conversations missing-spec error now returns issues[0].type = "multipart_part" (was "json_body") with a clearer message. - POST /v1/replays/:id/analyze returns 404 replay_not_found (was 409 replay_not_ready_for_analysis with current_state: "unknown") when the row is deleted between the claim and the post-claim check. Verify: pnpm typecheck && pnpm check && pnpm test (252 pass); cd sdk/python && uv run pyright && uv run pytest (42 pass).
…9 PATCH, router tests
- scripts/seed.ts: restore the /v1/conversations multipart upsert step
(broken since the multipart refactor) so /v1/replays has a hash to
reference. Send /v1/replays as JSON {conversation_hash, run_config},
not FormData. PATCH uses lifecycle_state + a valid failure_reason
picklist value.
- orchestrator.py: wrap _evaluate_judge in try/except so a dev-authored
judge raising can't strand the replay row pre-PATCH. Final PATCH
tolerates 409 — when SSE wait drops out before the worker emits the
terminal event, the server has already settled the lifecycle and its
truth wins. Non-409 errors go through _raise_for_status_typed so the
dev sees XrayServerError instead of a raw httpx exception.
- audio.service: saveRecordedConversationAudio now writes to a per-call
.tmp-<uuid> file and rename(2)s atomically. The previous
writeFile(flag:"wx") + EEXIST-as-success strategy returned success
to the loser of a concurrent identical upload while the winner was
still streaming bytes — partial-content window for any reader.
- conversations.router: count spec bytes via Buffer.byteLength(...,"utf8")
instead of String.length. UTF-16 code units undercount multi-byte text
by up to 4x against the BYTES-named cap; outer multipart limit kept
the worst case bounded but the spec cap silently lied.
- conversations.router.test: cover POST /v1/conversations end-to-end
(happy path text-only, recorded audio, idempotent hash, 400 cases
missing/malformed/schema-invalid spec + upload_key missing/unreferenced,
413 oversize, UTF-8 vs UTF-16 byte counting). afterEach cleans up
temp audio root + store handles across the file.
Stale references from the content-hash rename were still present in the narrative docs and a few code comments; the new 409-tolerance branch on the final PATCH lacked test coverage after the SSE tests were removed. - docs/architecture.md: drop VersionFingerprintMismatchError mentions, rewrite the POST /v1/conversations item to reflect the multipart + server-hashing flow, update the ER diagram (hash/name/last_run_at), update inspector endpoints to use :hash. - docs/integrate.md: OTEL baggage list uses xray.conversation.hash instead of the removed .id/.version keys. - conversations.service.ts: ensureConversation docstring no longer claims last_run_at is denormalized from MAX(replays.started_at) — it's set on every POST /v1/conversations, which is what the code does. - replays.errors.ts: trim ReplayLifecycleTransitionError doc to what the error means; SDK-side editorial belongs in the PR description. - orchestrator.py: rewrite the top-of-file step list to match the actual 10-step flow, renumber the inline step comments sequentially (was 1, 2, 2, 2b, 3, 4, 5b, 5c, 5, 6, 7, 8), update the cross-reference in the 409-tolerance comment. - test_orchestrator.py: add tests for the PATCH-409 tolerance branch and its non-409-still-raises counterpart.
LukasPoque
previously approved these changes
May 22, 2026
Member
LukasPoque
left a comment
There was a problem hiding this comment.
Fixed all review findings and committed them, @basilebong maybe you can make a review and then its fine to merge IMHO
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Conversation identity moves to a server-computed content hash of the canonical turn JSON (with sha256 of each
RecordedAudio's WAV bytes substituted in). The dev sets a free-formname; renaming doesn't change identity, editing a turn or a WAV does. The SDK does zero hashing — no cross-language parity fixture to keep in sync.This branch also carries main's #65 (server-as-analyzer: stereo WAV upload + VAD + turn derivation + SSE progress). The two surfaces are independent.
API request flow
sequenceDiagram autonumber participant SDK as Python SDK participant Xray as xray server participant Agent as Voice agent participant Worker as bunqueue worker SDK->>Xray: POST /v1/conversations multipart spec + audio parts Note over Xray: hash = sha256(canonical turns), upsert by hash Xray-->>SDK: 200 hash SDK->>Xray: POST /v1/replays JSON conversation_hash Xray-->>SDK: 201 id, lifecycle_state pending SDK->>Agent: runtime.run(conversation) Agent->>Xray: POST /v1/otlp/v1/traces xray.replay.id Agent-->>SDK: AgentResponses SDK->>Xray: POST /v1/replays/:id/audio stereo WAV SDK->>Xray: POST /v1/replays/:id/analyze Xray->>Worker: enqueue Xray-->>SDK: 202 job_id SDK->>Xray: GET /v1/replays/:id/events SSE Worker->>Xray: VAD, speech_segments, replay_turns Xray-->>SDK: state / progress / completed SDK->>Xray: GET /v1/replays/:id Xray-->>SDK: turns + tool_calls + model_usage + spans SDK->>Xray: PATCH /v1/replays/:id lifecycle_state Note over SDK,Xray: 409 tolerated — server-owns-lifecycle Xray-->>SDK: 200What changed for callers
POST /v1/conversations—multipart/form-data:specJSON ({name, turns}, ≤256 KB measured in UTF-8 bytes) plus one file part perRecordedAudioturn, keyed by the turn's declaredupload_key(any[A-Za-z0-9_.-]string, ≤50 MB each). Server hashes, returns the full conversation row ({hash, name, created_at, last_run_at, turns}).POST /v1/replays— JSON{conversation_hash, run_config?}, returns the replay row atlifecycle_state: "pending".conversationsPK ishash;replays.conversation_hashFKs into it.<XRAY_AUDIO_ROOT>/recorded/<sha256>.wav(deduplicated across conversations, written via tmp +rename(2)so concurrent identical uploads never expose a partial file); full-replay stereo mixdown at<XRAY_AUDIO_ROOT>/<replay_id>/replay.wav.Conversation(name=..., turns=...);xray.run(...)returnsRunResult.conversation_hash. Existing data wiped (pre-1.0).judgeraising no longer strands the replay row — orchestrator logs and proceeds to PATCH. The final PATCH tolerates 409, accepting the server's lifecycle when SSE wait drops before the terminal event.Atomicity
Audio files are written content-addressed (
recorded/<sha256>.wav) before the conversation upsert — a partial write + failed upsert leaves a harmless orphan that the next submission finds in place. Typed errors at every boundary:RecordedAudioUploadKeyError(missing/unreferenced upload_key),ConversationNotFoundError(404),ReplayLifecycleTransitionError(409 on terminal-state mutations).Verify