Enforce the memory read → dedupe → write → update-index protocol (#4116)#4388
Enforce the memory read → dedupe → write → update-index protocol (#4116)#4388M3gA-Mind wants to merge 4 commits into
Conversation
…sai#4116) Agents were told to follow a read-index -> dedupe -> write -> update-index cycle around durable memory but it was never enforced, so writes landed without a dedupe read (duplicating entries) and update_memory_md was skipped (leaving MEMORY.md drifting from the store). Add a MemoryProtocolTracker state machine and drive it from a tinyagents MemoryProtocolMiddleware: on each successful memory op it advances the cycle and, on a write, appends a corrective note to the tool result the model reads (flagging a missing dedupe read or a drifting index), then warns at run end when a write was never followed by update_memory_md. Same structured-correction-surfaced-to-the-model pattern as the unknown-tool recovery. Registered outermost so its guidance survives the after-tool byte cap.
|
Warning Review limit reachedYou’ve reached a temporary PR review limit under our Fair Usage Limits Policy. Next review available in: 31 minutes Enable usage-based reviews in Billing to review now. Otherwise, wait until the next included review is available. How can I continue?After more reviews become available, a review can be triggered using the To avoid repeated limits, reduce automatic review volume by pausing incremental auto-reviews earlier, using label-based review opt-in, excluding WIP or generated PR titles, or requesting reviews manually when the PR is ready. If your team needs uninterrupted high-volume reviews, an organization admin can enable usage-based reviews. How do review limits work?CodeRabbit enforces per-developer PR review limits for each organization. Most developers receive the normal plan review availability. For paid Pro and Pro+ PR reviews, CodeRabbit uses adaptive limits for sustained high-volume activity. When a developer's recent PR review activity reaches the 95th percentile or higher among CodeRabbit users, additional reviews become available more gradually as earlier reviews age out of the rolling window. Please refer docs for additional details. Review details⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (4)
📝 WalkthroughWalkthroughAdds a new ChangesMemory protocol enforcement
Estimated code review effort: 3 (Moderate) | ~25 minutes Sequence Diagram(s)sequenceDiagram
participant Agent
participant Middleware as MemoryProtocolMiddleware
participant Tracker as MemoryProtocolTracker
participant ToolResult
Agent->>Middleware: after_tool(memory write result)
Middleware->>Tracker: observe_tool(tool_name)
Tracker-->>Middleware: MemoryProtocolObservation
alt needs_guidance
Middleware->>ToolResult: append corrective guidance note
end
Agent->>Middleware: after_agent(run end)
Middleware->>Tracker: pending_index_update()
Tracker-->>Middleware: true/false
alt pending update
Middleware->>Agent: emit stale-index warning
end
Suggested labels: Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
Comment |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 1243ede870
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/openhuman/agent/harness/mod.rs`:
- Line 35: Move the new memory protocol wiring out of the flat root-level module
file and into a dedicated submodule directory. Update the `mod memory_protocol;`
declaration in `harness::mod` to point at the relocated module under
`memory_protocol/mod.rs`, and ensure any associated module files live within
that subdirectory rather than as a new top-level `memory_protocol.rs`.
In `@src/openhuman/tinyagents/mod.rs`:
- Around line 393-400: The memory-protocol enforcement is only added in
run_turn_via_tinyagents_shared, leaving run_turn_via_tinyagents without the same
harness middleware. Update run_turn_via_tinyagents to install
MemoryProtocolMiddleware on its harness too, matching the shared path so both
entrypoints enforce the read → dedupe → write → update-index flow consistently.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 93e32ce0-9d78-4d65-8543-9504bab66fb2
📒 Files selected for processing (4)
src/openhuman/agent/harness/memory_protocol.rssrc/openhuman/agent/harness/mod.rssrc/openhuman/tinyagents/middleware.rssrc/openhuman/tinyagents/mod.rs
…oints - Classify memory_tree(mode=ingest_document) as a Write and update_memory_md only as an index-update when file=MEMORY.md (a SKILL.md edit no longer masks a stale MEMORY.md index). Arguments are captured in the middleware's before_tool hook (keyed by call id) since the tool result carries none. - Install MemoryProtocolMiddleware on the run_turn_via_tinyagents entrypoint too, so both harness-build paths enforce the cycle consistently. - Extend unit + middleware tests for the SKILL.md and memory_tree-ingest cases.
Review feedback addressed (e4a2ee8)Fixed in code (3):
Dismissed (1):
New regression tests cover the SKILL.md and @coderabbitai full review |
🧩 Analysis chain🏁 Script executed: #!/bin/bash
set -euo pipefail
fd memory_protocol.rs
echo "---"
sed -n '1,120p' src/openhuman/agent/harness/memory_protocol.rs 2>/dev/null | head -200Length of output: 6395 🏁 Script executed: #!/bin/bash
set -euo pipefail
rg -n "MemoryProtocolMiddleware" src/openhuman/tinyagents/mod.rs
echo "---"
rg -n "fn run_turn_via_tinyagents\b" -A 20 src/openhuman/tinyagents/mod.rsLength of output: 1647 All three fixes check out against the code: Kicking off a full review now to go through the rest of the diff (including the new ✅ Action performedFull review finished. |
|
Superseded by #4444 — a rework onto post-#4399 |
…dex protocol (tinyhumansai#4116) Rework of tinyhumansai#4388 onto post-tinyhumansai#4399 code. tinyhumansai#4399 deleted the original home (agent/harness/memory_protocol.rs did not exist yet; the enforcement was to live in the turn engine); the turn loop now runs on the tinyagents crate, so this re-expresses the protocol as a crate middleware. - Ports the pure `MemoryProtocolTracker` state machine as src/openhuman/agent/harness/memory_protocol.rs: classifies each memory tool call (IndexRead / Write / IndexUpdate / Other) and, on a write, flags a missing dedupe read or a stale index (a prior write never followed by update_memory_md). - Adds `MemoryProtocolMiddleware` in the tinyagents adapter seam: before_tool captures the classified op by call id; after_tool advances the tracker on success and appends a `[memory-protocol]` corrective note to the tool result so the model is nudged back onto the cycle; after_agent warns when the run ends with a stale index. Registered outermost in assemble_turn_harness so its after_tool runs after the byte-cap truncation, preserving the note. - Fixes a stale classification from the original PR: `memory_search` is not a real tool — replaced with the actual dedupe-read search tools (memory_vector_search / memory_chunk_context / memory_hybrid_search). 9 unit tests (classification + tracker ladder) + 4 middleware integration tests (corrective note, SKILL.md doesn't close the cycle, consolidated memory_tree ingest = write, failed write doesn't advance). All green.
Closes #4116
Problem
The memory protocol (read-index → dedupe → write → update-index) was described to the agent but never enforced, so it was followed inconsistently: durable writes (
memory_store) landed without a dedupe read (creating near-duplicates) andupdate_memory_mdwas skipped, soMEMORY.mddrifted from the underlying store.Enforcement approach
Expressed as a tinyagents graph middleware — the same "structured correction surfaced back to the model" pattern the unknown-tool recovery (#4118 / #4360) uses, and the live tool-dispatch seam.
agent/harness/memory_protocol.rs(new) — a pure, side-effect-freeMemoryProtocolTrackerstate machine.classify_memory_opsorts the memory tool surface intoIndexRead(memory_recall/memory_search/memory_tree_*reads),Write(memory_store/memory_forget/memory_tree_ingest_document), andIndexUpdate(update_memory_md). The tracker enforces the repeatable cycle: a write reportsmissing_index_readwhen no dedupe read preceded it andindex_driftwhen a prior write was never synced;update_memory_mdcloses the cycle and arms the next;pending_index_update()exposes an unsynced write at run end.tinyagents/middleware.rs—MemoryProtocolMiddleware. Itsafter_tooladvances the tracker on each successful memory op (failed ops don't demand an index update) and appends a[memory-protocol]corrective note to the tool result the model reads. Itsafter_agentwarns when the run ends with a write that was never followed byupdate_memory_md.tinyagents/mod.rs— registers the middleware outermost so itsafter_toolruns last (after the byte-cap truncation, since after-hooks run in reverse order) and the guidance survives into the transcript. No-op on turns that never touch memory.This satisfies the acceptance criteria: enforces the read-index → dedupe → write → update-index sequence as a post-step check, and detects a write not preceded by an index read / not followed by an index update, then corrects (guidance to the model) or warns (run end).
Tests
memory_protocol.rs: 7 unit tests — classification, clean full cycle, missing-read flag, drift-at-next-write, pending-index-update at run end, cycle re-arming after an update, reads/other ops need no guidance.middleware.rs: 4 integration tests — note appended on an unread write, clean read→write→update only reminds on the write, a failed write does not advance the protocol, a second unsynced write flags drift.Caveat
Runtime drift detection ("write not followed by
update_memory_md") fires at the next memory write within the run and is logged at run end viaafter_agent; the per-write forward reminder is the proactive correction. Localcargo checkis clean; relying on CI for the test run.Summary by CodeRabbit