Issue/4249-finish-tinyagents-migration #4399

Draft
senamakel wants to merge 269 commits into tinyhumansai:main from senamakel:issue/4249-finish-tinyagents-migration

Conversation

@senamakel

Member

Summary

  • What changed and why.
  • Keep this to 3-6 bullets focused on user-visible or architecture-impacting changes.

Problem

  • What issue or risk this PR addresses.
  • Include context needed for reviewers to evaluate correctness quickly.

Solution

  • How the implementation solves the problem.
  • Note important design decisions and tradeoffs.

Submission Checklist

If a section does not apply to this change, mark the item as N/A with a one-line reason. Do not delete items.

  • Tests added or updated (happy path + at least one failure / edge case) per Testing Strategy
  • Diff coverage ≥ 80% — changed lines (Vitest + cargo-llvm-cov merged via diff-cover) meet the gate enforced by .github/workflows/pr-ci.yml. Run pnpm test:coverage and pnpm test:rust locally; PRs below 80% on changed lines will not merge.
  • Coverage matrix updated — added/removed/renamed feature rows in docs/TEST-COVERAGE-MATRIX.md reflect this change (or N/A: behaviour-only change)
  • All affected feature IDs from the matrix are listed in the PR description under ## Related
  • No new external network dependencies introduced (mock backend used per Testing Strategy)
  • Manual smoke checklist updated if this touches release-cut surfaces (docs/RELEASE-MANUAL-SMOKE.md)
  • Linked issue closed via Closes #NNN in the ## Related section

Impact

  • Runtime/platform impact (desktop/mobile/web/CLI), if any.
  • Performance, security, migration, or compatibility implications.

Related

  • Closes:
  • Follow-up PR(s)/TODOs:

AI Authored PR Metadata (required for Codex/Linear PRs)

Keep this section for AI-authored PRs. For human-only PRs, mark each field N/A.

Linear Issue

  • Key:
  • URL:

Commit & Branch

  • Branch:
  • Commit SHA:

Validation Run

  • pnpm --filter openhuman-app format:check
  • pnpm typecheck
  • Focused tests:
  • Rust fmt/check (if changed):
  • Tauri fmt/check (if changed):

Validation Blocked

  • command:
  • error:
  • impact:

Behavior Changes

  • Intended behavior change:
  • User-visible effect:

Parity Contract

  • Legacy behavior preserved:
  • Guard/fallback/dispatch parity checks:

Duplicate / Superseded PR Handling

  • Duplicate PR(s):
  • Canonical PR:
  • Resolution (closed/superseded/updated):

@coderabbitai

coderabbitai Bot commented Jul 2, 2026

Contributor

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 07bc6a6b-da11-41bf-a98c-3cb2d3e542d1

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Comment @coderabbitai help to get the list of available commands.

senamakel added 29 commits July 1, 2026 19:30
senamakel added 30 commits July 2, 2026 09:11
Adds DomainEvent::{WorkspacePrepared, WorkspaceViolation, WorkspaceCleanup}
for the 08.5 worktree-isolation workstream, plus cargo fmt normalization
across session_import and payload_summarizer.

Claude-Session: https://claude.ai/code/session_01Frnx4CvLQBCGoDyT6FT6Sq
Wraps OpenHuman's EmbeddingProvider as tinyagents
harness::embeddings::EmbeddingModel, bridging the &[String] vs &[&str]
signature and anyhow->TinyAgentsError::Embedding error mapping.
Preserves dimensions()/signature() fidelity. Not yet wired into the
recall path (09.2); re-exported so it is part of the crate surface.

Claude-Session: https://claude.ai/code/session_01Frnx4CvLQBCGoDyT6FT6Sq
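
A minimal sketch of the adapter shape this commit describes, for readers who want to see the slice bridging and error mapping concretely. Both traits, the `HarnessError` type, `ProviderAdapter`, and `LengthProvider` are hypothetical stand-ins; which side takes `&[String]` and which takes `&[&str]` is an assumption, and a plain `String` stands in for the `anyhow` error.

```rust
/// Hypothetical stand-in for the application-side provider trait.
pub trait EmbeddingProvider {
    fn embed(&self, texts: &[&str]) -> Result<Vec<Vec<f32>>, String>;
    fn dimensions(&self) -> usize;
}

/// Hypothetical stand-in for the harness error; only the variant name
/// `Embedding` comes from the commit message.
#[derive(Debug, PartialEq)]
pub enum HarnessError {
    Embedding(String),
}

/// Hypothetical stand-in for the harness-side model trait.
pub trait EmbeddingModel {
    fn embed(&self, texts: &[String]) -> Result<Vec<Vec<f32>>, HarnessError>;
    fn dimensions(&self) -> usize;
}

/// Wraps a provider so it can be used where the harness trait is expected.
pub struct ProviderAdapter<P>(pub P);

impl<P: EmbeddingProvider> EmbeddingModel for ProviderAdapter<P> {
    fn embed(&self, texts: &[String]) -> Result<Vec<Vec<f32>>, HarnessError> {
        // Bridge the owned-string slice to a borrowed-str slice without copying text.
        let borrowed: Vec<&str> = texts.iter().map(String::as_str).collect();
        // Map the provider error into the harness error variant.
        self.0.embed(&borrowed).map_err(HarnessError::Embedding)
    }

    fn dimensions(&self) -> usize {
        // Pass through unchanged so the reported dimensions stay faithful.
        self.0.dimensions()
    }
}

/// Toy provider used only to exercise the adapter.
pub struct LengthProvider;

impl EmbeddingProvider for LengthProvider {
    fn embed(&self, texts: &[&str]) -> Result<Vec<Vec<f32>>, String> {
        if texts.is_empty() {
            return Err("no input".to_string());
        }
        Ok(texts.iter().map(|t| vec![t.len() as f32, 0.0]).collect())
    }

    fn dimensions(&self) -> usize {
        2
    }
}
```
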
Wire GitWorktreeIsolation prepare/cleanup and a new enforce_workspace_path
helper to publish DomainEvent::{WorkspacePrepared,WorkspaceCleanup,
WorkspaceViolation} onto the global bus for the security audit trail.
Descriptor stays a carrier; SecurityPolicy/landlock remains the enforcement
authority. worktree_context.rs deletion + acting-tool migration deferred.

Claude-Session: https://claude.ai/code/session_01Frnx4CvLQBCGoDyT6FT6Sq
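
The path check plus audit event described above could look roughly like the sketch below. It is only an illustration: the real `enforce_workspace_path` signature, the event fields, and the bus are not shown in the commit, so a `Vec<DomainEvent>` stands in for the global bus and the check is purely lexical (no filesystem access, no symlink resolution). Consistent with the commit, the helper only reports; actual enforcement is assumed to stay with SecurityPolicy/landlock.

```rust
use std::path::{Component, Path, PathBuf};

/// Hypothetical event shape; only the variant name comes from the commit.
#[derive(Debug, PartialEq)]
pub enum DomainEvent {
    WorkspaceViolation { root: PathBuf, attempted: PathBuf },
}

/// Lexically resolve `.` and `..` without touching the filesystem.
fn normalize(path: &Path) -> PathBuf {
    let mut out = PathBuf::new();
    for component in path.components() {
        match component {
            Component::CurDir => {}
            Component::ParentDir => {
                out.pop();
            }
            other => out.push(other.as_os_str()),
        }
    }
    out
}

/// Returns true when `candidate` stays inside `root`; otherwise records a
/// violation event for the audit trail and returns false.
pub fn enforce_workspace_path(
    root: &Path,
    candidate: &Path,
    bus: &mut Vec<DomainEvent>,
) -> bool {
    let root = normalize(root);
    let resolved = if candidate.is_absolute() {
        normalize(candidate)
    } else {
        normalize(&root.join(candidate))
    };
    // `starts_with` compares whole components, so `/ws/run10` is not inside `/ws/run1`.
    if resolved.starts_with(&root) {
        return true;
    }
    bus.push(DomainEvent::WorkspaceViolation { root, attempted: resolved });
    false
}
```
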
…ch (10.2)

run_turn_via_tinyagents_shared now inspects registry.diagnostics() after
harness assembly and aborts the turn before any model dispatch when an
error-severity diagnostic (duplicate name / dangling alias) is present,
via new AgentError::RegistryValidationFailed. Warnings are logged only.
Existing hand-rolled dedup left intact (deletion deferred until parity).

Claude-Session: https://claude.ai/code/session_01Frnx4CvLQBCGoDyT6FT6Sq
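
The pre-dispatch gate in this commit can be illustrated with a small sketch. The `Diagnostic` and `Severity` shapes and the payload of `RegistryValidationFailed` are assumptions; only the rule itself (errors abort the turn, warnings are logged) comes from the commit message.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Severity {
    Warning,
    Error,
}

/// Hypothetical diagnostic record.
#[derive(Debug, Clone, PartialEq)]
pub struct Diagnostic {
    pub severity: Severity,
    pub message: String,
}

/// Variant name from the commit; the payload is an assumption.
#[derive(Debug, PartialEq)]
pub enum AgentError {
    RegistryValidationFailed(Vec<String>),
}

/// Returns the warnings to log, or an error that should abort the turn
/// before any model dispatch happens.
pub fn check_registry(diagnostics: &[Diagnostic]) -> Result<Vec<String>, AgentError> {
    let errors: Vec<String> = diagnostics
        .iter()
        .filter(|d| d.severity == Severity::Error)
        .map(|d| d.message.clone())
        .collect();
    if !errors.is_empty() {
        return Err(AgentError::RegistryValidationFailed(errors));
    }
    // No errors were found, so everything left is a warning.
    Ok(diagnostics.iter().map(|d| d.message.clone()).collect())
}
```
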
New read-only RPC projecting the CapabilityRegistry inventory
(models/tools/graphs/agents with ComponentMetadata) plus a Graphviz DOT
export from durable descriptor sources reachable outside a turn. Additive
sibling to agent.graph_topologies; the full per-agent tool surface and
per-run-only kinds are documented deferrals in the response.

Claude-Session: https://claude.ai/code/session_01Frnx4CvLQBCGoDyT6FT6Sq
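
For a sense of what a Graphviz DOT export from descriptors involves, here is a rough sketch. The `Descriptor` struct, the graph name, and the label format are hypothetical; the RPC's actual output format is not shown in the commit.

```rust
/// Hypothetical descriptor: a component kind (model/tool/graph/agent) and a name.
pub struct Descriptor {
    pub kind: &'static str,
    pub name: String,
}

/// Escape backslashes and double quotes so names are safe inside DOT strings.
fn escape(raw: &str) -> String {
    raw.replace('\\', "\\\\").replace('"', "\\\"")
}

/// Render nodes and directed edges as a DOT digraph.
pub fn to_dot(nodes: &[Descriptor], edges: &[(String, String)]) -> String {
    let mut out = String::from("digraph capabilities {\n");
    for node in nodes {
        let name = escape(&node.name);
        out.push_str(&format!("  \"{}\" [label=\"{}: {}\"];\n", name, node.kind, name));
    }
    for (from, to) in edges {
        out.push_str(&format!("  \"{}\" -> \"{}\";\n", escape(from), escape(to)));
    }
    out.push_str("}\n");
    out
}
```
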
…(06.1)

UsageInfo now carries cache_creation_tokens/reasoning_tokens through the
bridge into the persisted TokenUsage record instead of hardcoding 0.
claude_code provider path populates real cache-creation tokens. Providers
that do not report these keep 0. No public cost RPC shape change.

Claude-Session: https://claude.ai/code/session_01Frnx4CvLQBCGoDyT6FT6Sq
…onMiddleware (03.1)

Retain the ContextCompressionMiddleware handle on the assembled turn and
drain records() after the run, logging per-compaction provenance (source
ids, before/after token estimates, reason) under a grep-friendly [context]
prefix. Additive; ToolOutputMiddleware Compressed contract untouched.

Claude-Session: https://claude.ai/code/session_01Frnx4CvLQBCGoDyT6FT6Sq
…ity gating (02.1)

New tinyagents/routes.rs projects the 7 router tiers (chat/reasoning/
agentic/coding/burst/summarization/vision) into the crate ModelRegistry as
per-route ProviderModel entries with real ModelProfile (vision/reasoning/
context-window). set_default_model still points at the turn's effective
model so dispatch is unchanged; this enables 02.2 fallback ordering. Adds
RequiredCapabilitiesMiddleware stamping the turn CapabilitySet onto each
ModelRequest so unfit models are rejected pre-dispatch (vision wired;
tool/reasoning/BYOK signals documented as follow-ups). Route policy stays
in router.rs/factory.rs.

Claude-Session: https://claude.ai/code/session_01Frnx4CvLQBCGoDyT6FT6Sq
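
The pre-dispatch capability check can be sketched as follows. The struct layouts and the `gate` function are hypothetical; the sketch models only the vision signal, since the commit lists the other signals as follow-ups.

```rust
/// Hypothetical per-turn requirement set; only vision is modelled here.
#[derive(Debug, Clone, Copy, Default, PartialEq)]
pub struct CapabilitySet {
    pub vision: bool,
}

/// Hypothetical profile carrying the fields the commit mentions.
pub struct ModelProfile {
    pub name: &'static str,
    pub vision: bool,
    pub reasoning: bool,
    pub context_window: u32,
}

/// Rejects a model before dispatch when it cannot serve the turn's requirements.
pub fn gate(required: CapabilitySet, model: &ModelProfile) -> Result<(), String> {
    if required.vision && !model.vision {
        return Err(format!("model {} lacks vision support", model.name));
    }
    Ok(())
}
```
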
…uman overlays (02.4)

unified_model_catalog() seeds from tinyagents ModelCatalog::seed(), overlays
KNOWN_MODEL_PRICING rates/windows (source of truth, identical numbers), local
runtime models, and pattern-window backfill. Model-picker RPC now sources
local models from config. estimate_cost_usd/context_window stay on
KNOWN_MODEL_PRICING to guarantee numeric identity; duplicate-table deletion
deferred until a snapshot lookup is proven identical. No cost numbers changed.

Claude-Session: https://claude.ai/code/session_01Frnx4CvLQBCGoDyT6FT6Sq
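
The seed-then-overlay idea can be shown with a small sketch. `ModelEntry`, `overlay_catalog`, the model ids, and every number used with it are invented for illustration; the only point carried over from the commit is that the overlay table wins on conflict because it is the source of truth.

```rust
use std::collections::BTreeMap;

/// Hypothetical catalog entry; field names and units are assumptions.
#[derive(Debug, Clone, PartialEq)]
pub struct ModelEntry {
    pub input_per_mtok: f64,
    pub output_per_mtok: f64,
    pub context_window: u32,
}

/// Start from the seed entries, then let the overlay replace any entry
/// with the same model id.
pub fn overlay_catalog(
    seed: &[(&str, ModelEntry)],
    overlay: &[(&str, ModelEntry)],
) -> BTreeMap<String, ModelEntry> {
    let mut catalog = BTreeMap::new();
    for (id, entry) in seed {
        catalog.insert(id.to_string(), entry.clone());
    }
    for (id, entry) in overlay {
        // A later insert replaces the seeded value for the same id.
        catalog.insert(id.to_string(), entry.clone());
    }
    catalog
}
```
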
Adds PromptCacheSegmentMiddleware stamping content-fingerprinted system/tools
PromptSegments, then PromptCacheGuardMiddleware with protect_prompt_prefix=true;
CacheLayoutEvents surface as structured [cache] warnings alongside the retained
CacheAlignMiddleware. Threads deterministic_cacheable through the shared runner
to attach InMemoryResponseCache only for internal deterministic runs — all three
production callers (chat/channel/subagent) set false, so interactive turns are
never served cached responses. CacheHit/Miss counted in the bridge; cost-footer
DTO wiring deferred to 06.

Claude-Session: https://claude.ai/code/session_01Frnx4CvLQBCGoDyT6FT6Sq
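
Content fingerprinting of prompt segments can be sketched as below, to show how a guard might notice that a cached prefix changed. All names are hypothetical, and `DefaultHasher` is used only because it is in the standard library; its output is not guaranteed stable across Rust releases, so a persisted fingerprint would presumably need a fixed hash function.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Fingerprint segment content; comparable within one build only.
pub fn fingerprint(content: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    content.hash(&mut hasher);
    hasher.finish()
}

/// Hypothetical segment record: a label plus the content fingerprint.
pub struct PromptSegment {
    pub name: &'static str,
    pub fingerprint: u64,
}

/// Names of segments in `next` that are new or whose content changed
/// relative to `prev`; a guard could warn on each of these.
pub fn changed_segments(prev: &[PromptSegment], next: &[PromptSegment]) -> Vec<&'static str> {
    next.iter()
        .filter(|seg| {
            prev.iter()
                .find(|p| p.name == seg.name)
                .map_or(true, |p| p.fingerprint != seg.fingerprint)
        })
        .map(|seg| seg.name)
        .collect()
}
```
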
…ent (09.2/09.3)

New tinyagents/retriever.rs wraps Memory::recall as the swappable retrieval
seam: projects entries to crate ScoredDoc, carries path_scope and applies the
id-keyed dedupe rule, and emits AgentEvent::MemoryLoaded. memory_context.rs and
memory_loader.rs load recall through the facade; CROSS_CHAT_HEADER, citation
format, and collect_recall_citations output stay byte-identical (engine
unchanged, adapter-first). Concrete crate Retriever exposed as the engine-swap
seam. Embedding usage/cost (09.4) deferred to coordinate with 06.

Claude-Session: https://claude.ai/code/session_01Frnx4CvLQBCGoDyT6FT6Sq
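
An id-keyed dedupe over scored documents might look like the sketch below. The `ScoredDoc` fields are assumptions, and so is the tie-break: the commit does not say which duplicate survives, so this version keeps the first occurrence and preserves input order.

```rust
use std::collections::HashSet;

/// Hypothetical projection of a recalled memory entry.
#[derive(Debug, Clone, PartialEq)]
pub struct ScoredDoc {
    pub id: String,
    pub text: String,
    pub score: f32,
}

/// Keep the first document seen for each id (assumed rule), in input order.
pub fn dedupe_by_id(docs: Vec<ScoredDoc>) -> Vec<ScoredDoc> {
    let mut seen = HashSet::new();
    docs.into_iter()
        // `insert` returns false when the id was already present.
        .filter(|doc| seen.insert(doc.id.clone()))
        .collect()
}
```
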
… (05.2)

New agent_orchestration/subagent_events.rs centralizes construction + publish
of DomainEvent::Subagent{Spawned,Completed,Failed,AwaitingUser} across 24 sites
in 6 files. Event variants, field values, and ordering are byte-identical, so
RunLedgerFinalizeSubscriber and UI consumers see no change; this is the single
hook point for future ordering/rate-limiting/journal-mirroring (05.1).

Claude-Session: https://claude.ai/code/session_01Frnx4CvLQBCGoDyT6FT6Sq
…rojections (10.1)

assemble_turn_harness now registers Agent descriptors deduped from the runtime
AgentDefinitionRegistry and agent_registry builtins so they appear in the
snapshot/diagnostics streams, and exercises to_model_registry()/to_tool_registry()
as validation projections with a [registry] count summary. ComponentMetadata
description/tags persistence and register_agent (needs executable blueprint) are
documented follow-ups; live register_model/register_tool glue left intact.

Claude-Session: https://claude.ai/code/session_01Frnx4CvLQBCGoDyT6FT6Sq
…02.2)

RunPolicy.fallback now carries an ordered same-family alternate chain from the
02.1 routes; FallbackObserverMiddleware emits AgentEvent::FallbackSelected with
no extra dispatch; RetryScheduled surfaces (dormant while max_attempts pinned=1
to avoid double-retry with still-wrapped ReliableProvider). ProviderModel maps
permanent/billing rejections to non-retryable via OpenHuman classifiers. Adapter-
first: reliable.rs annotated for deletion in the 11-testing conformance pass.

Claude-Session: https://claude.ai/code/session_01Frnx4CvLQBCGoDyT6FT6Sq
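
A rough sketch of ordered fallback selection with a retryability classifier follows. The failure categories, the function names, and the event fields are hypothetical, and the decision to stop the chain on a non-retryable failure is an assumption about how the two pieces interact; the commit only states that permanent and billing rejections map to non-retryable.

```rust
/// Hypothetical failure categories for illustration.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ProviderFailure {
    RateLimited,
    Timeout,
    Billing,
    InvalidRequest,
}

/// Transient failures are retryable; billing and permanent rejections are not.
pub fn is_retryable(failure: ProviderFailure) -> bool {
    matches!(failure, ProviderFailure::RateLimited | ProviderFailure::Timeout)
}

/// Variant name from the commit; the fields are assumptions.
#[derive(Debug, PartialEq)]
pub enum AgentEvent {
    FallbackSelected { from: String, to: String },
}

/// Pick the next model in the ordered chain after `failed`, recording an
/// observer event. No request is dispatched here.
pub fn next_model(
    chain: &[&str],
    failed: &str,
    failure: ProviderFailure,
    events: &mut Vec<AgentEvent>,
) -> Option<String> {
    if !is_retryable(failure) {
        return None;
    }
    let position = chain.iter().position(|m| *m == failed)?;
    let next = chain.get(position + 1)?;
    events.push(AgentEvent::FallbackSelected {
        from: failed.to_string(),
        to: next.to_string(),
    });
    Some(next.to_string())
}
```
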
Tool-argument fragments now ride ModelStreamItem::ToolCallDelta into
AgentProgress::ToolCallArgsDelta instead of ThinkingForwarder. Start event +
tool_name stay on the forwarder (crate ToolDelta has no name field); a shared
per-turn ToolNameMap labels streamed fragments so UI timeline parity holds.
Removed emit_tool_args from ThinkingForwarder; its start marker + non-streaming
reasoning fallback stay live. Child-run streaming preserved via scope-aware
bridge. Ledger row updated.

Claude-Session: https://claude.ai/code/session_01Frnx4CvLQBCGoDyT6FT6Sq
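
Because the streamed delta carries no tool name, something has to remember the name from the start event. A minimal sketch of such a per-turn map is below; the struct layouts and method names are hypothetical, and a real shared map would presumably sit behind a lock or similar so multiple tasks can use it.

```rust
use std::collections::HashMap;

/// Hypothetical per-turn map from tool-call id to tool name.
#[derive(Default)]
pub struct ToolNameMap {
    names: HashMap<String, String>,
}

/// Hypothetical progress payload for one streamed argument fragment.
#[derive(Debug, PartialEq)]
pub struct ToolCallArgsDelta {
    pub call_id: String,
    pub tool_name: Option<String>,
    pub fragment: String,
}

impl ToolNameMap {
    /// Called when the start event (which does carry the name) arrives.
    pub fn record_start(&mut self, call_id: &str, tool_name: &str) {
        self.names.insert(call_id.to_string(), tool_name.to_string());
    }

    /// Label a nameless fragment with the name recorded at start, if any.
    pub fn label(&self, call_id: &str, fragment: &str) -> ToolCallArgsDelta {
        ToolCallArgsDelta {
            call_id: call_id.to_string(),
            tool_name: self.names.get(call_id).cloned(),
            fragment: fragment.to_string(),
        }
    }
}
```
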
…uthority (07.3)

Maps Redirect/Pause/Resume/Cancel to crate SteeringCommand via a new
SteeringDirective, delivered only through the registered SteeringHandle with
fail-closed policy checks. SteeringPolicy tightens by run class (background
subagent runs accept control-flow steering without transcript injection;
interactive keeps InjectMessage+Pause). Steered event projected under
[steering]. Recursion: documents why spawn_depth_context stays a thin
projector (cross-process MCP-hop depth + synchronous pre-dispatch surface);
cap=3 and SpawnDepthExceeded wording unchanged. run_queue mechanics retained.

Claude-Session: https://claude.ai/code/session_01Frnx4CvLQBCGoDyT6FT6Sq
…4.1)

Behind OPENHUMAN_SESSION_DUAL_WRITE (default OFF), after a successful legacy
JSONL transcript write, also append the turn to the slash-free store stream
session.{stem}.messages and upsert the NS_SESSIONS descriptor, reusing the
session_import convert normalization so live and imported records are
shape-identical. Store writes are fire-and-forget and non-fatal; OFF-default
behavior is byte-identical to today. Reads stay legacy (04.2). Factored shared
open_session_stores() helper. StoreChatHistory adoption evaluated + deferred.

Claude-Session: https://claude.ai/code/session_01Frnx4CvLQBCGoDyT6FT6Sq
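
The flag-gated, non-fatal second write can be sketched like this. Which values turn `OPENHUMAN_SESSION_DUAL_WRITE` on is not stated in the commit, so the accepted set below (`1`, `true`, `on`) is an assumption, as are the function names; a `Vec<String>` stands in for the legacy JSONL transcript and a closure for the store append.

```rust
/// Interpret the raw flag value; unset or unrecognised means OFF.
/// The accepted truthy spellings are an assumption.
pub fn dual_write_enabled(raw: Option<&str>) -> bool {
    matches!(
        raw.map(|v| v.trim().to_ascii_lowercase()).as_deref(),
        Some("1") | Some("true") | Some("on")
    )
}

/// Write to the legacy transcript first; only then, and only when the flag
/// is on, attempt the store append. A store failure is logged and ignored.
pub fn persist_turn(
    turn: &str,
    dual_write: bool,
    legacy: &mut Vec<String>,
    mut store_append: impl FnMut(&str) -> Result<(), String>,
) {
    legacy.push(turn.to_string());
    if dual_write {
        if let Err(err) = store_append(turn) {
            eprintln!("[session] store append failed (non-fatal): {err}");
        }
    }
}
```

A caller would typically feed the flag from the environment, for example `dual_write_enabled(std::env::var("OPENHUMAN_SESSION_DUAL_WRITE").ok().as_deref())`.
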
Attaches a second EventSink listener (FanOutSink -> RedactingSink -> JournalSink
-> StoreEventJournal over the 04.1 JsonlAppendStore) alongside the unchanged
OpenhumanEventBridge, plus a Store-backed FileStatusStore (crate ships only
in-memory) writing running/completed/failed snapshots keyed by run_id with
list_by_root/thread/active. RedactingSink masks credential-valued env secrets
before persistence. Adds read_run_events/read_run_status replay-reader seam for
a future replay RPC. Writes are best-effort/non-fatal.

BEHAVIOR NOTE: the per-turn EventSink is now created unconditionally (was gated
on on_progress/pause) so a run is reconstructable without subscribing at start;
all journal/status I/O is non-fatal. Subscribers untouched; no deletions.

Claude-Session: https://claude.ai/code/session_01Frnx4CvLQBCGoDyT6FT6Sq
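
The sink chain can be pictured with the sketch below: one fan-out delivers each event both to the existing live path and to a redacting journal path. The trait and struct shapes are hypothetical (the real sinks take structured events, not strings), and the mask text is an assumption.

```rust
/// Hypothetical minimal sink interface; events are plain strings here.
pub trait EventSink {
    fn emit(&mut self, event: &str);
}

/// Stand-in for a durable journal: just collects lines in memory.
#[derive(Default)]
pub struct JournalSink {
    pub lines: Vec<String>,
}

impl EventSink for JournalSink {
    fn emit(&mut self, event: &str) {
        self.lines.push(event.to_string());
    }
}

/// Masks known secret values before forwarding to the inner sink.
pub struct RedactingSink<S> {
    pub secrets: Vec<String>,
    pub inner: S,
}

impl<S: EventSink> EventSink for RedactingSink<S> {
    fn emit(&mut self, event: &str) {
        let mut masked = event.to_string();
        // Skip empty secrets: replacing "" would insert the mask everywhere.
        for secret in self.secrets.iter().filter(|s| !s.is_empty()) {
            masked = masked.replace(secret.as_str(), "[REDACTED]");
        }
        self.inner.emit(&masked);
    }
}

/// Delivers every event to both sinks.
pub struct FanOutSink<A, B> {
    pub first: A,
    pub second: B,
}

impl<A: EventSink, B: EventSink> EventSink for FanOutSink<A, B> {
    fn emit(&mut self, event: &str) {
        self.first.emit(event);
        self.second.emit(event);
    }
}
```
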
…3/09.4)

Adds run_id/root_run_id to TokenUsage (serde default + skip_serializing_if,
stamped None pending run-tree threading; rollup swap deferred).
ProviderEmbeddingModel.embed records best-effort embedding usage (provider/model/dims/vectors)
priced via the unified catalog, zero-cost when no embedding rate exists. Non-fatal
[cost][embed] recording; public cost DTOs backward-compatible.

Claude-Session: https://claude.ai/code/session_01Frnx4CvLQBCGoDyT6FT6Sq
…perative cancel (07.2)

Adds reconcile_orphaned_tasks_on_boot: scans the durable DetachedTaskStore for
tasks left live by a prior process, settles them terminal (CancelRequested->
Cancelled, else Failed with an orphaned-by-restart reason), and emits the 05.2
terminal lifecycle event so the run ledger finalizes. Hooked in
bootstrap_core_runtime next to run-ledger recovery. Flips the CancellationToken
before abort in cancel_for_thread/cancel_all so cooperative cancel is uniform;
terminal store write preserved. Best-effort/non-fatal; no deletions, no shrink.

Claude-Session: https://claude.ai/code/session_01Frnx4CvLQBCGoDyT6FT6Sq
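
The boot-time settlement rule reads naturally as a small state transition. In this sketch the `TaskState` enum, the slice of `(id, state)` pairs standing in for the durable store, and the exact reason string are assumptions; the rule itself (CancelRequested becomes Cancelled, any other live task becomes Failed) follows the commit message.

```rust
/// Hypothetical task states; the store's real schema is not shown.
#[derive(Debug, Clone, PartialEq)]
pub enum TaskState {
    Running,
    CancelRequested,
    Completed,
    Cancelled,
    Failed { reason: String },
}

impl TaskState {
    fn is_live(&self) -> bool {
        matches!(self, TaskState::Running | TaskState::CancelRequested)
    }
}

/// Settle every task left live by a previous process and return the ids
/// that were settled, so a terminal lifecycle event can be emitted for each.
pub fn reconcile_orphans(tasks: &mut [(String, TaskState)]) -> Vec<String> {
    let mut settled = Vec::new();
    for (id, state) in tasks.iter_mut() {
        if !state.is_live() {
            continue;
        }
        let terminal = match &*state {
            TaskState::CancelRequested => TaskState::Cancelled,
            _ => TaskState::Failed { reason: "orphaned by restart".to_string() },
        };
        *state = terminal;
        settled.push(id.clone());
    }
    settled
}
```
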
… review gate (08.3)

Adds DelegationConfig::require_review_approval: the durable delegation graph
emits NodeResult::Interrupt at the review approval point (persisted Sync via the
existing SqlRunLedgerCheckpointer so the pause survives restart) and resumes via
Command::resume, mapping the stable ApprovalDecision RPC wire strings
(approve_once/approve_always_for_tool/deny) with deny overriding.
run_delegation_durable/resume_delegation added; deny_decision() preserves TTL-deny. Interactive
chat approval gate untouched (durable-vs-chat boundary documented). Workflow
human-review + live approval-RPC delivery noted as follow-ups.

Claude-Session: https://claude.ai/code/session_01Frnx4CvLQBCGoDyT6FT6Sq
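
Mapping the wire strings and applying "deny overrides" could be sketched as below. The three wire strings come from the commit; the enum layout, the function names, and the choice to treat an empty decision list as not approved are assumptions made to keep the sketch fail-closed.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ApprovalDecision {
    ApproveOnce,
    ApproveAlwaysForTool,
    Deny,
}

/// Parse the stable wire string; unknown strings are rejected, not guessed.
pub fn parse_decision(wire: &str) -> Option<ApprovalDecision> {
    match wire {
        "approve_once" => Some(ApprovalDecision::ApproveOnce),
        "approve_always_for_tool" => Some(ApprovalDecision::ApproveAlwaysForTool),
        "deny" => Some(ApprovalDecision::Deny),
        _ => None,
    }
}

/// A review passes only when at least one decision exists and none is Deny.
pub fn approved(decisions: &[ApprovalDecision]) -> bool {
    !decisions.is_empty() && !decisions.contains(&ApprovalDecision::Deny)
}
```
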
…inyagents-migration

# Conflicts:
#	src/openhuman/agent_orchestration/parent_context/mod.rs
