fix(credentials): re-resolve active-user workspace in channel runtime + scheduler after store_session (#4398)#4411
Conversation
…ystems after store_session (tinyhumansai#4398) Long-lived subsystems started from a pre-login Config snapshot (workspace = users/local/) and never re-resolved after credentials::ops::store_session wrote active_user.toml, so they kept reading/writing under users/local/ until a process restart. Route each subsystem's workspace resolution through a re-bindable holder (mirroring memory::global::init) and re-point them all from store_session_inner, right after the existing memory/conversation rebind. - Cron scheduler: process-global ACTIVE_CONFIG holder; the poll loop re-resolves config + SecurityPolicy each tick, so due_jobs reads the cron store under users/<user_id>/. New scheduler::rebind(config). - Channel runtime: ChannelRuntimeContext.workspace_dir (baked Arc<PathBuf>) is now a shared Arc<RwLock<PathBuf>> handle read via workspace_dir(); channels::rebind_workspace(path) swaps it. The Telegram busy-state subscriber and PROFILE.md writer share the same handle (fixes a stale-workspace event drop the base change would otherwise introduce). - Security sandbox: live_policy::set_workspace_dir rebuilds the policy via from_config so file-writing tools stay confined to the activated user's workspace. - Channel memory store: ctx.memory is now a swappable handle; channels::rebind_memory(config) rebuilds it (shared build_channel_memory helper with the tinyhumansai#3712 keyword-only fallback) so conversation auto-save and memory-context retrieval land in the right workspace. Agent-definition registry re-resolution is deferred to a follow-up: it needs an invasive OnceLock -> RwLock refactor of AgentDefinitionRegistry::GLOBAL with ~8 &'static caller ripples, out of scope for this change. Tests: scheduler re-resolves to the activated workspace after rebind; workspace handle re-resolves + no-op when unregistered; live_policy::set_workspace_dir swaps workspace while preserving autonomy/action_dir; memory rebind no-op guard. Closes tinyhumansai#4398
|
Warning Review limit reachedYou’ve reached a temporary PR review limit under our Fair Usage Limits Policy. Next review available in: 5 minutes Enable usage-based reviews in Billing to review now. Otherwise, wait until the next included review is available. How can I continue?After more reviews become available, a review can be triggered using the To avoid repeated limits, reduce automatic review volume by pausing incremental auto-reviews earlier, using label-based review opt-in, excluding WIP or generated PR titles, or requesting reviews manually when the PR is ready. If your team needs uninterrupted high-volume reviews, an organization admin can enable usage-based reviews. How do review limits work?CodeRabbit enforces per-developer PR review limits for each organization. Most developers receive the normal plan review availability. For paid Pro and Pro+ PR reviews, CodeRabbit uses adaptive limits for sustained high-volume activity. When a developer's recent PR review activity reaches the 95th percentile or higher among CodeRabbit users, additional reviews become available more gradually as earlier reviews age out of the rolling window. Please refer docs for additional details. Review details⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (25)
Comment |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 6dd5194165
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| effective_config.workspace_dir.clone(), | ||
| &effective_config.autonomy, | ||
| ); | ||
| crate::openhuman::channels::rebind_memory(&effective_config); |
There was a problem hiding this comment.
Rebind the channel tools, not only the context memory
When the channel runtime starts pre-login and the user signs in, this call only swaps ChannelRuntimeContext.memory_handle; it does not update the tool instances already captured in ctx.tools_registry. That registry is built once with Arc::clone(&mem) in channels/runtime/startup.rs, and the dispatch path keeps passing the same registry to agent.run_turn, while memory_store/memory_recall/memory_forget store that original Arc<dyn Memory> in their structs. As a result, after login the pre-turn memory context and autosave use the activated user's workspace, but any memory tool call from the same channel turn still reads/writes users/local, causing split-brain and possible cross-user memory writes; rebuild/swap the channel tool registry or make the memory tools dereference the same rebindable handle.
Useful? React with 👍 / 👎.
- rebind_in: make the RwLock write match a statement so the write guard temporary drops before the handle Arc (fixes E0597 borrow-lifetime error) - apply cargo fmt to touched channels/learning files
|
Fixed the Rust Quality (fmt, clippy) lane:
Verified locally: |
…signatures The tinyhumansai#4398 change migrated three constructors to shared handles but left three test call sites on the old signatures, breaking test compilation (caught by the Rust Core Coverage lane, which builds tests; plain clippy -p openhuman does not): - channels/routes_tests.rs: TelegramRemoteSubscriber::new now takes Arc<RwLock<PathBuf>> (workspace handle), not PathBuf. - channels/tests/memory.rs: ChannelRuntimeContext.memory renamed to memory_handle: Arc<RwLock<Arc<dyn Memory>>>. - tests/learning_phase4_integration_test.rs: ProfileMdRenderer::new now takes Arc<RwLock<PathBuf>> (workspace handle), not PathBuf. Values unchanged (wrapped in the new handle types); behaviour identical.
|
Follow-up: the Rust Core Coverage lane was red on a real test-compile break (not covered by my earlier fmt/clippy commit —
Fixed all three by wrapping the same values in the new handle types (behaviour unchanged). Verified locally green: |
|
Also ticked the PR Submission Checklist honestly now that CI has confirmed the Rust lanes: Diff coverage ≥ 80% is green via the passing |
|
Also ticked the PR Submission Checklist honestly now that CI confirmed the Rust lanes: Diff coverage ≥ 80% green via the passing |
Summary
credentials::ops::store_session, so a sign-in that happens after the core started (pre-login) stops routing writes tousers/local/.memory::global::init) instead of a boot-timeConfigsnapshot; re-pointed fromstore_session_inneralongside the existing memory/conversation rebind.live_policy), the channel memory store, and the Telegram busy-state + PROFILE.md subscribers.OnceLock -> RwLockrefactor — see## Related).Problem
When
openhuman-corestarts before any user has logged in, it opens its DBs/profiles under~/.openhuman/users/local/. A later sign-in over RPC callscredentials::ops::store_session, which writes~/.openhuman/active_user.tomland creates~/.openhuman/users/<user_id>/. But long-lived subsystems hold a cachedConfigsnapshot from startup and never re-resolve, so they keep reading/writing underusers/local/until a process restart.store_session_inneralready rebinds memory + conversation persistence (#2445); the scheduler and channel runtime did not.Split from the #2437 triage meta-issue (item E).
Solution
Each subsystem's workspace resolution is routed through a re-bindable holder and re-pointed from
store_session_innerright after the memory/conversation rebind. All calls are no-ops when the subsystem isn't running in-process.ACTIVE_CONFIGholder; the poll loop re-resolves config +SecurityPolicyevery tick, sodue_jobsreads the cron store underusers/<user_id>/. Newcron::scheduler::rebind(config).ChannelRuntimeContext.workspace_dir(bakedArc<PathBuf>) is now a sharedArc<RwLock<PathBuf>>handle read viaworkspace_dir();channels::rebind_workspace(path)swaps it in place. The Telegram busy-state subscriber and PROFILE.md writer read the same handle, so they re-resolve for free — this also fixes a stale-workspace event drop the base change would otherwise introduce (events are stamped with the current workspace, so a baked snapshot in the subscriber would drop them all post-login).live_policy::set_workspace_dir(ws, autonomy)rebuilds the policy viaSecurityPolicy::from_config(fresh canonical-path cache), so file-writing tools stay confined to the activated user's workspace. Reuses the existing hot-swapLiveState.ctx.memoryis now a swappable handle;channels::rebind_memory(config)rebuilds it via a sharedbuild_channel_memoryhelper (with the [Bug] Telegram and Discord messaging channels broken — show connected but messages not sending/receiving #3712 keyword-only fallback) so conversation auto-save + memory-context retrieval land in the right workspace. The swap is race-safe: an in-flight turn keeps the store it already cloned; new turns pick up the new one.Design note / tradeoff: the tool registry (
Configsnapshot) and the assembled system prompt bake workspace at boot; re-resolving those means reconstructing live objects mid-turn (race hazards). They are left for a follow-up and are lower-impact because agent tools already write memory through the already-reboundmemory::globalclient.Submission Checklist
rebind; workspace handle re-resolves + no-op when unregistered;live_policy::set_workspace_dirswaps workspace while preserving autonomy/action_dir; memory-rebind no-op guard.Rust Core Coverage (cargo-llvm-cov)lane passed on this PR, enforcing ≥ 80% coverage on the changed Rust lines (rebind seams unit-tested;store_session_innerwiring + memory-store rebuild exercised).N/A: behaviour-preserving workspace-resolution fix; no new feature rows## Related—N/A(no matrix rows affected)N/A: no release-cut surface changeCloses #NNNin the## RelatedsectionImpact
/status, and PROFILE.md all route tousers/<user_id>/instead ofusers/local/, without a restart.users/local/.SecurityPolicy::from_configper tick andctx.workspace_dir()/ctx.memory()take an uncontendedRwLockread per use — negligible.Related
AgentDefinitionRegistry) + tool-registryConfigsnapshot + baked system-prompt re-resolution — deferred; the registry needs anOnceLock<Registry>->OnceLock<RwLock<Registry>>refactor with ~8&'staticcaller updates. Parent triage: Triage request: 5 issues surfaced by remote-core attach-mode investigation (attach-mode silent, daemonlifecycle recurrence, configPersistence printer, cross-host path leak, active-user re-resolution) #2437 (item E). Precedent: fix: harden workspace routing and local-first gates #2445 (memory/conversation rebind).AI Authored PR Metadata (required for Codex/Linear PRs)
Linear Issue
Commit & Branch
fix/GH-4398-reresolve-config-after-store-sessionValidation Run
pnpm --filter openhuman-app format:check—N/A: Rust-only change; frontend (app/src) untouched (Frontend Checkscorrectly skipped).pnpm typecheck—N/A: no TypeScript changed.Rust Core Coveragelane builds + executes them);cargo check -p openhuman --testsalso passes locally (exit 0).cargo fmt --all -- --check(exit 0) andcargo clippy -p openhuman(exit 0); theRust Quality (fmt, clippy)CI lane passed.N/A: Tauri shell (app/src-tauri) untouched.Validation Blocked
command:cargo check/pnpm test:rusterror:local check/build matrix intentionally not run per this workflow's rulesimpact:build + tests validated by CI; change was covered by two independent static compile-reviews (no compile-blocking issues found)Behavior Changes
Parity Contract
Config; when no login occurs, behavior is identical to before. Boot-path memory construction (incl. [Bug] Telegram and Discord messaging channels broken — show connected but messages not sending/receiving #3712 keyword-only fallback) is unchanged — extracted verbatim into a shared helper.Duplicate / Superseded PR Handling