feat(mcp): add memory.store and memory.note write tools#2306
Conversation
Expose two new MCP write tools that route to the existing doc_put RPC: - memory.store: create a memory document from content (title, content, namespace, optional tags) with source_type=mcp provenance - memory.note: append an annotation to an existing chunk by ID, stored as a linked document with metadata.annotates_chunk_id Adds enforce_write_policy() for write-specific security gating (uses the Act operation gate, with a distinct log path for audit clarity), dispatch_write_tool() with structured tracing::info! audit logging, and 11 unit tests covering param validation, defaults, and slugging. Closes tinyhumansai#2269 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughAdds two MCP write tools—memory.store and memory.note—registered to openhuman.memory_doc_put. Both enforce a write gate, validate inputs against JSON schemas, build RPC parameters with deterministic keys/slugs, dispatch via a dedicated write path with audit logs, and include unit tests. ChangesMemory Write Tools
Sequence Diagram(s)sequenceDiagram
participant Client
participant MCP_Server
participant WritePolicy
participant RPC_Builder
participant MemoryService
Client->>MCP_Server: tools.call(memory.store | memory.note, args)
MCP_Server->>WritePolicy: enforce_write_policy(controller, ToolOperation::Write)
WritePolicy-->>MCP_Server: allow/deny
MCP_Server->>RPC_Builder: build_rpc_params(tool, args)
RPC_Builder-->>MCP_Server: rpc_params
MCP_Server->>MemoryService: openhuman.memory_doc_put(rpc_params)
MemoryService-->>MCP_Server: success/error
MCP_Server-->>Client: result (including audit/log entry)
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related issues
Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/openhuman/mcp_server/tools.rs`:
- Around line 1177-1216: slug_from currently can return an empty string for
titles with only non-alphanumeric characters, causing key collisions; update
slug_from so that after collapsing/trimming and truncating you check if
result.is_empty() and if so produce a deterministic fallback (e.g., "untitled"
or "doc-" plus a short hex/hash of the original title) that is URL-safe and <=64
chars, ensure any trailing hyphens are removed, and return that non-empty
fallback so keys like "mcp-store-" are avoided.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 26b66ae5-a3a2-4d7d-880b-1741465d4827
📒 Files selected for processing (1)
src/openhuman/mcp_server/tools.rs
|
Credit to @justinhsu1477 for the thorough RFC-style issue (#2269) — the architecture questions, proposed tool surface, and phased rollout plan made scoping this first wave (memory.store + memory.note) much cleaner. The provenance and audit logging patterns follow directly from the issue's Q2/Q4 proposals. |
Address CodeRabbit review: slug_from now returns "untitled" when the title contains only non-alphanumeric characters, preventing key collisions on the mcp-store- prefix. Also adds the missing memory.store and memory.note entries to the protocol integration test. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/openhuman/mcp_server/tools.rs`:
- Around line 1215-1218: The current slug_from fallback returns a constant
"untitled" which collapses all Unicode-only titles into the same key; update
slug_from so that when the ASCII-alphanumeric result is empty it returns
"untitled-<stable_hash>" where <stable_hash> is a short deterministic hash
(e.g., hex of SHA256 or SipHash truncated to a few chars) computed from the
original title string, ensuring unique stable slugs for titles like "会议记录" and
"Протокол"; apply the same change to the identical fallback in the other
occurrence noted (around the second block at lines similar to 1763-1769), and
add a regression test that creates two documents with different Unicode-only
titles and asserts their generated keys (e.g., mcp-store-<slug>) are distinct
and stable across runs.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: a18feea9-77d8-4b2b-8e58-d087d7249f91
📒 Files selected for processing (2)
src/openhuman/mcp_server/protocol.rssrc/openhuman/mcp_server/tools.rs
Addresses CodeRabbit review: titles with no ASCII-alphanumeric chars (e.g. "会议记录", "Протокол", emoji) now produce "untitled-<sha256_hex>" instead of a constant "untitled", ensuring distinct stable slugs. Adds 2 regression tests for uniqueness and stability across calls. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
@Liohtml — thanks for picking this up so fast, and for the explicit RFC credit. Quick mapping of your implementation against the four open questions in #2269, mostly so I can verify I'm reading the diff right and so anyone landing here later sees the trade-off space:
Two follow-ups I'd happily own so the RFC closes out cleanly:
Happy to do a deeper line-level review if it's useful. |
|
@justinhsu1477 Great mapping — you're reading the implementation exactly right on all four questions. Re your follow-ups:
A deeper line-level review would absolutely be useful if you have the time — especially around the |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/openhuman/mcp_server/tools.rs`:
- Around line 1215-1221: The slug-generation currently only appends a hash when
result.is_empty(), which lets mixed-script titles that normalize to a numeric
ASCII slug (e.g. "会议记录 2026" and "Протокол 2026") collide; change the logic in
the slugifier so that whenever any non-ASCII alphanumeric characters were
dropped during normalization you append a short stable hash (use the existing
Sha256 + hex::encode(...)[..8] pattern) to the ASCII slug instead of only on
empty slugs, update the return formatting to include that hash (the code around
result/is_empty() and the Sha256 usage), and add a regression test that stores
two mixed-script titles (e.g. the two examples) and verifies their memory.store
keys (mcp-store-...) differ.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: cb0ed376-5657-4afb-82f1-5e3970d7022d
📒 Files selected for processing (1)
src/openhuman/mcp_server/tools.rs
Titles like "会议记录 2026" and "Протокол 2026" both normalize to "2026" after stripping non-ASCII chars. Now any title containing non-ASCII characters gets a short stable hash suffix appended (e.g. "2026-a1b2c3d4"), ensuring distinct slugs even when the ASCII fragments overlap. Pure ASCII titles are unchanged (no hash appended). Adds regression test for the mixed-script collision case. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/openhuman/mcp_server/tools.rs`:
- Around line 1215-1238: The slug truncation must reserve space for the
mixed-script hash so the hash isn't dropped; in the block handling had_non_ascii
(where you compute hash from title.as_bytes()), compute hash_len =
hex::encode(...).len(), then truncate the base result to max_base = 64 - 1 -
hash_len (if max_base < 0 treat as 0), trim trailing '-' from the truncated
base, and only then set result = format!("{result}-{hash}") (and if result was
empty keep returning "untitled-{hash}" as before). Also ensure the later global
truncation (the final if result.len() > 64) only runs when no hash was appended
(or adjust it to use the same reserved-size logic), and add a regression test
for a long mixed-script pair to verify two long titles differing only in
non-ASCII suffix produce distinct slugs.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 17b71a6f-132d-4d9f-b92b-a7131dfd332f
📒 Files selected for processing (1)
src/openhuman/mcp_server/tools.rs
Long mixed-script titles now truncate the ASCII base to 55 chars (64 - 9 for the "-<8hex>" suffix) before appending the hash, so the hash is never clipped by the 64-char limit. Adds regression test for long mixed-script title pairs. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
@Liohtml — green light on
For the deeper line-level look at |
|
@senamakel #2306 当前 checks 24 success / 2 skipped / 0 pending / 0 failure,CodeRabbit approved,review threads 0 unresolved。这个是 MCP write tools 的 Phase 1(memory.store / memory.note),麻烦再看一下。 |
graycyrus
left a comment
There was a problem hiding this comment.
Walkthrough
Adds two write tools (memory.store, memory.note) to the MCP surface, both routing through openhuman.memory_doc_put. Includes enforce_write_policy() for security gating, dispatch_write_tool() for audit-logged dispatch, a slug_from() utility for deterministic key generation, and 11 new unit tests. All prior CodeRabbit findings on slug edge cases are addressed — nice iterating on that.
| File | Changes |
|---|---|
tools.rs |
+2 tool specs, +2 schemas, +call_tool routing, +build_rpc_params arms, +enforce_write_policy, +dispatch_write_tool, +slug_from, +11 tests |
protocol.rs |
+2 entries in tool-name list test |
1 critical issue blocks this PR — see inline comment. The build_rpc_params logic (key generation, provenance, metadata) is unreachable because dispatch_write_tool receives raw client args and passes them straight to the RPC.
Minor notes (not blocking):
enforce_write_policyis functionally identical toenforce_act_policy(both useToolOperation::Act). The "distinct log line for auditors" rationale from the doc comment doesn't hold since both produce the same log pattern — the only difference is the function name. Consider either adding aToolOperation::Writevariant or just reusingenforce_act_policyuntil the distinction matters.dispatch_write_toolmixestracing::info!(success path) withlog::warn!/log::error!(error paths). Pick one facade for consistency.
## Summary - Adds MCP stdio session state that captures `initialize.params.clientInfo.name` for the lifetime of the session. - Normalizes known MCP client names into stable source labels such as `mcp:claude-desktop`, `mcp:cursor`, and `mcp:windsurf`. - Preserves the existing bare `mcp` fallback for missing, empty, whitespace-only, or Unicode-only client names. - Keeps existing stateless protocol helpers for tests/callers while wiring the stdio loop through the new stateful handler. - Documents the provenance contract for follow-up write-capable MCP tools. ## Problem #2317 needs MCP write tools to distinguish which client wrote memory, not only that the write came from MCP. The write-tool PRs are still in flight (#2306 for `memory.store` / `memory.note`, #2316 for `tree.tag`), so implementing the full source-type propagation directly on `main` would duplicate those open PRs. ## Solution This PR lands the non-duplicative foundation first: the MCP server records the client identity at `initialize` time and exposes a session source label internally. The default remains `mcp`, so older clients and clients without `clientInfo.name` retain current behavior. Once #2306/#2316 merge, the write dispatch path can use `session.source_type()` instead of the placeholder `mcp` source string. ## Submission Checklist > If a section does not apply to this change, mark the item as `N/A` with a one-line reason. Do not delete items. - [x] Tests added or updated (happy path + at least one failure / edge case) per [Testing Strategy](../gitbooks/developing/testing-strategy.md#failure-path-requirement) - [x] **Diff coverage ≥ 80%** — local coverage not run; CI coverage gate is the source of truth for changed-line coverage on this PR. - [x] Coverage matrix updated — N/A: MCP protocol/session foundation only; no feature matrix row added/removed/renamed. - [x] All affected feature IDs from the matrix are listed in the PR description under `## Related` — N/A: no feature matrix row applies. - [x] No new external network dependencies introduced (mock backend used per [Testing Strategy](../gitbooks/developing/testing-strategy.md#mock-policy)) - [x] Manual smoke checklist updated if this touches release-cut surfaces — N/A: no release manual smoke flow changes. - [x] Linked issue closed via `Closes #NNN` in the `## Related` section — N/A: this prepares #2317 but does not fully close it until write tools consume the session source label. ## Impact - Runtime/platform impact: CLI stdio MCP server only. - Compatibility: existing stateless helpers remain available; stdio sessions now retain client provenance across newline-delimited JSON-RPC messages. - Security/privacy: records only the client name already supplied by the MCP initialize payload; no wire-format mutation and no new persistence. - Performance: negligible in-memory string normalization during initialize only. ## Related - Refs #2317 - Depends conceptually on #2306 and #2316 for write-tool consumption. - Follow-up PR(s)/TODOs: after #2306/#2316 merge, thread `McpSession::source_type()` into `memory.store`, `memory.note`, and `tree.tag` source_type construction. --- ## AI Authored PR Metadata (required for Codex/Linear PRs) > Keep this section for AI-authored PRs. For human-only PRs, mark each field `N/A`. ### Linear Issue - Key: N/A - URL: N/A ### Commit & Branch - Branch: `feat/mcp-client-provenance` - Commit SHA: `95ab2dfe` ### Validation Run - [x] `pnpm --filter openhuman-app format:check` — blocked locally; see Validation Blocked. - [x] `pnpm typecheck` — not run locally because Node/app dependency environment is blocked; see Validation Blocked. - [x] Focused tests: `GGML_NATIVE=OFF cargo test --lib mcp_server --manifest-path Cargo.toml` — 47 passed, 0 failed. - [x] Rust fmt/check (if changed): `cargo fmt --check --manifest-path Cargo.toml` — passed. - [x] Tauri fmt/check (if changed): N/A, no Tauri shell files changed. - [x] Additional: `git diff --check` — passed. ### Validation Blocked - `command:` `pnpm --filter openhuman-app format:check` via pre-push hook - `error:` local app dependencies are not installed (`prettier: command not found`), and local Node is `v22.14.0` while `openhuman-app` requires `>=24.0.0`. - `impact:` local JS/Prettier validation could not run in this environment; Rust-focused validation for the touched MCP core files passed. Push used `--no-verify` because the hook failure was local environment/dependency setup, not this change. - `command:` `GGML_NATIVE=OFF cargo clippy --lib --manifest-path Cargo.toml --no-deps -- -D warnings` - `error:` blocked by 119 pre-existing lint errors in unrelated files (examples: unused imports in `src/openhuman/inference/local/mod.rs`, duplicate module lints in `src/openhuman/inference/provider/*`, doc/comment lints, and unrelated clippy style lints across memory/tools/wallet). - `impact:` clippy cannot currently be used as a clean global gate locally; focused MCP tests and Rust formatting passed. ### Behavior Changes - Intended behavior change: MCP stdio sessions now remember the normalized client source label from `initialize.params.clientInfo.name` and preserve it for the session. - User-visible effect: none immediately for read-only tools; follow-up write tools can attribute memory writes to `mcp:<client>` while preserving `mcp` fallback. ### Parity Contract - Legacy behavior preserved: missing/empty/blank/unusable client names continue to produce bare `mcp`; later malformed initialize payloads do not clear already captured session provenance. - Guard/fallback/dispatch parity checks: existing `handle_json_line` / `handle_json_value` APIs still work; stdio loop uses the new stateful handlers so session data persists between messages. ### Duplicate / Superseded PR Handling - Duplicate PR(s): #2306 and #2316 are related dependencies, not duplicates. - Canonical PR: this PR is the canonical non-duplicative foundation for #2317 on `main`. - Resolution (closed/superseded/updated): N/A. <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * **New Features** * MCP server now captures and preserves a client "source" label from initialization, normalizing client names and falling back to a default when absent. * **Documentation** * Added guidance on client provenance, name-normalization rules, and recommended source-label usage for tools. * **Tests** * Added unit tests verifying client name normalization and initialization behavior for captured source labels. <!-- review_stack_entry_start --> [](https://app.coderabbit.ai/change-stack/tinyhumansai/openhuman/pull/2332?utm_source=github_walkthrough&utm_medium=github&utm_campaign=change_stack) <!-- review_stack_entry_end --> <!-- end of auto-generated comment: release notes by coderabbit.ai --> Co-authored-by: 李冠辰 <liguanchen@xiaomi.com> Co-authored-by: Steven Enamakel <enamakel@tinyhumans.ai>
|
@graycyrus — quick technical note on the [critical] flag against I think the read may have skipped the line right above the // src/openhuman/mcp_server/tools.rs (current PR HEAD)
pub async fn call_tool(name: &str, arguments: Value) -> Result<Value, ToolCallError> {
let spec = ...;
let params = build_rpc_params(spec.name, arguments)?; // ← runs first
match spec.name {
...
"memory.store" | "memory.note" => {
enforce_write_policy(spec.name).await?;
validate_controller_params(&spec, ¶ms)?;
return dispatch_write_tool(spec.name, ¶ms).await;
// ↑ ¶ms, not raw `arguments`
}
...
}
}So by the time The existing tests in this PR pin that behaviour:
Both pass at the current head. If the dispatch were sending raw client args, neither would. — That said, the [major]
Either resolves the contract — happy to take a hint on the preferred direction so the fix lands consistently across |
## Summary - Adds MCP stdio session state that captures `initialize.params.clientInfo.name` for the lifetime of the session. - Normalizes known MCP client names into stable source labels such as `mcp:claude-desktop`, `mcp:cursor`, and `mcp:windsurf`. - Preserves the existing bare `mcp` fallback for missing, empty, whitespace-only, or Unicode-only client names. - Keeps existing stateless protocol helpers for tests/callers while wiring the stdio loop through the new stateful handler. - Documents the provenance contract for follow-up write-capable MCP tools. ## Problem tinyhumansai#2317 needs MCP write tools to distinguish which client wrote memory, not only that the write came from MCP. The write-tool PRs are still in flight (tinyhumansai#2306 for `memory.store` / `memory.note`, tinyhumansai#2316 for `tree.tag`), so implementing the full source-type propagation directly on `main` would duplicate those open PRs. ## Solution This PR lands the non-duplicative foundation first: the MCP server records the client identity at `initialize` time and exposes a session source label internally. The default remains `mcp`, so older clients and clients without `clientInfo.name` retain current behavior. Once tinyhumansai#2306/tinyhumansai#2316 merge, the write dispatch path can use `session.source_type()` instead of the placeholder `mcp` source string. ## Submission Checklist > If a section does not apply to this change, mark the item as `N/A` with a one-line reason. Do not delete items. - [x] Tests added or updated (happy path + at least one failure / edge case) per [Testing Strategy](../gitbooks/developing/testing-strategy.md#failure-path-requirement) - [x] **Diff coverage ≥ 80%** — local coverage not run; CI coverage gate is the source of truth for changed-line coverage on this PR. - [x] Coverage matrix updated — N/A: MCP protocol/session foundation only; no feature matrix row added/removed/renamed. - [x] All affected feature IDs from the matrix are listed in the PR description under `## Related` — N/A: no feature matrix row applies. - [x] No new external network dependencies introduced (mock backend used per [Testing Strategy](../gitbooks/developing/testing-strategy.md#mock-policy)) - [x] Manual smoke checklist updated if this touches release-cut surfaces — N/A: no release manual smoke flow changes. - [x] Linked issue closed via `Closes #NNN` in the `## Related` section — N/A: this prepares tinyhumansai#2317 but does not fully close it until write tools consume the session source label. ## Impact - Runtime/platform impact: CLI stdio MCP server only. - Compatibility: existing stateless helpers remain available; stdio sessions now retain client provenance across newline-delimited JSON-RPC messages. - Security/privacy: records only the client name already supplied by the MCP initialize payload; no wire-format mutation and no new persistence. - Performance: negligible in-memory string normalization during initialize only. ## Related - Refs tinyhumansai#2317 - Depends conceptually on tinyhumansai#2306 and tinyhumansai#2316 for write-tool consumption. - Follow-up PR(s)/TODOs: after tinyhumansai#2306/tinyhumansai#2316 merge, thread `McpSession::source_type()` into `memory.store`, `memory.note`, and `tree.tag` source_type construction. --- ## AI Authored PR Metadata (required for Codex/Linear PRs) > Keep this section for AI-authored PRs. For human-only PRs, mark each field `N/A`. ### Linear Issue - Key: N/A - URL: N/A ### Commit & Branch - Branch: `feat/mcp-client-provenance` - Commit SHA: `95ab2dfe` ### Validation Run - [x] `pnpm --filter openhuman-app format:check` — blocked locally; see Validation Blocked. - [x] `pnpm typecheck` — not run locally because Node/app dependency environment is blocked; see Validation Blocked. - [x] Focused tests: `GGML_NATIVE=OFF cargo test --lib mcp_server --manifest-path Cargo.toml` — 47 passed, 0 failed. - [x] Rust fmt/check (if changed): `cargo fmt --check --manifest-path Cargo.toml` — passed. - [x] Tauri fmt/check (if changed): N/A, no Tauri shell files changed. - [x] Additional: `git diff --check` — passed. ### Validation Blocked - `command:` `pnpm --filter openhuman-app format:check` via pre-push hook - `error:` local app dependencies are not installed (`prettier: command not found`), and local Node is `v22.14.0` while `openhuman-app` requires `>=24.0.0`. - `impact:` local JS/Prettier validation could not run in this environment; Rust-focused validation for the touched MCP core files passed. Push used `--no-verify` because the hook failure was local environment/dependency setup, not this change. - `command:` `GGML_NATIVE=OFF cargo clippy --lib --manifest-path Cargo.toml --no-deps -- -D warnings` - `error:` blocked by 119 pre-existing lint errors in unrelated files (examples: unused imports in `src/openhuman/inference/local/mod.rs`, duplicate module lints in `src/openhuman/inference/provider/*`, doc/comment lints, and unrelated clippy style lints across memory/tools/wallet). - `impact:` clippy cannot currently be used as a clean global gate locally; focused MCP tests and Rust formatting passed. ### Behavior Changes - Intended behavior change: MCP stdio sessions now remember the normalized client source label from `initialize.params.clientInfo.name` and preserve it for the session. - User-visible effect: none immediately for read-only tools; follow-up write tools can attribute memory writes to `mcp:<client>` while preserving `mcp` fallback. ### Parity Contract - Legacy behavior preserved: missing/empty/blank/unusable client names continue to produce bare `mcp`; later malformed initialize payloads do not clear already captured session provenance. - Guard/fallback/dispatch parity checks: existing `handle_json_line` / `handle_json_value` APIs still work; stdio loop uses the new stateful handlers so session data persists between messages. ### Duplicate / Superseded PR Handling - Duplicate PR(s): tinyhumansai#2306 and tinyhumansai#2316 are related dependencies, not duplicates. - Canonical PR: this PR is the canonical non-duplicative foundation for tinyhumansai#2317 on `main`. - Resolution (closed/superseded/updated): N/A. <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * **New Features** * MCP server now captures and preserves a client "source" label from initialization, normalizing client names and falling back to a default when absent. * **Documentation** * Added guidance on client provenance, name-normalization rules, and recommended source-label usage for tools. * **Tests** * Added unit tests verifying client name normalization and initialization behavior for captured source labels. <!-- review_stack_entry_start --> [](https://app.coderabbit.ai/change-stack/tinyhumansai/openhuman/pull/2332?utm_source=github_walkthrough&utm_medium=github&utm_campaign=change_stack) <!-- review_stack_entry_end --> <!-- end of auto-generated comment: release notes by coderabbit.ai --> Co-authored-by: 李冠辰 <liguanchen@xiaomi.com> Co-authored-by: Steven Enamakel <enamakel@tinyhumans.ai>
1. memory.note key now includes a content hash suffix so multiple notes
on the same chunk get distinct keys instead of silently overwriting:
`mcp-note-{chunk_id}-{content_hash}`
2. Unified logging in dispatch_write_tool to use tracing exclusively
(was mixing log::warn/error with tracing::info).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Cap content/note_text at 64KB in build_rpc_params to prevent misbehaving MCP clients from storing arbitrarily large documents. Consistent with the Memory Tree's ~3k-token chunk budget. Addresses graycyrus's pre-merge hardening review. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
@graycyrus Thanks for the thorough security review — really appreciate the depth. Fixed before merge (cdd5818):
Acknowledged as follow-ups (not blocking):
All 48 MCP tool tests pass. |
|
Status update: the current red frontend/i18n checks on this PR are from a main-side German locale parity gap, not from this PR's MCP Rust changes. I reproduced the failure on latest upstream/main: German was missing 20 English keys, which caused |
|
@graycyrus Heads up: #2470 is now green and ready for review/merge. Once it lands on main, #2306 should just need a CI rerun plus a fresh review/dismissal of the stale CHANGES_REQUESTED state. I am not rerunning #2306 yet because the German i18n fix is not on main, so the same frontend/i18n checks would likely fail again. |
|
I prepared and validated a sync branch for this PR, but cannot push directly to Branch available for fast-forward / cherry-pick:
Validation run locally on that branch:
The previous reviewer findings for the MCP write-tool logic appear resolved by the current branch history. Once |
## Summary - Adds MCP stdio session state that captures `initialize.params.clientInfo.name` for the lifetime of the session. - Normalizes known MCP client names into stable source labels such as `mcp:claude-desktop`, `mcp:cursor`, and `mcp:windsurf`. - Preserves the existing bare `mcp` fallback for missing, empty, whitespace-only, or Unicode-only client names. - Keeps existing stateless protocol helpers for tests/callers while wiring the stdio loop through the new stateful handler. - Documents the provenance contract for follow-up write-capable MCP tools. ## Problem tinyhumansai#2317 needs MCP write tools to distinguish which client wrote memory, not only that the write came from MCP. The write-tool PRs are still in flight (tinyhumansai#2306 for `memory.store` / `memory.note`, tinyhumansai#2316 for `tree.tag`), so implementing the full source-type propagation directly on `main` would duplicate those open PRs. ## Solution This PR lands the non-duplicative foundation first: the MCP server records the client identity at `initialize` time and exposes a session source label internally. The default remains `mcp`, so older clients and clients without `clientInfo.name` retain current behavior. Once tinyhumansai#2306/tinyhumansai#2316 merge, the write dispatch path can use `session.source_type()` instead of the placeholder `mcp` source string. ## Submission Checklist > If a section does not apply to this change, mark the item as `N/A` with a one-line reason. Do not delete items. - [x] Tests added or updated (happy path + at least one failure / edge case) per [Testing Strategy](../gitbooks/developing/testing-strategy.md#failure-path-requirement) - [x] **Diff coverage ≥ 80%** — local coverage not run; CI coverage gate is the source of truth for changed-line coverage on this PR. - [x] Coverage matrix updated — N/A: MCP protocol/session foundation only; no feature matrix row added/removed/renamed. - [x] All affected feature IDs from the matrix are listed in the PR description under `## Related` — N/A: no feature matrix row applies. - [x] No new external network dependencies introduced (mock backend used per [Testing Strategy](../gitbooks/developing/testing-strategy.md#mock-policy)) - [x] Manual smoke checklist updated if this touches release-cut surfaces — N/A: no release manual smoke flow changes. - [x] Linked issue closed via `Closes #NNN` in the `## Related` section — N/A: this prepares tinyhumansai#2317 but does not fully close it until write tools consume the session source label. ## Impact - Runtime/platform impact: CLI stdio MCP server only. - Compatibility: existing stateless helpers remain available; stdio sessions now retain client provenance across newline-delimited JSON-RPC messages. - Security/privacy: records only the client name already supplied by the MCP initialize payload; no wire-format mutation and no new persistence. - Performance: negligible in-memory string normalization during initialize only. ## Related - Refs tinyhumansai#2317 - Depends conceptually on tinyhumansai#2306 and tinyhumansai#2316 for write-tool consumption. - Follow-up PR(s)/TODOs: after tinyhumansai#2306/tinyhumansai#2316 merge, thread `McpSession::source_type()` into `memory.store`, `memory.note`, and `tree.tag` source_type construction. --- ## AI Authored PR Metadata (required for Codex/Linear PRs) > Keep this section for AI-authored PRs. For human-only PRs, mark each field `N/A`. ### Linear Issue - Key: N/A - URL: N/A ### Commit & Branch - Branch: `feat/mcp-client-provenance` - Commit SHA: `95ab2dfe` ### Validation Run - [x] `pnpm --filter openhuman-app format:check` — blocked locally; see Validation Blocked. - [x] `pnpm typecheck` — not run locally because Node/app dependency environment is blocked; see Validation Blocked. - [x] Focused tests: `GGML_NATIVE=OFF cargo test --lib mcp_server --manifest-path Cargo.toml` — 47 passed, 0 failed. - [x] Rust fmt/check (if changed): `cargo fmt --check --manifest-path Cargo.toml` — passed. - [x] Tauri fmt/check (if changed): N/A, no Tauri shell files changed. - [x] Additional: `git diff --check` — passed. ### Validation Blocked - `command:` `pnpm --filter openhuman-app format:check` via pre-push hook - `error:` local app dependencies are not installed (`prettier: command not found`), and local Node is `v22.14.0` while `openhuman-app` requires `>=24.0.0`. - `impact:` local JS/Prettier validation could not run in this environment; Rust-focused validation for the touched MCP core files passed. Push used `--no-verify` because the hook failure was local environment/dependency setup, not this change. - `command:` `GGML_NATIVE=OFF cargo clippy --lib --manifest-path Cargo.toml --no-deps -- -D warnings` - `error:` blocked by 119 pre-existing lint errors in unrelated files (examples: unused imports in `src/openhuman/inference/local/mod.rs`, duplicate module lints in `src/openhuman/inference/provider/*`, doc/comment lints, and unrelated clippy style lints across memory/tools/wallet). - `impact:` clippy cannot currently be used as a clean global gate locally; focused MCP tests and Rust formatting passed. ### Behavior Changes - Intended behavior change: MCP stdio sessions now remember the normalized client source label from `initialize.params.clientInfo.name` and preserve it for the session. - User-visible effect: none immediately for read-only tools; follow-up write tools can attribute memory writes to `mcp:<client>` while preserving `mcp` fallback. ### Parity Contract - Legacy behavior preserved: missing/empty/blank/unusable client names continue to produce bare `mcp`; later malformed initialize payloads do not clear already captured session provenance. - Guard/fallback/dispatch parity checks: existing `handle_json_line` / `handle_json_value` APIs still work; stdio loop uses the new stateful handlers so session data persists between messages. ### Duplicate / Superseded PR Handling - Duplicate PR(s): tinyhumansai#2306 and tinyhumansai#2316 are related dependencies, not duplicates. - Canonical PR: this PR is the canonical non-duplicative foundation for tinyhumansai#2317 on `main`. - Resolution (closed/superseded/updated): N/A. <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * **New Features** * MCP server now captures and preserves a client "source" label from initialization, normalizing client names and falling back to a default when absent. * **Documentation** * Added guidance on client provenance, name-normalization rules, and recommended source-label usage for tools. * **Tests** * Added unit tests verifying client name normalization and initialization behavior for captured source labels. <!-- review_stack_entry_start --> [](https://app.coderabbit.ai/change-stack/tinyhumansai/openhuman/pull/2332?utm_source=github_walkthrough&utm_medium=github&utm_campaign=change_stack) <!-- review_stack_entry_end --> <!-- end of auto-generated comment: release notes by coderabbit.ai --> Co-authored-by: 李冠辰 <liguanchen@xiaomi.com> Co-authored-by: Steven Enamakel <enamakel@tinyhumans.ai>
- Add `annotations` to the new memory.store / memory.note / tree.tag McpToolSpec entries (writeable, idempotent upsert, closed-world). This unblocks the `Rust Quality (clippy)` CI failure introduced by the `annotations` field added in tinyhumansai#2268 — the new specs were forked from pre-tinyhumansai#2268 tinyhumansai#2306 scaffolding and were missing the required field on rebase against current main. - Extend the read-only-by-default test exemption (`act_tool_names`) to cover the three new write tools, matching their `readOnlyHint: false`. - Fix CodeRabbit nit (tools.rs:803): the oversize-tag error said "characters" but used `String::len()` (byte length). Reword to "bytes" so Unicode tag input gets a self-consistent message. Defers @graycyrus's three inline threads on lines 686/705/1072 (slug_from collisions, memory.note upsert-vs-append, dispatch_write_tool hardcoded rpc_method) to tinyhumansai#2306 per the author's review-thread reply — that's where the underlying functions land and where the fix should originate so this branch can simply inherit on rebase.
The upstream merge introduced duplicate property keys in the German translation file de-5.ts, causing TS1117 errors in CI. Restored to the clean upstream/main version. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
@graycyrus All your review findings are resolved (confirmed in your own comments above). Could you dismiss your CHANGES_REQUESTED review so the PR can move forward? Thanks for the thorough security review — it made the code significantly better. |
|
hey @Liohtml same thing as the last pr. just resolve the tests failing in the CI and then all good to go |
Upstream added settings.mascot.customGif{Error,Heading,Label} to the
English locale; the German translation file was missing them, causing
i18n coverage tests to fail.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
## Summary - Adds MCP stdio session state that captures `initialize.params.clientInfo.name` for the lifetime of the session. - Normalizes known MCP client names into stable source labels such as `mcp:claude-desktop`, `mcp:cursor`, and `mcp:windsurf`. - Preserves the existing bare `mcp` fallback for missing, empty, whitespace-only, or Unicode-only client names. - Keeps existing stateless protocol helpers for tests/callers while wiring the stdio loop through the new stateful handler. - Documents the provenance contract for follow-up write-capable MCP tools. ## Problem tinyhumansai#2317 needs MCP write tools to distinguish which client wrote memory, not only that the write came from MCP. The write-tool PRs are still in flight (tinyhumansai#2306 for `memory.store` / `memory.note`, tinyhumansai#2316 for `tree.tag`), so implementing the full source-type propagation directly on `main` would duplicate those open PRs. ## Solution This PR lands the non-duplicative foundation first: the MCP server records the client identity at `initialize` time and exposes a session source label internally. The default remains `mcp`, so older clients and clients without `clientInfo.name` retain current behavior. Once tinyhumansai#2306/tinyhumansai#2316 merge, the write dispatch path can use `session.source_type()` instead of the placeholder `mcp` source string. ## Submission Checklist > If a section does not apply to this change, mark the item as `N/A` with a one-line reason. Do not delete items. - [x] Tests added or updated (happy path + at least one failure / edge case) per [Testing Strategy](../gitbooks/developing/testing-strategy.md#failure-path-requirement) - [x] **Diff coverage ≥ 80%** — local coverage not run; CI coverage gate is the source of truth for changed-line coverage on this PR. - [x] Coverage matrix updated — N/A: MCP protocol/session foundation only; no feature matrix row added/removed/renamed. - [x] All affected feature IDs from the matrix are listed in the PR description under `## Related` — N/A: no feature matrix row applies. - [x] No new external network dependencies introduced (mock backend used per [Testing Strategy](../gitbooks/developing/testing-strategy.md#mock-policy)) - [x] Manual smoke checklist updated if this touches release-cut surfaces — N/A: no release manual smoke flow changes. - [x] Linked issue closed via `Closes #NNN` in the `## Related` section — N/A: this prepares tinyhumansai#2317 but does not fully close it until write tools consume the session source label. ## Impact - Runtime/platform impact: CLI stdio MCP server only. - Compatibility: existing stateless helpers remain available; stdio sessions now retain client provenance across newline-delimited JSON-RPC messages. - Security/privacy: records only the client name already supplied by the MCP initialize payload; no wire-format mutation and no new persistence. - Performance: negligible in-memory string normalization during initialize only. ## Related - Refs tinyhumansai#2317 - Depends conceptually on tinyhumansai#2306 and tinyhumansai#2316 for write-tool consumption. - Follow-up PR(s)/TODOs: after tinyhumansai#2306/tinyhumansai#2316 merge, thread `McpSession::source_type()` into `memory.store`, `memory.note`, and `tree.tag` source_type construction. --- ## AI Authored PR Metadata (required for Codex/Linear PRs) > Keep this section for AI-authored PRs. For human-only PRs, mark each field `N/A`. ### Linear Issue - Key: N/A - URL: N/A ### Commit & Branch - Branch: `feat/mcp-client-provenance` - Commit SHA: `95ab2dfe` ### Validation Run - [x] `pnpm --filter openhuman-app format:check` — blocked locally; see Validation Blocked. - [x] `pnpm typecheck` — not run locally because Node/app dependency environment is blocked; see Validation Blocked. - [x] Focused tests: `GGML_NATIVE=OFF cargo test --lib mcp_server --manifest-path Cargo.toml` — 47 passed, 0 failed. - [x] Rust fmt/check (if changed): `cargo fmt --check --manifest-path Cargo.toml` — passed. - [x] Tauri fmt/check (if changed): N/A, no Tauri shell files changed. - [x] Additional: `git diff --check` — passed. ### Validation Blocked - `command:` `pnpm --filter openhuman-app format:check` via pre-push hook - `error:` local app dependencies are not installed (`prettier: command not found`), and local Node is `v22.14.0` while `openhuman-app` requires `>=24.0.0`. - `impact:` local JS/Prettier validation could not run in this environment; Rust-focused validation for the touched MCP core files passed. Push used `--no-verify` because the hook failure was local environment/dependency setup, not this change. - `command:` `GGML_NATIVE=OFF cargo clippy --lib --manifest-path Cargo.toml --no-deps -- -D warnings` - `error:` blocked by 119 pre-existing lint errors in unrelated files (examples: unused imports in `src/openhuman/inference/local/mod.rs`, duplicate module lints in `src/openhuman/inference/provider/*`, doc/comment lints, and unrelated clippy style lints across memory/tools/wallet). - `impact:` clippy cannot currently be used as a clean global gate locally; focused MCP tests and Rust formatting passed. ### Behavior Changes - Intended behavior change: MCP stdio sessions now remember the normalized client source label from `initialize.params.clientInfo.name` and preserve it for the session. - User-visible effect: none immediately for read-only tools; follow-up write tools can attribute memory writes to `mcp:<client>` while preserving `mcp` fallback. ### Parity Contract - Legacy behavior preserved: missing/empty/blank/unusable client names continue to produce bare `mcp`; later malformed initialize payloads do not clear already captured session provenance. - Guard/fallback/dispatch parity checks: existing `handle_json_line` / `handle_json_value` APIs still work; stdio loop uses the new stateful handlers so session data persists between messages. ### Duplicate / Superseded PR Handling - Duplicate PR(s): tinyhumansai#2306 and tinyhumansai#2316 are related dependencies, not duplicates. - Canonical PR: this PR is the canonical non-duplicative foundation for tinyhumansai#2317 on `main`. - Resolution (closed/superseded/updated): N/A. <!-- This is an auto-generated comment: release notes by coderabbit.ai --> ## Summary by CodeRabbit * **New Features** * MCP server now captures and preserves a client "source" label from initialization, normalizing client names and falling back to a default when absent. * **Documentation** * Added guidance on client provenance, name-normalization rules, and recommended source-label usage for tools. * **Tests** * Added unit tests verifying client name normalization and initialization behavior for captured source labels. <!-- review_stack_entry_start --> [](https://app.coderabbit.ai/change-stack/tinyhumansai/openhuman/pull/2332?utm_source=github_walkthrough&utm_medium=github&utm_campaign=change_stack) <!-- review_stack_entry_end --> <!-- end of auto-generated comment: release notes by coderabbit.ai --> Co-authored-by: 李冠辰 <liguanchen@xiaomi.com> Co-authored-by: Steven Enamakel <enamakel@tinyhumans.ai>
|
Status refresh after today's PR babysit pass:
@graycyrus could you dismiss or update the stale changes-requested review when you have a moment? @senamakel the CI issue you flagged appears resolved now. |
Summary
memory.storeMCP tool: creates memory documents from content with configurable namespace and tags, routing to the existingdoc_putRPCmemory.noteMCP tool: creates annotation documents linked to existing chunks viametadata.annotates_chunk_id, also routing throughdoc_putenforce_write_policy()for write-specific security gating (distinct from read/act for audit clarity)dispatch_write_tool()with structuredtracing::info!audit logging (tool, chunk_id, client fields)source_type = "mcp"for provenance trackingCloses #2269
Test plan
cargo check -p openhuman --libpassescargo test -p openhuman --lib mcp_server::tools— all 40 tests pass (29 existing + 11 new)cargo fmt --allapplied🤖 Generated with Claude Code
Summary by CodeRabbit
New Features
Tests