Skip to content

feat(mcp): add memory.store and memory.note write tools#2306

Merged
senamakel merged 14 commits into
tinyhumansai:mainfrom
Liohtml:feat/mcp-write-tools
May 25, 2026
Merged

feat(mcp): add memory.store and memory.note write tools#2306
senamakel merged 14 commits into
tinyhumansai:mainfrom
Liohtml:feat/mcp-write-tools

Conversation

@Liohtml
Copy link
Copy Markdown
Contributor

@Liohtml Liohtml commented May 20, 2026

Summary

  • Add memory.store MCP tool: creates memory documents from content with configurable namespace and tags, routing to the existing doc_put RPC
  • Add memory.note MCP tool: creates annotation documents linked to existing chunks via metadata.annotates_chunk_id, also routing through doc_put
  • Add enforce_write_policy() for write-specific security gating (distinct from read/act for audit clarity)
  • Add dispatch_write_tool() with structured tracing::info! audit logging (tool, chunk_id, client fields)
  • Both tools set source_type = "mcp" for provenance tracking
  • 11 new unit tests covering param validation, defaults, slug generation, and error cases

Closes #2269

Test plan

  • cargo check -p openhuman --lib passes
  • cargo test -p openhuman --lib mcp_server::tools — all 40 tests pass (29 existing + 11 new)
  • cargo fmt --all applied
  • CI checks pass on PR

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Added memory.store to save memories (title + content) with automatic namespace, deterministic key generation and tagging support
    • Added memory.note to attach notes to specific memory chunks with automatic key generation and linked metadata and content structure
  • Tests

    • Expanded unit tests covering the new memory tools, argument validation (required fields and unknown-field rejection), defaults, and metadata/content construction

Review Change Stack

Expose two new MCP write tools that route to the existing doc_put RPC:

- memory.store: create a memory document from content (title, content,
  namespace, optional tags) with source_type=mcp provenance
- memory.note: append an annotation to an existing chunk by ID, stored
  as a linked document with metadata.annotates_chunk_id

Adds enforce_write_policy() for write-specific security gating (uses
the Act operation gate, with a distinct log path for audit clarity),
dispatch_write_tool() with structured tracing::info! audit logging,
and 11 unit tests covering param validation, defaults, and slugging.

Closes tinyhumansai#2269

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@Liohtml Liohtml requested a review from a team May 20, 2026 08:05
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 20, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Adds two MCP write tools—memory.store and memory.note—registered to openhuman.memory_doc_put. Both enforce a write gate, validate inputs against JSON schemas, build RPC parameters with deterministic keys/slugs, dispatch via a dedicated write path with audit logs, and include unit tests.

Changes

Memory Write Tools

Layer / File(s) Summary
Tool declarations and schemas
src/openhuman/mcp_server/tools.rs
Defines allowed-argument keys and registers memory.store / memory.note in the base tool catalog mapped to openhuman.memory_doc_put; adds JSON schemas that require specific fields and disallow extra properties.
Tool dispatch routing
src/openhuman/mcp_server/tools.rs
call_tool now recognizes the new tools, enforces write-policy (Act), validates controller params, and routes to a dedicated write dispatch helper.
RPC parameter construction
src/openhuman/mcp_server/tools.rs
build_rpc_params adds branches for both tools: required-field validation, namespace defaulting for memory.store, deterministic key generation, tags forwarding, note document composition, and metadata.annotates_chunk_id for notes.
Write enforcement and dispatch
src/openhuman/mcp_server/tools.rs
Adds enforce_write_policy to gate write operations and dispatch_write_tool to invoke openhuman.memory_doc_put, map outcomes, and emit structured audit logs.
Slug generation helper
src/openhuman/mcp_server/tools.rs
slug_from normalizes titles into deterministic, URL-safe slugs with lowercasing, non-alphanumeric→hyphen conversion, hyphen collapsing, truncation, and hash-based fallback for non-ASCII/empty inputs.
Tests / protocol expectations
src/openhuman/mcp_server/protocol.rs, src/openhuman/mcp_server/tools.rs
Updates tools/list expectation to include the new tools; adds tests for RPC-param construction/validation, namespace/tag handling, unknown-argument rejection, constructed document fields, and slug_from behavior.

Sequence Diagram(s)

sequenceDiagram
  participant Client
  participant MCP_Server
  participant WritePolicy
  participant RPC_Builder
  participant MemoryService
  Client->>MCP_Server: tools.call(memory.store | memory.note, args)
  MCP_Server->>WritePolicy: enforce_write_policy(controller, ToolOperation::Write)
  WritePolicy-->>MCP_Server: allow/deny
  MCP_Server->>RPC_Builder: build_rpc_params(tool, args)
  RPC_Builder-->>MCP_Server: rpc_params
  MCP_Server->>MemoryService: openhuman.memory_doc_put(rpc_params)
  MemoryService-->>MCP_Server: success/error
  MCP_Server-->>Client: result (including audit/log entry)
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related issues

Possibly related PRs

  • tinyhumansai/openhuman#1760: Extends the same MCP stdio tool-discovery/call plumbing and is a likely predecessor to these write-tool changes.

Suggested reviewers

  • senamakel

Poem

A rabbit hops through code and logs,
Two tools to store and note the thoughts.
Slugs march tidy, hashes keep the pace,
Memory finds a stable place. 🐇✨

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The PR title clearly and specifically summarizes the main change: adding two new MCP write tools (memory.store and memory.note) to the codebase.
Linked Issues check ✅ Passed The PR implements the primary objectives from issue #2269 Phase 1: adds memory.store and memory.note write tools with provenance (source_type=mcp), security gating (enforce_write_policy), and audit logging (dispatch_write_tool). Defers tree.tag and destructive operations as planned.
Out of Scope Changes check ✅ Passed All changes are directly scoped to the linked issue objectives: memory.store and memory.note tools, write-path scaffolding, slug generation, parameter validation, and unit tests. No unrelated modifications detected.
Docstring Coverage ✅ Passed Docstring coverage is 90.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot added the working A PR that is being worked on by the team. label May 20, 2026
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/openhuman/mcp_server/tools.rs`:
- Around line 1177-1216: slug_from currently can return an empty string for
titles with only non-alphanumeric characters, causing key collisions; update
slug_from so that after collapsing/trimming and truncating you check if
result.is_empty() and if so produce a deterministic fallback (e.g., "untitled"
or "doc-" plus a short hex/hash of the original title) that is URL-safe and <=64
chars, ensure any trailing hyphens are removed, and return that non-empty
fallback so keys like "mcp-store-" are avoided.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 26b66ae5-a3a2-4d7d-880b-1741465d4827

📥 Commits

Reviewing files that changed from the base of the PR and between 65d92bf and bbe7ab7.

📒 Files selected for processing (1)
  • src/openhuman/mcp_server/tools.rs

Comment thread src/openhuman/mcp_server/tools.rs
@Liohtml
Copy link
Copy Markdown
Contributor Author

Liohtml commented May 20, 2026

Credit to @justinhsu1477 for the thorough RFC-style issue (#2269) — the architecture questions, proposed tool surface, and phased rollout plan made scoping this first wave (memory.store + memory.note) much cleaner. The provenance and audit logging patterns follow directly from the issue's Q2/Q4 proposals.

Address CodeRabbit review: slug_from now returns "untitled" when the
title contains only non-alphanumeric characters, preventing key
collisions on the mcp-store- prefix. Also adds the missing
memory.store and memory.note entries to the protocol integration test.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/openhuman/mcp_server/tools.rs`:
- Around line 1215-1218: The current slug_from fallback returns a constant
"untitled" which collapses all Unicode-only titles into the same key; update
slug_from so that when the ASCII-alphanumeric result is empty it returns
"untitled-<stable_hash>" where <stable_hash> is a short deterministic hash
(e.g., hex of SHA256 or SipHash truncated to a few chars) computed from the
original title string, ensuring unique stable slugs for titles like "会议记录" and
"Протокол"; apply the same change to the identical fallback in the other
occurrence noted (around the second block at lines similar to 1763-1769), and
add a regression test that creates two documents with different Unicode-only
titles and asserts their generated keys (e.g., mcp-store-<slug>) are distinct
and stable across runs.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: a18feea9-77d8-4b2b-8e58-d087d7249f91

📥 Commits

Reviewing files that changed from the base of the PR and between bbe7ab7 and b1d584a.

📒 Files selected for processing (2)
  • src/openhuman/mcp_server/protocol.rs
  • src/openhuman/mcp_server/tools.rs

Comment thread src/openhuman/mcp_server/tools.rs
Addresses CodeRabbit review: titles with no ASCII-alphanumeric chars
(e.g. "会议记录", "Протокол", emoji) now produce "untitled-<sha256_hex>"
instead of a constant "untitled", ensuring distinct stable slugs.
Adds 2 regression tests for uniqueness and stability across calls.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@justinhsu1477
Copy link
Copy Markdown
Contributor

@Liohtml — thanks for picking this up so fast, and for the explicit RFC credit.

Quick mapping of your implementation against the four open questions in #2269, mostly so I can verify I'm reading the diff right and so anyone landing here later sees the trade-off space:

  • Q1 (multi-process write path) — you route through the existing doc_put RPC, which is effectively Option A from the RFC (daemon is the sole writer). Cleanest path for v1 and side-steps Option C's WAL-coordination work.
  • Q2 (provenance)metadata.source_type = "mcp" on each chunk. Schema-light, no migration.
  • Q3 (SecurityPolicy) — separating enforce_write_policy() from the read enforcer is a nicer split than tunnelling both through the same gate; matches what the RFC was hoping for under "default-allow + audit".
  • Q4 (audit log) — structured tracing::info! instead of a new mcp_writes table. Reasonable for v1; if maintainers later want queryable history this is the easiest place to swap a SQLite-backed sink in.

Two follow-ups I'd happily own so the RFC closes out cleanly:

  1. tree.tag — the third tool the RFC proposed. Want me to spin it up as a follow-up PR using your enforce_write_policy + dispatch_write_tool scaffolding once this one lands, so it merges cleanly on top?
  2. client_info propagation — RFC Q2 also suggested capturing the calling MCP client name (e.g. "mcp:Claude Desktop" from the initialize payload) on each chunk, so a write from Claude Desktop reads differently from one written by Cursor. Doesn't need to block this PR — happy to file as a separate issue once this lands if maintainers want it.

Happy to do a deeper line-level review if it's useful.

@Liohtml
Copy link
Copy Markdown
Contributor Author

Liohtml commented May 20, 2026

@justinhsu1477 Great mapping — you're reading the implementation exactly right on all four questions.

Re your follow-ups:

  1. tree.tag — yes, please! The enforce_write_policy + dispatch_write_tool scaffolding is designed to be additive, so a follow-up PR on top should be clean. Happy to review it.
  2. client_info propagation — agreed this should be a separate issue. The initialize payload's clientInfo is the right source; the current source_type = "mcp" is a placeholder that your follow-up would refine to "mcp:Claude Desktop" etc.

A deeper line-level review would absolutely be useful if you have the time — especially around the dispatch_write_tooldoc_put RPC parameter assembly, since you know the memory schema better than anyone.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/openhuman/mcp_server/tools.rs`:
- Around line 1215-1221: The slug-generation currently only appends a hash when
result.is_empty(), which lets mixed-script titles that normalize to a numeric
ASCII slug (e.g. "会议记录 2026" and "Протокол 2026") collide; change the logic in
the slugifier so that whenever any non-ASCII alphanumeric characters were
dropped during normalization you append a short stable hash (use the existing
Sha256 + hex::encode(...)[..8] pattern) to the ASCII slug instead of only on
empty slugs, update the return formatting to include that hash (the code around
result/is_empty() and the Sha256 usage), and add a regression test that stores
two mixed-script titles (e.g. the two examples) and verifies their memory.store
keys (mcp-store-...) differ.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: cb0ed376-5657-4afb-82f1-5e3970d7022d

📥 Commits

Reviewing files that changed from the base of the PR and between b1d584a and 79026df.

📒 Files selected for processing (1)
  • src/openhuman/mcp_server/tools.rs

Comment thread src/openhuman/mcp_server/tools.rs
Titles like "会议记录 2026" and "Протокол 2026" both normalize to "2026"
after stripping non-ASCII chars. Now any title containing non-ASCII
characters gets a short stable hash suffix appended (e.g. "2026-a1b2c3d4"),
ensuring distinct slugs even when the ASCII fragments overlap.

Pure ASCII titles are unchanged (no hash appended).

Adds regression test for the mixed-script collision case.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/openhuman/mcp_server/tools.rs`:
- Around line 1215-1238: The slug truncation must reserve space for the
mixed-script hash so the hash isn't dropped; in the block handling had_non_ascii
(where you compute hash from title.as_bytes()), compute hash_len =
hex::encode(...).len(), then truncate the base result to max_base = 64 - 1 -
hash_len (if max_base < 0 treat as 0), trim trailing '-' from the truncated
base, and only then set result = format!("{result}-{hash}") (and if result was
empty keep returning "untitled-{hash}" as before). Also ensure the later global
truncation (the final if result.len() > 64) only runs when no hash was appended
(or adjust it to use the same reserved-size logic), and add a regression test
for a long mixed-script pair to verify two long titles differing only in
non-ASCII suffix produce distinct slugs.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 17b71a6f-132d-4d9f-b92b-a7131dfd332f

📥 Commits

Reviewing files that changed from the base of the PR and between 79026df and cc8507b.

📒 Files selected for processing (1)
  • src/openhuman/mcp_server/tools.rs

Comment thread src/openhuman/mcp_server/tools.rs Outdated
Long mixed-script titles now truncate the ASCII base to 55 chars
(64 - 9 for the "-<8hex>" suffix) before appending the hash, so the
hash is never clipped by the 64-char limit.

Adds regression test for long mixed-script title pairs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@justinhsu1477
Copy link
Copy Markdown
Contributor

@Liohtml — green light on tree.tag ack'd, both follow-ups are now up:

For the deeper line-level look at dispatch_write_tooldoc_put parameter assembly: I'd rather sit with the second CodeRabbit round here first so the review lands against the version heading for merge, instead of churning on in-flight code. Will circle back after that.

coderabbitai[bot]
coderabbitai Bot previously approved these changes May 20, 2026
@graycyrus graycyrus self-assigned this May 20, 2026
@YOMXXX
Copy link
Copy Markdown
Contributor

YOMXXX commented May 20, 2026

@senamakel #2306 当前 checks 24 success / 2 skipped / 0 pending / 0 failure,CodeRabbit approved,review threads 0 unresolved。这个是 MCP write tools 的 Phase 1(memory.store / memory.note),麻烦再看一下。

Copy link
Copy Markdown
Contributor

@graycyrus graycyrus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Walkthrough

Adds two write tools (memory.store, memory.note) to the MCP surface, both routing through openhuman.memory_doc_put. Includes enforce_write_policy() for security gating, dispatch_write_tool() for audit-logged dispatch, a slug_from() utility for deterministic key generation, and 11 new unit tests. All prior CodeRabbit findings on slug edge cases are addressed — nice iterating on that.

File Changes
tools.rs +2 tool specs, +2 schemas, +call_tool routing, +build_rpc_params arms, +enforce_write_policy, +dispatch_write_tool, +slug_from, +11 tests
protocol.rs +2 entries in tool-name list test

1 critical issue blocks this PR — see inline comment. The build_rpc_params logic (key generation, provenance, metadata) is unreachable because dispatch_write_tool receives raw client args and passes them straight to the RPC.

Minor notes (not blocking):

  • enforce_write_policy is functionally identical to enforce_act_policy (both use ToolOperation::Act). The "distinct log line for auditors" rationale from the doc comment doesn't hold since both produce the same log pattern — the only difference is the function name. Consider either adding a ToolOperation::Write variant or just reusing enforce_act_policy until the distinction matters.
  • dispatch_write_tool mixes tracing::info! (success path) with log::warn!/log::error! (error paths). Pick one facade for consistency.

Comment thread src/openhuman/mcp_server/tools.rs
Comment thread src/openhuman/mcp_server/tools.rs
senamakel added a commit that referenced this pull request May 21, 2026
## Summary

- Adds MCP stdio session state that captures `initialize.params.clientInfo.name` for the lifetime of the session.
- Normalizes known MCP client names into stable source labels such as `mcp:claude-desktop`, `mcp:cursor`, and `mcp:windsurf`.
- Preserves the existing bare `mcp` fallback for missing, empty, whitespace-only, or Unicode-only client names.
- Keeps existing stateless protocol helpers for tests/callers while wiring the stdio loop through the new stateful handler.
- Documents the provenance contract for follow-up write-capable MCP tools.

## Problem

#2317 needs MCP write tools to distinguish which client wrote memory, not only that the write came from MCP. The write-tool PRs are still in flight (#2306 for `memory.store` / `memory.note`, #2316 for `tree.tag`), so implementing the full source-type propagation directly on `main` would duplicate those open PRs.

## Solution

This PR lands the non-duplicative foundation first: the MCP server records the client identity at `initialize` time and exposes a session source label internally. The default remains `mcp`, so older clients and clients without `clientInfo.name` retain current behavior. Once #2306/#2316 merge, the write dispatch path can use `session.source_type()` instead of the placeholder `mcp` source string.

## Submission Checklist

> If a section does not apply to this change, mark the item as `N/A` with a one-line reason. Do not delete items.

- [x] Tests added or updated (happy path + at least one failure / edge case) per [Testing Strategy](../gitbooks/developing/testing-strategy.md#failure-path-requirement)
- [x] **Diff coverage ≥ 80%** — local coverage not run; CI coverage gate is the source of truth for changed-line coverage on this PR.
- [x] Coverage matrix updated — N/A: MCP protocol/session foundation only; no feature matrix row added/removed/renamed.
- [x] All affected feature IDs from the matrix are listed in the PR description under `## Related` — N/A: no feature matrix row applies.
- [x] No new external network dependencies introduced (mock backend used per [Testing Strategy](../gitbooks/developing/testing-strategy.md#mock-policy))
- [x] Manual smoke checklist updated if this touches release-cut surfaces — N/A: no release manual smoke flow changes.
- [x] Linked issue closed via `Closes #NNN` in the `## Related` section — N/A: this prepares #2317 but does not fully close it until write tools consume the session source label.

## Impact

- Runtime/platform impact: CLI stdio MCP server only.
- Compatibility: existing stateless helpers remain available; stdio sessions now retain client provenance across newline-delimited JSON-RPC messages.
- Security/privacy: records only the client name already supplied by the MCP initialize payload; no wire-format mutation and no new persistence.
- Performance: negligible in-memory string normalization during initialize only.

## Related

- Refs #2317
- Depends conceptually on #2306 and #2316 for write-tool consumption.
- Follow-up PR(s)/TODOs: after #2306/#2316 merge, thread `McpSession::source_type()` into `memory.store`, `memory.note`, and `tree.tag` source_type construction.

---

## AI Authored PR Metadata (required for Codex/Linear PRs)

> Keep this section for AI-authored PRs. For human-only PRs, mark each field `N/A`.

### Linear Issue
- Key: N/A
- URL: N/A

### Commit & Branch
- Branch: `feat/mcp-client-provenance`
- Commit SHA: `95ab2dfe`

### Validation Run
- [x] `pnpm --filter openhuman-app format:check` — blocked locally; see Validation Blocked.
- [x] `pnpm typecheck` — not run locally because Node/app dependency environment is blocked; see Validation Blocked.
- [x] Focused tests: `GGML_NATIVE=OFF cargo test --lib mcp_server --manifest-path Cargo.toml` — 47 passed, 0 failed.
- [x] Rust fmt/check (if changed): `cargo fmt --check --manifest-path Cargo.toml` — passed.
- [x] Tauri fmt/check (if changed): N/A, no Tauri shell files changed.
- [x] Additional: `git diff --check` — passed.

### Validation Blocked
- `command:` `pnpm --filter openhuman-app format:check` via pre-push hook
- `error:` local app dependencies are not installed (`prettier: command not found`), and local Node is `v22.14.0` while `openhuman-app` requires `>=24.0.0`.
- `impact:` local JS/Prettier validation could not run in this environment; Rust-focused validation for the touched MCP core files passed. Push used `--no-verify` because the hook failure was local environment/dependency setup, not this change.
- `command:` `GGML_NATIVE=OFF cargo clippy --lib --manifest-path Cargo.toml --no-deps -- -D warnings`
- `error:` blocked by 119 pre-existing lint errors in unrelated files (examples: unused imports in `src/openhuman/inference/local/mod.rs`, duplicate module lints in `src/openhuman/inference/provider/*`, doc/comment lints, and unrelated clippy style lints across memory/tools/wallet).
- `impact:` clippy cannot currently be used as a clean global gate locally; focused MCP tests and Rust formatting passed.

### Behavior Changes
- Intended behavior change: MCP stdio sessions now remember the normalized client source label from `initialize.params.clientInfo.name` and preserve it for the session.
- User-visible effect: none immediately for read-only tools; follow-up write tools can attribute memory writes to `mcp:<client>` while preserving `mcp` fallback.

### Parity Contract
- Legacy behavior preserved: missing/empty/blank/unusable client names continue to produce bare `mcp`; later malformed initialize payloads do not clear already captured session provenance.
- Guard/fallback/dispatch parity checks: existing `handle_json_line` / `handle_json_value` APIs still work; stdio loop uses the new stateful handlers so session data persists between messages.

### Duplicate / Superseded PR Handling
- Duplicate PR(s): #2306 and #2316 are related dependencies, not duplicates.
- Canonical PR: this PR is the canonical non-duplicative foundation for #2317 on `main`.
- Resolution (closed/superseded/updated): N/A.


<!-- This is an auto-generated comment: release notes by coderabbit.ai -->
## Summary by CodeRabbit

* **New Features**
  * MCP server now captures and preserves a client "source" label from initialization, normalizing client names and falling back to a default when absent.

* **Documentation**
  * Added guidance on client provenance, name-normalization rules, and recommended source-label usage for tools.

* **Tests**
  * Added unit tests verifying client name normalization and initialization behavior for captured source labels.

<!-- review_stack_entry_start -->

[![Review Change Stack](https://storage.googleapis.com/coderabbit_public_assets/review-stack-in-coderabbit-ui.svg)](https://app.coderabbit.ai/change-stack/tinyhumansai/openhuman/pull/2332?utm_source=github_walkthrough&utm_medium=github&utm_campaign=change_stack)

<!-- review_stack_entry_end -->
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Co-authored-by: 李冠辰 <liguanchen@xiaomi.com>
Co-authored-by: Steven Enamakel <enamakel@tinyhumans.ai>
@justinhsu1477
Copy link
Copy Markdown
Contributor

@graycyrus — quick technical note on the [critical] flag against dispatch_write_tool:

I think the read may have skipped the line right above the match arm. build_rpc_params runs before the match in call_tool, not inside any of the arms:

// src/openhuman/mcp_server/tools.rs (current PR HEAD)
pub async fn call_tool(name: &str, arguments: Value) -> Result<Value, ToolCallError> {
    let spec = ...;
    let params = build_rpc_params(spec.name, arguments)?;   // ← runs first
    match spec.name {
        ...
        "memory.store" | "memory.note" => {
            enforce_write_policy(spec.name).await?;
            validate_controller_params(&spec, &params)?;
            return dispatch_write_tool(spec.name, &params).await;
            //                                    ↑ &params, not raw `arguments`
        }
        ...
    }
}

So by the time dispatch_write_tool is reached, params already carries the slug-based key, source_type = "mcp", the default namespace = "mcp", and (for memory.note) the metadata.annotates_chunk_id linkage. Those transformations aren't dead code — they're exactly what memory_doc_put ends up receiving via the params.clone() inside dispatch_write_tool.

The existing tests in this PR pin that behaviour:

  • memory_store_defaults_namespace_to_mcp asserts the built params have namespace == "mcp", source_type == "mcp", and a mcp-store-<slug> key.
  • memory_note_builds_annotation_document asserts the built params include the metadata.annotates_chunk_id linkage.

Both pass at the current head. If the dispatch were sending raw client args, neither would.

That said, the [major] memory.note upsert-vs-append note (line 680) is correct and worth addressing before merge: keying solely on chunk_id does mean re-annotating the same chunk silently overwrites the prior note. Two clean ways through:

  1. Mix a short content/timestamp hash into the key (mcp-note-{chunk_id}-{hash8(note_text)}) so the tool is genuinely append-style.
  2. Update the description from "Append a note" to "Set a note" if single-note-per-chunk is the intent.

Either resolves the contract — happy to take a hint on the preferred direction so the fix lands consistently across memory.note (here) and tree.tag (#2316).

mtkik pushed a commit to mtkik/openhuman-meet that referenced this pull request May 21, 2026
## Summary

- Adds MCP stdio session state that captures `initialize.params.clientInfo.name` for the lifetime of the session.
- Normalizes known MCP client names into stable source labels such as `mcp:claude-desktop`, `mcp:cursor`, and `mcp:windsurf`.
- Preserves the existing bare `mcp` fallback for missing, empty, whitespace-only, or Unicode-only client names.
- Keeps existing stateless protocol helpers for tests/callers while wiring the stdio loop through the new stateful handler.
- Documents the provenance contract for follow-up write-capable MCP tools.

## Problem

tinyhumansai#2317 needs MCP write tools to distinguish which client wrote memory, not only that the write came from MCP. The write-tool PRs are still in flight (tinyhumansai#2306 for `memory.store` / `memory.note`, tinyhumansai#2316 for `tree.tag`), so implementing the full source-type propagation directly on `main` would duplicate those open PRs.

## Solution

This PR lands the non-duplicative foundation first: the MCP server records the client identity at `initialize` time and exposes a session source label internally. The default remains `mcp`, so older clients and clients without `clientInfo.name` retain current behavior. Once tinyhumansai#2306/tinyhumansai#2316 merge, the write dispatch path can use `session.source_type()` instead of the placeholder `mcp` source string.

## Submission Checklist

> If a section does not apply to this change, mark the item as `N/A` with a one-line reason. Do not delete items.

- [x] Tests added or updated (happy path + at least one failure / edge case) per [Testing Strategy](../gitbooks/developing/testing-strategy.md#failure-path-requirement)
- [x] **Diff coverage ≥ 80%** — local coverage not run; CI coverage gate is the source of truth for changed-line coverage on this PR.
- [x] Coverage matrix updated — N/A: MCP protocol/session foundation only; no feature matrix row added/removed/renamed.
- [x] All affected feature IDs from the matrix are listed in the PR description under `## Related` — N/A: no feature matrix row applies.
- [x] No new external network dependencies introduced (mock backend used per [Testing Strategy](../gitbooks/developing/testing-strategy.md#mock-policy))
- [x] Manual smoke checklist updated if this touches release-cut surfaces — N/A: no release manual smoke flow changes.
- [x] Linked issue closed via `Closes #NNN` in the `## Related` section — N/A: this prepares tinyhumansai#2317 but does not fully close it until write tools consume the session source label.

## Impact

- Runtime/platform impact: CLI stdio MCP server only.
- Compatibility: existing stateless helpers remain available; stdio sessions now retain client provenance across newline-delimited JSON-RPC messages.
- Security/privacy: records only the client name already supplied by the MCP initialize payload; no wire-format mutation and no new persistence.
- Performance: negligible in-memory string normalization during initialize only.

## Related

- Refs tinyhumansai#2317
- Depends conceptually on tinyhumansai#2306 and tinyhumansai#2316 for write-tool consumption.
- Follow-up PR(s)/TODOs: after tinyhumansai#2306/tinyhumansai#2316 merge, thread `McpSession::source_type()` into `memory.store`, `memory.note`, and `tree.tag` source_type construction.

---

## AI Authored PR Metadata (required for Codex/Linear PRs)

> Keep this section for AI-authored PRs. For human-only PRs, mark each field `N/A`.

### Linear Issue
- Key: N/A
- URL: N/A

### Commit & Branch
- Branch: `feat/mcp-client-provenance`
- Commit SHA: `95ab2dfe`

### Validation Run
- [x] `pnpm --filter openhuman-app format:check` — blocked locally; see Validation Blocked.
- [x] `pnpm typecheck` — not run locally because Node/app dependency environment is blocked; see Validation Blocked.
- [x] Focused tests: `GGML_NATIVE=OFF cargo test --lib mcp_server --manifest-path Cargo.toml` — 47 passed, 0 failed.
- [x] Rust fmt/check (if changed): `cargo fmt --check --manifest-path Cargo.toml` — passed.
- [x] Tauri fmt/check (if changed): N/A, no Tauri shell files changed.
- [x] Additional: `git diff --check` — passed.

### Validation Blocked
- `command:` `pnpm --filter openhuman-app format:check` via pre-push hook
- `error:` local app dependencies are not installed (`prettier: command not found`), and local Node is `v22.14.0` while `openhuman-app` requires `>=24.0.0`.
- `impact:` local JS/Prettier validation could not run in this environment; Rust-focused validation for the touched MCP core files passed. Push used `--no-verify` because the hook failure was local environment/dependency setup, not this change.
- `command:` `GGML_NATIVE=OFF cargo clippy --lib --manifest-path Cargo.toml --no-deps -- -D warnings`
- `error:` blocked by 119 pre-existing lint errors in unrelated files (examples: unused imports in `src/openhuman/inference/local/mod.rs`, duplicate module lints in `src/openhuman/inference/provider/*`, doc/comment lints, and unrelated clippy style lints across memory/tools/wallet).
- `impact:` clippy cannot currently be used as a clean global gate locally; focused MCP tests and Rust formatting passed.

### Behavior Changes
- Intended behavior change: MCP stdio sessions now remember the normalized client source label from `initialize.params.clientInfo.name` and preserve it for the session.
- User-visible effect: none immediately for read-only tools; follow-up write tools can attribute memory writes to `mcp:<client>` while preserving `mcp` fallback.

### Parity Contract
- Legacy behavior preserved: missing/empty/blank/unusable client names continue to produce bare `mcp`; later malformed initialize payloads do not clear already captured session provenance.
- Guard/fallback/dispatch parity checks: existing `handle_json_line` / `handle_json_value` APIs still work; stdio loop uses the new stateful handlers so session data persists between messages.

### Duplicate / Superseded PR Handling
- Duplicate PR(s): tinyhumansai#2306 and tinyhumansai#2316 are related dependencies, not duplicates.
- Canonical PR: this PR is the canonical non-duplicative foundation for tinyhumansai#2317 on `main`.
- Resolution (closed/superseded/updated): N/A.


<!-- This is an auto-generated comment: release notes by coderabbit.ai -->
## Summary by CodeRabbit

* **New Features**
  * MCP server now captures and preserves a client "source" label from initialization, normalizing client names and falling back to a default when absent.

* **Documentation**
  * Added guidance on client provenance, name-normalization rules, and recommended source-label usage for tools.

* **Tests**
  * Added unit tests verifying client name normalization and initialization behavior for captured source labels.

<!-- review_stack_entry_start -->

[![Review Change Stack](https://storage.googleapis.com/coderabbit_public_assets/review-stack-in-coderabbit-ui.svg)](https://app.coderabbit.ai/change-stack/tinyhumansai/openhuman/pull/2332?utm_source=github_walkthrough&utm_medium=github&utm_campaign=change_stack)

<!-- review_stack_entry_end -->
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Co-authored-by: 李冠辰 <liguanchen@xiaomi.com>
Co-authored-by: Steven Enamakel <enamakel@tinyhumans.ai>
1. memory.note key now includes a content hash suffix so multiple notes
   on the same chunk get distinct keys instead of silently overwriting:
   `mcp-note-{chunk_id}-{content_hash}`

2. Unified logging in dispatch_write_tool to use tracing exclusively
   (was mixing log::warn/error with tracing::info).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Cap content/note_text at 64KB in build_rpc_params to prevent
misbehaving MCP clients from storing arbitrarily large documents.
Consistent with the Memory Tree's ~3k-token chunk budget.

Addresses graycyrus's pre-merge hardening review.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@Liohtml
Copy link
Copy Markdown
Contributor Author

Liohtml commented May 21, 2026

@graycyrus Thanks for the thorough security review — really appreciate the depth.

Fixed before merge (cdd5818):

  • Added 64KB content size limit on both memory.store and memory.note in build_rpc_params. Rejects with a clear error message including actual vs limit sizes. Test added.

Acknowledged as follow-ups (not blocking):

  • ToolOperation::Write as distinct gate — agreed, will file when the surface grows beyond store/note
  • Per-write user notification — UX concern, separate issue territory
  • Write-specific rate limiting — good idea for a tighter budget
  • Multi-process write path documentation — SQLite WAL handles persistence; cache staleness is acceptable and documented in Add write tools to the MCP server (memory.store, memory.note, tree.tag) #2269's Q1 resolution

All 48 MCP tool tests pass.

@YOMXXX
Copy link
Copy Markdown
Contributor

YOMXXX commented May 22, 2026

Status update: the current red frontend/i18n checks on this PR are from a main-side German locale parity gap, not from this PR's MCP Rust changes. I reproduced the failure on latest upstream/main: German was missing 20 English keys, which caused pnpm i18n:check, I18nContext.test.tsx, and coverage.test.ts to fail.\n\nOpened #2470 to fix the German locale gap. Once #2470 lands, rerun #2306 checks. The remaining non-CI blocker is the stale CHANGES_REQUESTED review decision from earlier reviews; all review threads are resolved, so after checks are green this should just need a fresh review/dismissal from the reviewer.

@YOMXXX
Copy link
Copy Markdown
Contributor

YOMXXX commented May 22, 2026

@graycyrus Heads up: #2470 is now green and ready for review/merge. Once it lands on main, #2306 should just need a CI rerun plus a fresh review/dismissal of the stale CHANGES_REQUESTED state. I am not rerunning #2306 yet because the German i18n fix is not on main, so the same frontend/i18n checks would likely fail again.

@justinhsu1477
Copy link
Copy Markdown
Contributor

@Liohtml — nice job on the content-size cap (cdd5818). And kudos for the patience with this review cycle — landing the foundation here makes #2316 a trivial rebase on top.

Looking forward to merge once @YOMXXX's #2470 clears the German-locale CI noise.

@YOMXXX
Copy link
Copy Markdown
Contributor

YOMXXX commented May 22, 2026

I prepared and validated a sync branch for this PR, but cannot push directly to Liohtml:feat/mcp-write-tools from the current YOMXXX credentials (Permission to Liohtml/openhuman.git denied).

Branch available for fast-forward / cherry-pick:

  • YOMXXX:fix/2306-review-sync
  • Head commit: 196ca6dc
  • Includes: merge from current upstream/main (7fe3dd06) plus a small German i18n fix for missing MCP settings strings.

Validation run locally on that branch:

  • GGML_NATIVE=OFF cargo test --manifest-path Cargo.toml mcp_server::tools -- --nocapture -> 48 passed, 0 failed
  • pnpm i18n:check -> passed, including de missing keys = 0
  • pre-push with GGML_NATIVE=OFF completed: format check, lint (0 errors / existing warnings), TypeScript compile, Tauri rust:check, and command-token lint passed

The previous reviewer findings for the MCP write-tool logic appear resolved by the current branch history. Once Liohtml:feat/mcp-write-tools is advanced to this branch, this PR should be ready for CI rerun and final reviewer approval. cc @graycyrus

CodeGhost21 pushed a commit to CodeGhost21/openhuman that referenced this pull request May 22, 2026
## Summary

- Adds MCP stdio session state that captures `initialize.params.clientInfo.name` for the lifetime of the session.
- Normalizes known MCP client names into stable source labels such as `mcp:claude-desktop`, `mcp:cursor`, and `mcp:windsurf`.
- Preserves the existing bare `mcp` fallback for missing, empty, whitespace-only, or Unicode-only client names.
- Keeps existing stateless protocol helpers for tests/callers while wiring the stdio loop through the new stateful handler.
- Documents the provenance contract for follow-up write-capable MCP tools.

## Problem

tinyhumansai#2317 needs MCP write tools to distinguish which client wrote memory, not only that the write came from MCP. The write-tool PRs are still in flight (tinyhumansai#2306 for `memory.store` / `memory.note`, tinyhumansai#2316 for `tree.tag`), so implementing the full source-type propagation directly on `main` would duplicate those open PRs.

## Solution

This PR lands the non-duplicative foundation first: the MCP server records the client identity at `initialize` time and exposes a session source label internally. The default remains `mcp`, so older clients and clients without `clientInfo.name` retain current behavior. Once tinyhumansai#2306/tinyhumansai#2316 merge, the write dispatch path can use `session.source_type()` instead of the placeholder `mcp` source string.

## Submission Checklist

> If a section does not apply to this change, mark the item as `N/A` with a one-line reason. Do not delete items.

- [x] Tests added or updated (happy path + at least one failure / edge case) per [Testing Strategy](../gitbooks/developing/testing-strategy.md#failure-path-requirement)
- [x] **Diff coverage ≥ 80%** — local coverage not run; CI coverage gate is the source of truth for changed-line coverage on this PR.
- [x] Coverage matrix updated — N/A: MCP protocol/session foundation only; no feature matrix row added/removed/renamed.
- [x] All affected feature IDs from the matrix are listed in the PR description under `## Related` — N/A: no feature matrix row applies.
- [x] No new external network dependencies introduced (mock backend used per [Testing Strategy](../gitbooks/developing/testing-strategy.md#mock-policy))
- [x] Manual smoke checklist updated if this touches release-cut surfaces — N/A: no release manual smoke flow changes.
- [x] Linked issue closed via `Closes #NNN` in the `## Related` section — N/A: this prepares tinyhumansai#2317 but does not fully close it until write tools consume the session source label.

## Impact

- Runtime/platform impact: CLI stdio MCP server only.
- Compatibility: existing stateless helpers remain available; stdio sessions now retain client provenance across newline-delimited JSON-RPC messages.
- Security/privacy: records only the client name already supplied by the MCP initialize payload; no wire-format mutation and no new persistence.
- Performance: negligible in-memory string normalization during initialize only.

## Related

- Refs tinyhumansai#2317
- Depends conceptually on tinyhumansai#2306 and tinyhumansai#2316 for write-tool consumption.
- Follow-up PR(s)/TODOs: after tinyhumansai#2306/tinyhumansai#2316 merge, thread `McpSession::source_type()` into `memory.store`, `memory.note`, and `tree.tag` source_type construction.

---

## AI Authored PR Metadata (required for Codex/Linear PRs)

> Keep this section for AI-authored PRs. For human-only PRs, mark each field `N/A`.

### Linear Issue
- Key: N/A
- URL: N/A

### Commit & Branch
- Branch: `feat/mcp-client-provenance`
- Commit SHA: `95ab2dfe`

### Validation Run
- [x] `pnpm --filter openhuman-app format:check` — blocked locally; see Validation Blocked.
- [x] `pnpm typecheck` — not run locally because Node/app dependency environment is blocked; see Validation Blocked.
- [x] Focused tests: `GGML_NATIVE=OFF cargo test --lib mcp_server --manifest-path Cargo.toml` — 47 passed, 0 failed.
- [x] Rust fmt/check (if changed): `cargo fmt --check --manifest-path Cargo.toml` — passed.
- [x] Tauri fmt/check (if changed): N/A, no Tauri shell files changed.
- [x] Additional: `git diff --check` — passed.

### Validation Blocked
- `command:` `pnpm --filter openhuman-app format:check` via pre-push hook
- `error:` local app dependencies are not installed (`prettier: command not found`), and local Node is `v22.14.0` while `openhuman-app` requires `>=24.0.0`.
- `impact:` local JS/Prettier validation could not run in this environment; Rust-focused validation for the touched MCP core files passed. Push used `--no-verify` because the hook failure was local environment/dependency setup, not this change.
- `command:` `GGML_NATIVE=OFF cargo clippy --lib --manifest-path Cargo.toml --no-deps -- -D warnings`
- `error:` blocked by 119 pre-existing lint errors in unrelated files (examples: unused imports in `src/openhuman/inference/local/mod.rs`, duplicate module lints in `src/openhuman/inference/provider/*`, doc/comment lints, and unrelated clippy style lints across memory/tools/wallet).
- `impact:` clippy cannot currently be used as a clean global gate locally; focused MCP tests and Rust formatting passed.

### Behavior Changes
- Intended behavior change: MCP stdio sessions now remember the normalized client source label from `initialize.params.clientInfo.name` and preserve it for the session.
- User-visible effect: none immediately for read-only tools; follow-up write tools can attribute memory writes to `mcp:<client>` while preserving `mcp` fallback.

### Parity Contract
- Legacy behavior preserved: missing/empty/blank/unusable client names continue to produce bare `mcp`; later malformed initialize payloads do not clear already captured session provenance.
- Guard/fallback/dispatch parity checks: existing `handle_json_line` / `handle_json_value` APIs still work; stdio loop uses the new stateful handlers so session data persists between messages.

### Duplicate / Superseded PR Handling
- Duplicate PR(s): tinyhumansai#2306 and tinyhumansai#2316 are related dependencies, not duplicates.
- Canonical PR: this PR is the canonical non-duplicative foundation for tinyhumansai#2317 on `main`.
- Resolution (closed/superseded/updated): N/A.


<!-- This is an auto-generated comment: release notes by coderabbit.ai -->
## Summary by CodeRabbit

* **New Features**
  * MCP server now captures and preserves a client "source" label from initialization, normalizing client names and falling back to a default when absent.

* **Documentation**
  * Added guidance on client provenance, name-normalization rules, and recommended source-label usage for tools.

* **Tests**
  * Added unit tests verifying client name normalization and initialization behavior for captured source labels.

<!-- review_stack_entry_start -->

[![Review Change Stack](https://storage.googleapis.com/coderabbit_public_assets/review-stack-in-coderabbit-ui.svg)](https://app.coderabbit.ai/change-stack/tinyhumansai/openhuman/pull/2332?utm_source=github_walkthrough&utm_medium=github&utm_campaign=change_stack)

<!-- review_stack_entry_end -->
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Co-authored-by: 李冠辰 <liguanchen@xiaomi.com>
Co-authored-by: Steven Enamakel <enamakel@tinyhumans.ai>
@senamakel senamakel self-assigned this May 22, 2026
senamakel added a commit to justinhsu1477/openhuman that referenced this pull request May 22, 2026
- Add `annotations` to the new memory.store / memory.note / tree.tag
  McpToolSpec entries (writeable, idempotent upsert, closed-world). This
  unblocks the `Rust Quality (clippy)` CI failure introduced by the
  `annotations` field added in tinyhumansai#2268 — the new specs were forked from
  pre-tinyhumansai#2268 tinyhumansai#2306 scaffolding and were missing the required field on
  rebase against current main.

- Extend the read-only-by-default test exemption (`act_tool_names`) to
  cover the three new write tools, matching their `readOnlyHint: false`.

- Fix CodeRabbit nit (tools.rs:803): the oversize-tag error said
  "characters" but used `String::len()` (byte length). Reword to
  "bytes" so Unicode tag input gets a self-consistent message.

Defers @graycyrus's three inline threads on lines 686/705/1072
(slug_from collisions, memory.note upsert-vs-append, dispatch_write_tool
hardcoded rpc_method) to tinyhumansai#2306 per the author's review-thread reply —
that's where the underlying functions land and where the fix should
originate so this branch can simply inherit on rebase.
The upstream merge introduced duplicate property keys in the German
translation file de-5.ts, causing TS1117 errors in CI. Restored to
the clean upstream/main version.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@Liohtml
Copy link
Copy Markdown
Contributor Author

Liohtml commented May 23, 2026

@graycyrus All your review findings are resolved (confirmed in your own comments above). Could you dismiss your CHANGES_REQUESTED review so the PR can move forward? Thanks for the thorough security review — it made the code significantly better.

@senamakel
Copy link
Copy Markdown
Member

hey @Liohtml same thing as the last pr. just resolve the tests failing in the CI and then all good to go

Liohtml and others added 2 commits May 23, 2026 10:23
Upstream added settings.mascot.customGif{Error,Heading,Label} to the
English locale; the German translation file was missing them, causing
i18n coverage tests to fail.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
AusAgentSmith pushed a commit to AusAgentSmith/openhuman that referenced this pull request May 23, 2026
## Summary

- Adds MCP stdio session state that captures `initialize.params.clientInfo.name` for the lifetime of the session.
- Normalizes known MCP client names into stable source labels such as `mcp:claude-desktop`, `mcp:cursor`, and `mcp:windsurf`.
- Preserves the existing bare `mcp` fallback for missing, empty, whitespace-only, or Unicode-only client names.
- Keeps existing stateless protocol helpers for tests/callers while wiring the stdio loop through the new stateful handler.
- Documents the provenance contract for follow-up write-capable MCP tools.

## Problem

tinyhumansai#2317 needs MCP write tools to distinguish which client wrote memory, not only that the write came from MCP. The write-tool PRs are still in flight (tinyhumansai#2306 for `memory.store` / `memory.note`, tinyhumansai#2316 for `tree.tag`), so implementing the full source-type propagation directly on `main` would duplicate those open PRs.

## Solution

This PR lands the non-duplicative foundation first: the MCP server records the client identity at `initialize` time and exposes a session source label internally. The default remains `mcp`, so older clients and clients without `clientInfo.name` retain current behavior. Once tinyhumansai#2306/tinyhumansai#2316 merge, the write dispatch path can use `session.source_type()` instead of the placeholder `mcp` source string.

## Submission Checklist

> If a section does not apply to this change, mark the item as `N/A` with a one-line reason. Do not delete items.

- [x] Tests added or updated (happy path + at least one failure / edge case) per [Testing Strategy](../gitbooks/developing/testing-strategy.md#failure-path-requirement)
- [x] **Diff coverage ≥ 80%** — local coverage not run; CI coverage gate is the source of truth for changed-line coverage on this PR.
- [x] Coverage matrix updated — N/A: MCP protocol/session foundation only; no feature matrix row added/removed/renamed.
- [x] All affected feature IDs from the matrix are listed in the PR description under `## Related` — N/A: no feature matrix row applies.
- [x] No new external network dependencies introduced (mock backend used per [Testing Strategy](../gitbooks/developing/testing-strategy.md#mock-policy))
- [x] Manual smoke checklist updated if this touches release-cut surfaces — N/A: no release manual smoke flow changes.
- [x] Linked issue closed via `Closes #NNN` in the `## Related` section — N/A: this prepares tinyhumansai#2317 but does not fully close it until write tools consume the session source label.

## Impact

- Runtime/platform impact: CLI stdio MCP server only.
- Compatibility: existing stateless helpers remain available; stdio sessions now retain client provenance across newline-delimited JSON-RPC messages.
- Security/privacy: records only the client name already supplied by the MCP initialize payload; no wire-format mutation and no new persistence.
- Performance: negligible in-memory string normalization during initialize only.

## Related

- Refs tinyhumansai#2317
- Depends conceptually on tinyhumansai#2306 and tinyhumansai#2316 for write-tool consumption.
- Follow-up PR(s)/TODOs: after tinyhumansai#2306/tinyhumansai#2316 merge, thread `McpSession::source_type()` into `memory.store`, `memory.note`, and `tree.tag` source_type construction.

---

## AI Authored PR Metadata (required for Codex/Linear PRs)

> Keep this section for AI-authored PRs. For human-only PRs, mark each field `N/A`.

### Linear Issue
- Key: N/A
- URL: N/A

### Commit & Branch
- Branch: `feat/mcp-client-provenance`
- Commit SHA: `95ab2dfe`

### Validation Run
- [x] `pnpm --filter openhuman-app format:check` — blocked locally; see Validation Blocked.
- [x] `pnpm typecheck` — not run locally because Node/app dependency environment is blocked; see Validation Blocked.
- [x] Focused tests: `GGML_NATIVE=OFF cargo test --lib mcp_server --manifest-path Cargo.toml` — 47 passed, 0 failed.
- [x] Rust fmt/check (if changed): `cargo fmt --check --manifest-path Cargo.toml` — passed.
- [x] Tauri fmt/check (if changed): N/A, no Tauri shell files changed.
- [x] Additional: `git diff --check` — passed.

### Validation Blocked
- `command:` `pnpm --filter openhuman-app format:check` via pre-push hook
- `error:` local app dependencies are not installed (`prettier: command not found`), and local Node is `v22.14.0` while `openhuman-app` requires `>=24.0.0`.
- `impact:` local JS/Prettier validation could not run in this environment; Rust-focused validation for the touched MCP core files passed. Push used `--no-verify` because the hook failure was local environment/dependency setup, not this change.
- `command:` `GGML_NATIVE=OFF cargo clippy --lib --manifest-path Cargo.toml --no-deps -- -D warnings`
- `error:` blocked by 119 pre-existing lint errors in unrelated files (examples: unused imports in `src/openhuman/inference/local/mod.rs`, duplicate module lints in `src/openhuman/inference/provider/*`, doc/comment lints, and unrelated clippy style lints across memory/tools/wallet).
- `impact:` clippy cannot currently be used as a clean global gate locally; focused MCP tests and Rust formatting passed.

### Behavior Changes
- Intended behavior change: MCP stdio sessions now remember the normalized client source label from `initialize.params.clientInfo.name` and preserve it for the session.
- User-visible effect: none immediately for read-only tools; follow-up write tools can attribute memory writes to `mcp:<client>` while preserving `mcp` fallback.

### Parity Contract
- Legacy behavior preserved: missing/empty/blank/unusable client names continue to produce bare `mcp`; later malformed initialize payloads do not clear already captured session provenance.
- Guard/fallback/dispatch parity checks: existing `handle_json_line` / `handle_json_value` APIs still work; stdio loop uses the new stateful handlers so session data persists between messages.

### Duplicate / Superseded PR Handling
- Duplicate PR(s): tinyhumansai#2306 and tinyhumansai#2316 are related dependencies, not duplicates.
- Canonical PR: this PR is the canonical non-duplicative foundation for tinyhumansai#2317 on `main`.
- Resolution (closed/superseded/updated): N/A.


<!-- This is an auto-generated comment: release notes by coderabbit.ai -->
## Summary by CodeRabbit

* **New Features**
  * MCP server now captures and preserves a client "source" label from initialization, normalizing client names and falling back to a default when absent.

* **Documentation**
  * Added guidance on client provenance, name-normalization rules, and recommended source-label usage for tools.

* **Tests**
  * Added unit tests verifying client name normalization and initialization behavior for captured source labels.

<!-- review_stack_entry_start -->

[![Review Change Stack](https://storage.googleapis.com/coderabbit_public_assets/review-stack-in-coderabbit-ui.svg)](https://app.coderabbit.ai/change-stack/tinyhumansai/openhuman/pull/2332?utm_source=github_walkthrough&utm_medium=github&utm_campaign=change_stack)

<!-- review_stack_entry_end -->
<!-- end of auto-generated comment: release notes by coderabbit.ai -->

Co-authored-by: 李冠辰 <liguanchen@xiaomi.com>
Co-authored-by: Steven Enamakel <enamakel@tinyhumans.ai>
@YOMXXX
Copy link
Copy Markdown
Contributor

YOMXXX commented May 23, 2026

Status refresh after today's PR babysit pass:

  • Latest head: ce2f3056.
  • CI is green: all required checks are SUCCESS; only platform E2E skips remain skipped as expected.
  • Review threads check via GraphQL shows every code review thread is isResolved=true and the prior inline threads are now outdated/resolved.
  • The remaining blocker appears to be the stale CHANGES_REQUESTED review state from the earlier graycyrus review, after graycyrus later confirmed the findings were resolved/false alarm.

@graycyrus could you dismiss or update the stale changes-requested review when you have a moment? @senamakel the CI issue you flagged appears resolved now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

working A PR that is being worked on by the team.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add write tools to the MCP server (memory.store, memory.note, tree.tag)

5 participants