
POC: Integrate local document retrieval with skills via MCP #2002

Draft

oliverholworthy wants to merge 5 commits into NVIDIA:main from oliverholworthy:oholworthy/local-search-cli-skill

Conversation

@oliverholworthy
Contributor

This PR adds a POC for document retrieval through a skill and an MCP tool. The goal of this POC is to explore integration patterns for local document retrieval:

  • a repo-local skill can automatically route local documentation questions to a retrieval tool
  • an MCP server provides a clean agent/tool boundary
  • indexes are local, reusable, and scoped to the resolved input path
  • the agent receives structured evidence and synthesizes the final answer from that evidence, instead of manually grepping the repository

What This Adds

  • retriever local CLI commands for local document indexing/search:
    • init
    • search
    • ask
    • status
    • doctor
    • clean
  • MCP server entrypoint:
    • retriever-local-mcp
  • MCP tools:
    • local_document_ask
    • local_document_search
    • local_document_status
  • Repo-scoped Codex skill:
    • .agents/skills/nemo-retriever-local-document-search
  • Project-local Codex MCP config example:
    • .codex/config.toml.example
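
As a usage sketch only: the subcommand names above come from this PR, but the exact invocation shape (whether the command is spelled `retriever local` or `retriever-local`, and what arguments each subcommand takes) is an assumption here.

```shell
# Illustrative only: command spelling, positional arguments, and flags are
# assumptions, not confirmed from the PR. Consult --help for the real interface.
retriever local init ./docs        # build or refresh a local index over ./docs
retriever local status ./docs      # inspect index state/staleness
retriever local search ./docs "self-hosted model endpoints"
retriever local ask ./docs "How do I configure validators?"
retriever local doctor             # diagnose environment/setup issues
retriever local clean ./docs       # remove the local index
```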

Supported document types are currently:

.pdf
.txt
.md
.markdown
.docx
.pptx

The workflow is retrieval-only. ask returns evidence and metadata, but does not generate a prose answer itself:

"answer": null,
"answer_generation": "not_configured"

The agent is expected to synthesize the final response from returned evidence.

Configuration

After installing/building the local NeMo Retriever environment, configure Codex with a project-local .codex/config.toml like:

[mcp_servers.nemo_retriever_local]
command = "/absolute/path/to/NeMo-Retriever/nemo_retriever/.venv/bin/retriever-local-mcp"
args = []
startup_timeout_sec = 60
tool_timeout_sec = 3600
enabled_tools = ["local_document_ask", "local_document_search", "local_document_status"]

cwd is intentionally omitted so the MCP server inherits the active Codex project/session directory. This lets prompts like "In ./docs, ..." resolve relative to whichever project Codex is currently running in.

For another project, copy the skill directory into that repo:

.agents/skills/nemo-retriever-local-document-search/

Then start Codex from that project root and ask a docs-grounded question such as:

In ./docs, explain how to configure this project for self-hosted model endpoints, async execution, validators, and optional MCP tool use. Cite the docs you use.

Behavior

By default the tool uses local embedding inference with:

nvidia/llama-nemotron-embed-1b-v2

Remote embedding is available explicitly with --inference remote / inference="remote" and an API key, but local is the default for the skill.

When the MCP tool is called without an explicit index, it derives a stable project-local index path from a hash of the resolved absolute input path, for example:

.nemo-retriever/local-index-54ed29c6fcb8

This avoids collisions between ./docs in different repos and allows warm reuse across follow-up questions.
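
The PR does not state the exact hashing scheme. As a minimal sketch, assuming a truncated SHA-256 digest of the resolved absolute path (the real implementation may use a different digest or truncation length), the derivation could look like:

```shell
# Sketch: derive a stable, path-scoped index directory from the resolved
# absolute input path. The sha256-truncated-to-12-hex-chars scheme is an
# assumption; only the ".nemo-retriever/local-index-<hash>" shape is from the PR.
input_path=$(realpath -m ./docs)                            # resolve to absolute
digest=$(printf '%s' "$input_path" | sha256sum | cut -c1-12)
index_dir=".nemo-retriever/local-index-$digest"
echo "$index_dir"
```

Because the digest is keyed on the absolute path, ./docs in two different repos hashes to two different index directories, while repeated questions against the same path reuse the same warm index.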

What The POC Demonstrates

Tested this with the NeMo Retriever docs and the DataDesigner docs. The DataDesigner test showed the agent using the MCP retrieval tool first, creating/reusing a path-scoped local index, and answering a multi-part configuration question from retrieved docs without broad manual repo search.

This is the core outcome: the skill + MCP pattern working as a portable way to wire local retrieval into agent behavior.

Known Gaps

  • Evidence is currently chunk/file based; Markdown/text line spans would improve citation quality.
  • PDF support is text-focused; this does not currently use full multimodal extraction.
  • The index is refreshed from a manifest/staleness check, not a live watcher.
  • This is not intended as a shared production search service.
  • The implementation is larger than ideal because the library does not yet expose a single “index this local corpus into a VDB and search it” abstraction.
  • Discovery, manifest/staleness, and ingest-to-LanceDB lifecycle code should be moved into smaller reusable modules or library APIs.
