Skip to content
Draft
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
74 changes: 74 additions & 0 deletions .agents/skills/nemo-retriever-local-document-search/SKILL.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,74 @@
---
name: nemo-retriever-local-document-search
description: Use when the user has a local file or folder of documents and wants to search, ask questions, summarize, or inspect that bounded corpus with NeMo Retriever local document search.
---

# NeMo Retriever Local Document Search

Use this skill when the user has a local file or folder of documents and wants to search, ask questions, summarize, or inspect that bounded corpus. This is for small, single-user, exploratory document search over local files.

Do not use this skill to perform deployments, operate shared services, serve models, or mutate production indexes. Do use this skill to answer documentation questions about those topics when the requested corpus is local docs.

## Decision Rules

Use local search automatically when the corpus is a local path, the task is one-off or exploratory, no multi-user serving requirement is mentioned, no audit or compliance requirement is mentioned, and the corpus appears under the default safety caps.

When the user asks about "these docs", "this folder", or provides a local path, use that path directly. When the user asks about the current project without providing a path, use a clearly implied documentation path only if it is already known from context or can be checked with a narrow existence check such as `./docs` or `./README.md`; otherwise ask for the corpus path. Do not recursively search the repo to discover a corpus.

Ask before indexing when the path may contain sensitive data, the corpus may be large, the user asks for persistent memory, or the workspace is shared.

Route away from local search when the user wants you to implement, operate, configure, or mutate a shared service where multiple users or services need access, access control matters, refresh has to be continuous, or production latency or throughput is required. Do not route away for conceptual documentation questions such as "what are the deployment options?" or "which setup should I choose?"; answer those from local docs with `local_document_ask` when a corpus is available.

## Commands

When MCP tools are available, prefer `local_document_ask` for a first query. Provide `input_path`, `query`, and optional `top_k`, `include`, `exclude`, `max_docs`, and `max_pages`. Omit `index` unless the user explicitly wants a particular index location; the MCP tool chooses a stable path-scoped index by hashing the resolved absolute `input_path`. The tool returns the same payload shape as the CLI, including `index_action`, `warnings`, and `evidence`.

Use `local_document_search` when the user explicitly wants to search an existing index. Use `local_document_status` to inspect index health, staleness, and chunk counts.

Treat Retriever as the recall mechanism. Do not run separate corpus search or inventory commands such as `rg`, `grep`, `find`, or broad `ls` over the document directory before retrieval, and do not inspect repository files to decide which docs to query. Keep the retrieval query faithful to the user's wording; do not add extra product names, deployment modes, or suspected answers unless the user mentioned them. After Retriever returns evidence, you may read the specific `evidence[].source_file` paths it cited to verify context or produce line-specific citations. Do not search or read source code files for docs/how-to questions unless the user explicitly asks for code/API implementation details, or the retrieved evidence itself cites a source file outside the docs corpus. Only search the corpus yourself when the Retriever result is empty, clearly wrong, or the user explicitly asks for lexical matching.

For documentation/how-to questions, answer from the retrieved local docs only. If the user asks what a repository actually does under the hood, asks to find every usage, asks for fallback order, or asks for implementation details, use Retriever first for docs context and then inspect/search the relevant source files. Make the distinction explicit in the answer when code behavior differs from documented guidance.

Do not use web search or external documentation while this skill is active unless the user explicitly asks for current/latest external docs, or the local corpus is missing/unavailable. If local evidence does not mention an option or term, say that it was not surfaced in the local docs instead of searching the web to fill the gap.

If the MCP tools are unavailable, fall back to the CLI:

```bash
retriever local ask ./path/to/docs "What are the warranty limitations?" --output json
```

The MCP tools and CLI default to local GPU inference. Use `inference="remote"` or `--inference remote` only when the user explicitly wants hosted embeddings or local inference is unavailable.

For debugging or validation, prefer the explicit workflow:

```bash
retriever local init ./path/to/docs --index ./.nemo-retriever/local-index --output json
retriever local search "What are the warranty limitations?" --index ./.nemo-retriever/local-index --output json
retriever local status --index ./.nemo-retriever/local-index --output json
```

For recovery, run:

```bash
retriever local doctor --index ./.nemo-retriever/local-index --input-path ./path/to/docs --output json
```

Use remote hosted embeddings when the user has no local GPU:

```bash
retriever local ask ./path/to/docs "What changed?" --inference remote --output json
```

This requires `NVIDIA_API_KEY` or `NGC_API_KEY`; pass `--embed-invoke-url` only when using a non-default embedding endpoint.

## Validation

After indexing, check `documents_processed`, `chunk_count`, `warnings`, and `next_recommended_command` in JSON output. For `ask`, check `resolved_input_path`, `corpus_root`, `index_action`, `indexed_now`, `reused_index`, `reindexed`, and `reindex_reasons` so the user can tell which corpus was used and whether the index was reused or rebuilt. The MCP tool and CLI return retrieved evidence rather than generated prose; synthesize the answer yourself from `evidence[].chunk_text` and cite `evidence[].source_file`, `evidence[].page`, and useful section metadata when present.

Only cite files that appear in returned `evidence[].source_file` or files you explicitly read after retrieval. Do not add extra source filenames from memory or from likely related docs.

Run one focused `ask` command per user question by default. Only run a follow-up Retriever query when the evidence is empty, clearly off-topic, stale, or insufficient for the user's requested level of completeness. Follow-up queries must stay within the local corpus and should target terms from the user's question or from the first Retriever evidence, not speculative categories. If results are empty or warnings mention staleness, run `status` and `doctor`, then re-run `init` if the corpus changed.

## Production Boundary

If the user needs a shared assistant, governed data, role-based access, auditability, continuous refresh, or service-level latency, explain that a single-user local index is the wrong lifecycle. Route them toward deployed, governed retrieval infrastructure instead of indexing locally.
Original file line number Diff line number Diff line change
@@ -0,0 +1,11 @@
interface:
display_name: "NeMo Retriever Local Search"
short_description: "Ask questions over a small local document folder with NeMo Retriever."
default_prompt: "Use NeMo Retriever local document search to answer questions over a local file or folder, citing retrieved evidence."

dependencies:
tools:
- type: "mcp"
value: "nemo_retriever_local"
description: "Repo-local NeMo Retriever document search MCP server."
transport: "stdio"
7 changes: 7 additions & 0 deletions .codex/config.toml.example
Original file line number Diff line number Diff line change
@@ -0,0 +1,7 @@
[mcp_servers.nemo_retriever_local]
command = "/absolute/path/to/repo/nemo_retriever/.venv/bin/retriever-local-mcp"
args = []
# Omit cwd so the MCP server inherits the Codex project/session directory.
startup_timeout_sec = 60
tool_timeout_sec = 3600
enabled_tools = ["local_document_ask", "local_document_search", "local_document_status"]
8 changes: 8 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -272,3 +272,11 @@ nemo_retriever/tabular-dev-tools/benchmarks/**/*.sql
nemo_retriever/tabular-dev-tools/benchmarks/**/*.json

spool/

# Local document search indexes
.nemo-retriever/
.hf-modules/

# Machine-local Codex MCP config
.codex/config.toml
!.codex/config.toml.example
1 change: 1 addition & 0 deletions nemo_retriever/pyproject.toml
Original file line number Diff line number Diff line change
Expand Up @@ -157,6 +157,7 @@ all = [
[project.scripts]
retriever = "nemo_retriever.__main__:main"
retriever-harness = "nemo_retriever.harness:main"
retriever-local-mcp = "nemo_retriever.local.mcp_server:main"

[tool.setuptools.dynamic]
version = {attr = "nemo_retriever.version.get_build_version"}
Expand Down
5 changes: 1 addition & 4 deletions nemo_retriever/src/nemo_retriever/local/__main__.py
Original file line number Diff line number Diff line change
Expand Up @@ -4,10 +4,7 @@

from __future__ import annotations

import typer


app = typer.Typer(help="Simple non-distributed pipeline for local development, debugging, and research.")
from nemo_retriever.local.document_search import app


def main():
Expand Down
Loading
Loading