RAG Library Mode integration #1440

ericevans-nv · 2026-01-20T21:57:25Z

Description

Adds the nvidia_nat_rag_lib plugin package that integrates the NVIDIA RAG Blueprint pipeline into NeMo Agent Toolkit, providing grounded retrieval and generation capabilities.

Note

The nvidia-rag>=2.4.0 package is scheduled for release on PyPI around February 10th, 2026.

What's Included

New Package: nvidia_nat_rag_lib

search tool: Retrieves grounded document excerpts from indexed sources
generate tool: Synthesizes answers grounded in cited source material
Declarative YAML configuration for RAG pipeline features:
- Two-stage retrieval (vector search + semantic reranking)
- Query rewriting via LLM
- Confidence filtering
- Structured citations

Example Configuration

examples/RAG/simple_rag/configs/rag_library_mode_config.yml demonstrating the plugin
Updated README with setup instructions, bootstrap data commands, and example output

Ingestion Script Enhancement

Added --drop_collection flag to scripts/langchain_web_ingest.py for easier re-ingestion with different embedding models

By Submitting this PR I confirm:

I am familiar with the Contributing Guidelines.
We require that all contributors "sign-off" on their commits. This certifies that the contribution is your original work, or you have rights to submit it under the same license, or a compatible license.
- Any contribution which contains commits that are not Signed-Off will not be accepted.
When the PR is ready for review, new or existing tests cover these changes.
When the PR is ready for review, the documentation is up to date with these changes.

Summary by CodeRabbit

New Features
- NVIDIA RAG Library: end-to-end RAG tools (search & generate) with reranking, query rewriting, citations, confidence filtering, multi-collection search, and a ready-to-use function group.
- CLI: options to drop an existing collection before ingest and to select embedding model.
- Embedding config: optional embedding output-dimension setting.
Documentation
- Expanded README with NVIDIA RAG Library guide, examples, and example RAG config.
Tests
- Added unit and integration tests covering RAG models, tools, and integration workflows.
Chores
- Added optional rag-lib dependency and packaging/license placeholders for the new RAG subpackage.

_{✏️ Tip: You can customize this high-level summary in your review settings.}

Signed-off-by: Eric Evans <194135482+ericevans-nv@users.noreply.github.com>

…ag-lib-mode-integration

Signed-off-by: Eric Evans <194135482+ericevans-nv@users.noreply.github.com>

…ag-lib-mode-integration

…roper testing Signed-off-by: Eric Evans <194135482+ericevans-nv@users.noreply.github.com>

Signed-off-by: Eric Evans <194135482+ericevans-nv@users.noreply.github.com>

coderabbitai · 2026-01-20T21:57:44Z

Walkthrough

Adds a new NVIDIA RAG Library subpackage with configs, Pydantic models, an async adapter exposing search/generate tools, unit and integration tests, example config/docs, workspace/dependency registrations, a small embedder config field, and CLI flags for the ingestion script.

Changes

Cohort / File(s)	Summary
New NVIDIA RAG package `packages/nvidia_nat_rag_lib/pyproject.toml`, `packages/nvidia_nat_rag_lib/...`	Introduces the nvidia-nat-rag-lib package: project metadata, build config, dependencies (nvidia-nat~=1.5, nvidia-rag>=2.4.0), setuptools_scm versioning, UV workspace mapping, and entry point `nat.plugins.rag_lib.register`.
RAG plugin code `packages/nvidia_nat_rag_lib/src/nat/plugins/rag_lib/__init__.py`, `.../client.py`, `.../config.py`, `.../models.py`, `.../register.py`	Adds `RAGPipelineConfig` and result models, plus `nvidia_rag_lib` async adapter that resolves LLM/embedder/retriever refs, instantiates NvidiaRAG, and exposes `search` and `generate` tools with streaming and error handling.
Package metadata & docs `packages/nvidia_nat_rag_lib/LICENSE-3rd-party.txt`, `packages/nvidia_nat_rag_lib/LICENSE.md`, `packages/nvidia_nat_rag_lib/src/nat/meta/pypi.md`	Adds license reference files and PyPI-facing package description and feature notes.
Tests for rag-lib `packages/nvidia_nat_rag_lib/tests/test_models.py`, `packages/nvidia_nat_rag_lib/tests/test_rag_lib_function.py`, `packages/nvidia_nat_rag_lib/tests/test_tools.py`	Adds unit tests for models and tools and extensive pytest integration tests (Milvus-backed), fixtures, parametrization, and an expected xfail case.
Examples & README `examples/RAG/simple_rag/configs/rag_library_mode_config.yml`, `examples/RAG/simple_rag/README.md`	Adds example YAML wiring NVIDIA components and an "Advanced RAG with NVIDIA RAG Library" README section; reorganizes README structure.
Workspace & dependency updates `pyproject.toml`, `packages/nvidia_nat_all/pyproject.toml`	Registers `nvidia-nat-rag-lib` as a workspace package, adds optional dependency group `rag-lib`, and includes the package in nvidia_nat_all dependencies and UV sources.
Embedder config change `src/nat/embedder/nim_embedder.py`	Adds optional `dimensions: int
Ingestion script enhancements `scripts/langchain_web_ingest.py`	Adds CLI flags `--drop_collection` and `--embedding_model`, imports `MilvusClient`, and optionally drops an existing Milvus collection before ingestion.
Docker compose formatting `examples/deploy/docker-compose.milvus.yml`	Removes a trailing empty line (formatting-only).
Vale vocabulary `ci/vale/styles/config/vocabularies/nat/accept.txt`	Adds regex entries for "Declaratively" and "Rerank/ Reranking".

Sequence Diagram(s)

sequenceDiagram
  participant Builder as Builder
  participant Resolver as ConfigResolver
  participant NvidiaRAG as NvidiaRAG
  participant VectorStore as Retriever/VectorStore
  participant FunctionGroup as FunctionGroup

  Builder->>Resolver: Build NvidiaRAGConfig from NvidiaRAGLibConfig
  Resolver->>Resolver: Resolve LLM, Embedder, Retriever refs
  Resolver->>VectorStore: Validate/configure collection_names & retriever
  Resolver-->>Builder: Return populated NvidiaRAGConfig
  Builder->>NvidiaRAG: Instantiate client with NvidiaRAGConfig
  NvidiaRAG-->>Builder: Client ready
  Builder->>FunctionGroup: Yield FunctionGroup wrapping client
  FunctionGroup->>NvidiaRAG: search(query) / generate(query)
  NvidiaRAG-->>FunctionGroup: search results / streamed deltas

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 60.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title clearly and concisely describes the main change: integration of a RAG (Retrieval-Augmented Generation) Library Mode, which is the primary focus of this PR introducing the nvidia_nat_rag_lib plugin package.
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing touches

📝 Generate docstrings

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 15

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

pyproject.toml (1)
113-126: Add nvidia-nat-rag-lib to the telemetry optional dependency group.

Per coding guidelines, the telemetry group must include rag-lib. Update line 124 to include it in alphabetical order.
🔧 Suggested change
 telemetry = [
   "nvidia-nat-data-flywheel",
   "nvidia-nat-opentelemetry",
   "nvidia-nat-phoenix",
   "nvidia-nat-ragaai",
+  "nvidia-nat-rag-lib",
   "nvidia-nat-weave",
 ]

🤖 Fix all issues with AI agents

In `@examples/deploy/docker-compose.milvus.yml`:
- Around line 24-25: Update all example and docs that reference MinIO on
localhost:9000 to the new host port mapping (localhost:19000 or
http://localhost:19000) so they match docker-compose.milvus.yml's "-
"19000:9000"" mapping; specifically change endpoint_url / S3_ENDPOINT_URL
entries in examples/object_store/user_report/configs/config_s3.yml
(endpoint_url), examples/object_store/user_report/README.md (URLs),
examples/evaluation_and_profiling/simple_web_query_eval/README.md
(S3_ENDPOINT_URL), docs/source/build-workflows/object-store.md (all
localhost:9000 mentions),
docs/source/components/auth/mcp-auth/mcp-auth-token-storage.md (endpoint_url),
and docs/source/extend/custom-components/object-store.md (endpoint_url), or
alternatively add a short note in those files explaining why
docker-compose.milvus.yml maps host ports to 19000/19001 while
docker-compose.minio.yml uses 9000:9000.

In `@examples/RAG/simple_rag/configs/rag_library_mode_config.yml`:
- Around line 73-78: The generate block for rag_lib_generate incorrectly claims
reflection is used while the reflection configuration is commented out; either
re-enable the reflection settings for the generate pipeline (uncomment and wire
up the reflection configuration) or update the description field in the generate
block to remove mention of reflection so it accurately reflects current
behavior; locate the generate:_type: rag_lib_generate block and modify the
description or restore the reflection section to match the stated behavior.
- Around line 53-60: Remove the TODO/BUG comments and the commented-out
reflection block (the lines referencing reflection, enable_reflection,
max_loops, context_relevance_threshold, response_groundedness_threshold and the
BUG note about check_context_relevance()) from the example config; if you need
to preserve traceability, replace those comments with a single short
tracking-issue reference (e.g., "See issue `#1234`") rather than TODO/BUG text so
the config stays clean and guideline-compliant.

In `@packages/nvidia_nat_rag_lib/LICENSE.md`:
- Line 1: Add the standard SPDX header line "SPDX-License-Identifier:
Apache-2.0" (or the full Apache-2.0 SPDX header as required by project
guidelines) at the top of this Markdown pointer file, placing it immediately
above the existing reference line "../../LICENSE.md" so the file begins with the
SPDX identifier followed by the pointer line.

In `@packages/nvidia_nat_rag_lib/src/nat/plugins/rag_lib/client.py`:
- Line 1: Update the copyright header in client.py to match the package
convention by changing the year token "2026" to the range "2025-2026"; locate
the top-of-file SPDX comment in
packages/nvidia_nat_rag_lib/src/nat/plugins/rag_lib/client.py (the
SPDX-FileCopyrightText line) and replace the single-year value with the dashed
range used elsewhere.

In `@packages/nvidia_nat_rag_lib/src/nat/plugins/rag_lib/models.py`:
- Around line 16-43: The models import currently guards Citations behind
TYPE_CHECKING which leaves Citations undefined at runtime and breaks Pydantic v2
validation; fix by importing Citations at module runtime (remove the
TYPE_CHECKING guard and add from nvidia_rag.rag_server.response_generator import
Citations) and update the type annotations on RAGSearchResult.citations and
RAGGenerateResult.citations to use the actual Citations type (not a string) so
Pydantic can resolve the type without requiring model_rebuild().

In `@packages/nvidia_nat_rag_lib/src/nat/plugins/rag_lib/register.py`:
- Line 1: Update the SPDX copyright header line that currently reads
"SPDX-FileCopyrightText: Copyright (c) 2026, NVIDIA CORPORATION & AFFILIATES.
All rights reserved." to use the range "2025-2026" for consistency with
client.py; locate the SPDX header at the top of register.py (the
SPDX-FileCopyrightText line) and replace the single year with "2025-2026".

In `@packages/nvidia_nat_rag_lib/src/nat/plugins/rag_lib/tools/__init__.py`:
- Around line 1-14: Add a module-level docstring to the top of
nat.plugins.rag_lib.tools __init__.py (immediately after the existing
SPDX/license header) describing the package purpose (e.g., "RAG library tool
functions."). Preserve the existing header exactly, do not change imports or
re-exports (they are optional), and ensure the docstring is a simple
triple-quoted string placed at module scope so it satisfies the coding
guidelines.

In `@packages/nvidia_nat_rag_lib/src/nat/plugins/rag_lib/tools/search.py`:
- Around line 1-64: Add a module-level docstring describing the public purpose
of this file, then convert the existing docstrings for the RAGLibSearchConfig
class and rag_lib_search function into Google-style docstrings: for
RAGLibSearchConfig include a short summary, an Args section documenting
`rag_client`, `topic`, `description`, `collection_names`, `reranker_top_k`,
`vdb_top_k`, `confidence_threshold`, `enable_query_rewriting`,
`enable_reranker`, `enable_filter_generator`, and `filter_expr` (with
types/backticked identifiers) and any Defaults; for rag_lib_search include a
short summary and an Args section documenting `config` (`RAGLibSearchConfig`)
and `builder` (`Builder`) and a Returns section describing the return type (if
any) or `None`; update formatting to use Google style and ensure public symbols
RAGLibSearchConfig and rag_lib_search are covered.
- Around line 62-63: The public async generator rag_lib_search currently lacks
an explicit return type; update its signature to include
AsyncIterator[FunctionInfo] (or AsyncGenerator[FunctionInfo, None]) as the
return annotation on rag_lib_search(config: RAGLibSearchConfig, builder:
Builder) and add the corresponding typing import (AsyncIterator or
AsyncGenerator) plus ensure FunctionInfo is imported/available; this makes the
public API fully type-hinted while preserving the existing generator behavior.

In `@packages/nvidia_nat_rag_lib/tests/test_rag_lib_function.py`:
- Around line 569-576: Replace the TODO comment in the embedder parameterization
by explicitly marking the failing option as expected-fail instead of leaving a
TODO: add "nim_embedder_e5" back into the parametrized list as a pytest.param
with pytest.mark.xfail(reason="nv-embedqa-e5-v5 rejects dimensions param —
reference <issue-number or URL>"), or alternatively replace the TODO with a
one-line comment containing the issue reference; update the same pattern for the
other parametrizations. Target the parametrization that defines "embedder_ref"
in test_rag_lib_function.py to ensure the failing embedder is either xfailed
with a clear reason or annotated with the issue reference.
- Around line 15-31: Remove the "TODO: Add integration tests..." block from the
module docstring (the top-level docstring in this test module that contains
"KNOWN ISSUE - Embedding Model Compatibility"); move those TODO items into your
issue tracker or a dedicated design/roadmap doc and replace the removed TODO
with a brief issue reference (e.g., "See issue `#1234`") or a short note pointing
to the external tracking doc so the docstring no longer contains TODOs but still
links to the tracking location.
- Around line 506-525: The tests test_generate_method_exists,
test_search_method_exists, and test_health_method_exists use getattr with
constant names which triggers Ruff B009; change each assertion to use direct
attribute access after the hasattr check (e.g. replace
callable(getattr(NvidiaRAG, "generate")) with callable(NvidiaRAG.generate), and
similarly for NvidiaRAG.search and NvidiaRAG.health) so the tests reference the
methods directly.

In `@pyproject.toml`:
- Around line 113-114: The pyproject.toml was updated to include the optional
dependency "nvidia-nat-rag-lib" under the rag-lib extras but uv.lock was not
regenerated; run the sync to update the lockfile (e.g., execute the suggested
tooling command to install and sync the package: run uv pip install
nvidia-nat-rag-lib --sync) so uv.lock reflects the new dependency and versions,
then commit the updated uv.lock alongside the pyproject.toml change.

In `@src/nat/embedder/nim_embedder.py`:
- Line 53: Update the Pydantic Field definition for the dimensions attribute so
invalid values are rejected at validation time: add the gt=0 constraint to the
existing dimensions: int | None = Field(default=None, description="Embedding
output dimensions.") declaration in src/nat/embedder/nim_embedder.py (the
dimensions field in the Nim embedder config) so that zero or negative integers
are prohibited, matching the pattern used for top_k (e.g., gt=0).

🧹 Nitpick comments (6)

packages/nvidia_nat_rag_lib/src/nat/plugins/rag_lib/models.py (1)
1-15: Add a module docstring for the public models module.

Public modules should have Google-style docstrings. As per coding guidelines, add a concise module docstring.
🛠️ Proposed fix
 # SPDX-FileCopyrightText: Copyright (c) 2025-2026, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
 # SPDX-License-Identifier: Apache-2.0
 #
 # Licensed under the Apache License, Version 2.0 (the "License");
 # you may not use this file except in compliance with the License.
 # You may obtain a copy of the License at
 #
 # http://www.apache.org/licenses/LICENSE-2.0
 #
 # Unless required by applicable law or agreed to in writing, software
 # distributed under the License is distributed on an "AS IS" BASIS,
 # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 # See the License for the specific language governing permissions and
 # limitations under the License.
+
+"""RAG result models."""
packages/nvidia_nat_rag_lib/src/nat/plugins/rag_lib/config.py (1)
46-51: Wrap nvidia_rag in backticks in the class docstring.

Docstrings should wrap code entities in backticks to avoid Vale false positives. As per coding guidelines, update the docstring.
🛠️ Proposed fix
-    """Native nvidia_rag pipeline settings.
+    """Native `nvidia_rag` pipeline settings.
packages/nvidia_nat_rag_lib/src/nat/plugins/rag_lib/tools/generate.py (1)
122-123: Silent exception swallowing may hide parsing issues.

The bare except Exception: continue silently ignores all parsing errors without any logging. While some malformed chunks are expected in streaming scenarios, silently discarding all exceptions makes debugging difficult when legitimate issues occur.

Consider logging at debug level to aid troubleshooting:
♻️ Suggested improvement
                 except Exception:
-                    continue
+                    logger.debug("Failed to parse chunk: %s", raw_chunk)
+                    continue
packages/nvidia_nat_rag_lib/src/nat/plugins/rag_lib/client.py (1)
147-169: Consider extracting the repeated propagation pattern.

The same three-field propagation logic (model_name, server_url, api_key) is repeated for query_rewriter, reflection, and filter_expression_generator. A helper function could reduce duplication and improve maintainability.
♻️ Optional refactor suggestion
def _propagate_llm_to_component(component, llm: NIMModelConfig) -> None:
    """Propagate LLM config to a component if not already set."""
    if "model_name" not in component.model_fields_set:
        component.model_name = llm.model_name
    if "server_url" not in component.model_fields_set and llm.base_url:
        component.server_url = llm.base_url
    if "api_key" not in component.model_fields_set and llm.api_key:
        component.api_key = llm.api_key
Then use:
_propagate_llm_to_component(rag_config.query_rewriter, llm)
_propagate_llm_to_component(rag_config.reflection, llm)
_propagate_llm_to_component(rag_config.filter_expression_generator, llm)
packages/nvidia_nat_rag_lib/tests/test_models.py (1)
32-35: Fixture naming convention.

Per coding guidelines, fixture functions should use the fixture_ prefix on the function name and a name argument in the decorator. However, this is a minor style consideration.
♻️ Optional fix for fixture naming
-    `@pytest.fixture`
-    def citations(self) -> Citations:
+    `@pytest.fixture`(name="citations")
+    def fixture_citations(self) -> Citations:
         """Create Citations object."""
         return Citations(total_results=2, results=[])
Also applies to: 54-57
packages/nvidia_nat_rag_lib/tests/test_tools.py (1)
37-54: Align pytest fixtures with the fixture_ naming convention.

Fixtures here don’t follow the fixture_ prefix + @pytest.fixture(name="...") pattern. Please update the fixture functions (and adjust usages) for consistency.
🔧 Suggested pattern
-    `@pytest.fixture`
-    def mock_builder(self) -> MagicMock:
+    `@pytest.fixture`(name="mock_builder")
+    def fixture_mock_builder(self) -> MagicMock:
         return MagicMock(spec=Builder)
Apply the same pattern to `tool_config` and `function_group`. As per coding guidelines, fixtures should follow the `fixture_` naming convention.

examples/deploy/docker-compose.milvus.yml

examples/RAG/simple_rag/configs/rag_library_mode_config.yml

packages/nvidia_nat_rag_lib/LICENSE.md

packages/nvidia_nat_rag_lib/src/nat/plugins/rag_lib/client.py

packages/nvidia_nat_rag_lib/tests/test_rag_lib_function.py

pyproject.toml

src/nat/embedder/nim_embedder.py

examples/RAG/simple_rag/configs/rag_library_mode_config.yml

willkill07

Just a round of comments to think about the configuration space.

Right now there are so many knobs and required interconnection for this to work. I think it can be simplified a lot without reducing any functionality.

Remember, this is RAG Library Mode for NeMo Agent Toolkit, not exposing every feature of RAG Library Mode within NeMo Agent Toolkit.

examples/deploy/docker-compose.milvus.yml

packages/nvidia_nat_rag_lib/src/nat/plugins/rag_lib/client.py

packages/nvidia_nat_rag_lib/src/nat/plugins/rag_lib/config.py

packages/nvidia_nat_rag_lib/src/nat/plugins/rag_lib/tools/generate.py

packages/nvidia_nat_rag_lib/src/nat/plugins/rag_lib/tools/search.py

…ag-lib-mode-integration

Signed-off-by: Eric Evans <194135482+ericevans-nv@users.noreply.github.com>

…ag-lib-mode-integration

coderabbitai

Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

examples/RAG/simple_rag/README.md (1)
114-127: Update the CLI help snippet to include new flags.

The script now supports --drop_collection and --embedding_model, but the documented usage output omits them. As per coding guidelines, keep docs accurate and comprehensive.
✏️ Suggested update
-usage: langchain_web_ingest.py [-h] [--urls URLS] [--collection_name COLLECTION_NAME] [--milvus_uri MILVUS_URI] [--clean_cache]
+usage: langchain_web_ingest.py [-h] [--urls URLS] [--collection_name COLLECTION_NAME] [--milvus_uri MILVUS_URI] [--clean_cache] [--drop_collection] [--embedding_model EMBEDDING_MODEL]
 --clean_cache         If true, deletes local files (default: False)
+--drop_collection     Drop existing collection before ingesting (default: False)
+--embedding_model     Embedding model to use (default: nvidia/nv-embedqa-e5-v5)

🤖 Fix all issues with AI agents

In `@examples/RAG/simple_rag/README.md`:
- Around line 422-425: The fenced code block showing the pipeline flow ("Query →
Embed → Retrieve candidates → Rerank → Final results") is missing a language
specifier; update that code fence to include a language tag (e.g., text) after
the opening ``` so it becomes ```text to satisfy markdownlint and clarify the
content type.
- Around line 29-57: Update the documentation to replace the acronym "NAT" with
"NeMo Agent toolkit" in body text and use "NeMo Agent Toolkit" (capitalized
"Toolkit") in headings: change the Table of Contents entry "Integration with NAT
Components" and the corresponding heading "Integration with NAT Components" as
well as any other occurrences in the README (e.g., the advanced section
references and the two places mentioned around the end of the file) so body
sentences read "NeMo Agent toolkit" and headings read "NeMo Agent Toolkit";
ensure all TOC links and anchors match the updated heading text.

In `@packages/nvidia_nat_rag_lib/src/nat/plugins/rag_lib/client.py`:
- Around line 125-136: The exception handler in the chunk parsing loop (where
ChainResponse.model_validate_json is called and parsed.choices /
parsed.citations are processed) currently swallows errors; change the bare
except to catch Exception as e and call logger.exception("Error parsing
streaming chunk") (or similar) to record the stack trace and chunk context
before continuing; ensure the existing logger object is used (or
imported/defined if missing) and preserve the continue behavior after logging.
- Around line 74-76: The public async generator nvidia_rag_lib is missing an
explicit return type; update its signature to include the return type
AsyncGenerator[FunctionGroup, None] (i.e., change "async def
nvidia_rag_lib(config: NvidiaRAGLibConfig, builder: Builder):" to "async def
nvidia_rag_lib(config: NvidiaRAGLibConfig, builder: Builder) ->
AsyncGenerator[FunctionGroup, None]:") and add the required import for
AsyncGenerator (from typing) and FunctionGroup (where it is defined) so the
annotation resolves.

♻️ Duplicate comments (3)

packages/nvidia_nat_rag_lib/src/nat/plugins/rag_lib/register.py (1)
1-2: Align SPDX year range with the rest of the package.

Other files in this package use 2025-2026; please keep the header consistent here as well. As per coding guidelines, confirm copyright years are up to date.
🔧 Suggested fix
-# SPDX-FileCopyrightText: Copyright (c) 2026, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+# SPDX-FileCopyrightText: Copyright (c) 2025-2026, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
pyproject.toml (1)
113-114: Sync uv.lock after adding nvidia-nat-rag-lib.

pyproject.toml now declares the optional dependency, so the lockfile should include it as well. As per coding guidelines, keep dependency declarations and uv.lock in sync.
#!/bin/bash
# Verify the lockfile includes the new optional dependency.
rg -n "nvidia-nat-rag-lib" uv.lock
packages/nvidia_nat_rag_lib/src/nat/plugins/rag_lib/client.py (1)
1-2: Align SPDX year range with the rest of the package.

Other files in this package use 2025-2026; please keep the header consistent here as well. As per coding guidelines, confirm copyright years are up to date.
🔧 Suggested fix
-# SPDX-FileCopyrightText: Copyright (c) 2026, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+# SPDX-FileCopyrightText: Copyright (c) 2025-2026, NVIDIA CORPORATION & AFFILIATES. All rights reserved.

🧹 Nitpick comments (1)

scripts/langchain_web_ingest.py (1)

37-53: Avoid blocking the event loop when dropping collections.

MilvusClient is synchronous, and calling it inside async def main(...) can block the loop. Consider moving the drop to a worker thread (or use an async Milvus client if available).

♻️ Possible refactor using asyncio.to_thread

     if drop_collection:
-        client = MilvusClient(uri=milvus_uri)
-        if client.has_collection(collection_name):
-            logger.info("Dropping existing collection: %s", collection_name)
-            client.drop_collection(collection_name)
-        else:
-            logger.info("Collection '%s' does not exist, nothing to drop", collection_name)
+        import asyncio
+
+        def _drop_collection() -> None:
+            client = MilvusClient(uri=milvus_uri)
+            if client.has_collection(collection_name):
+                logger.info("Dropping existing collection: %s", collection_name)
+                client.drop_collection(collection_name)
+            else:
+                logger.info("Collection '%s' does not exist, nothing to drop", collection_name)
+
+        await asyncio.to_thread(_drop_collection)

examples/RAG/simple_rag/README.md

packages/nvidia_nat_rag_lib/src/nat/plugins/rag_lib/client.py

Signed-off-by: Eric Evans <194135482+ericevans-nv@users.noreply.github.com>

coderabbitai

Actionable comments posted: 1

🤖 Fix all issues with AI agents

In `@examples/RAG/simple_rag/README.md`:
- Line 375: Update the body text string "NeMo Agent Toolkit" to "NeMo Agent
toolkit" in the README where the NVIDIA RAG Library plugin description appears
(the line containing "The NVIDIA RAG Library plugin (`nvidia_rag_lib`)
integrates the ... NeMo Agent Toolkit"); ensure only body text uses the
lowercase "toolkit" while leaving any headings unchanged.

♻️ Duplicate comments (1)

packages/nvidia_nat_rag_lib/src/nat/plugins/rag_lib/client.py (1)
117-128: Don’t silently swallow streaming chunk parse errors.

The empty except makes malformed chunks hard to debug; log the exception and keep the continue. As per coding guidelines, use logger.exception() when catching without re-raising.
🔧 Suggested fix
-                except Exception:
-                    continue
+                except Exception:
+                    logger.exception("Failed to parse streaming chunk")
+                    continue

🧹 Nitpick comments (1)

packages/nvidia_nat_rag_lib/src/nat/plugins/rag_lib/client.py (1)
1-16: Add a module-level Google-style docstring.

This is a public module but has no top-level docstring; add a concise module docstring after the SPDX header. As per coding guidelines, public modules should have Google-style docstrings.
♻️ Suggested addition
@@
 # limitations under the License.
+
+"""NVIDIA RAG Library client integration.
+
+Provides the `nvidia_rag_lib` function group and config resolution helpers for
+NVIDIA RAG Library within NeMo Agent Toolkit.
+"""

examples/RAG/simple_rag/README.md

Signed-off-by: Eric Evans <194135482+ericevans-nv@users.noreply.github.com>

coderabbitai

Actionable comments posted: 1

🤖 Fix all issues with AI agents

In `@packages/nvidia_nat_rag_lib/src/nat/plugins/rag_lib/client.py`:
- Around line 15-16: The module is missing a Google-style module docstring; add
a concise docstring at the top of
packages/nvidia_nat_rag_lib/src/nat/plugins/rag_lib/client.py immediately after
the SPDX header and before the import logging line that summarizes the module
purpose, public classes/functions, and usage examples if applicable (follow
Google Python docstring style and keep it brief).

♻️ Duplicate comments (2)

packages/nvidia_nat_rag_lib/src/nat/plugins/rag_lib/client.py (1)
117-128: Log streaming chunk parse errors instead of silently continuing.
This swallow-and-continue makes malformed chunk issues hard to diagnose.
✏️ Suggested fix
-                except Exception:
-                    continue
+                except Exception:
+                    logger.exception("Failed to parse streaming chunk")
+                    continue
Based on learnings, ...
examples/RAG/simple_rag/README.md (1)
369-376: Use lowercase "toolkit" in body text.

Line 371 uses "NeMo Agent Toolkit" (capital T) in body text. Per the coding guidelines, "Toolkit" should only be capitalized in headings; body text should use "NeMo Agent toolkit" (lowercase t).
✏️ Suggested fix
-The NVIDIA RAG Library (`nvidia_rag_lib`) integrates the [NVIDIA RAG Blueprint](https://github.com/NVIDIA-AI-Blueprints/rag) pipeline into NeMo Agent Toolkit.
+The NVIDIA RAG Library (`nvidia_rag_lib`) integrates the [NVIDIA RAG Blueprint](https://github.com/NVIDIA-AI-Blueprints/rag) pipeline into NeMo Agent toolkit.

🧹 Nitpick comments (1)

packages/nvidia_nat_rag_lib/tests/test_rag_lib_function.py (1)
511-513: Mark integration tests as slow.
These tests spin up Milvus collections and call external services; add the slow marker to keep the default suite fast.
♻️ Suggested change
-@pytest.mark.integration
+@pytest.mark.integration
+@pytest.mark.slow
 class TestNvidiaRAGIntegration:
As per coding guidelines, ...

packages/nvidia_nat_rag_lib/src/nat/plugins/rag_lib/client.py

packages/nvidia_nat_rag_lib/src/nat/plugins/rag_lib/config.py

mnajafian-nv

Excellent work on this PR, I added some in line comments and suggestions as well as following two questions :

Q1) Can we add observability for production monitoring?
Right now you're logging errors, which is great! But in production, you'll want to track performance trends and alert on degradation. A quick win could be adding structured metrics to your search/generate functions.

Why this matters:

Track p50/p95/p99 latencies - "Search is suddenly taking 3 seconds instead of 500ms"
Monitor error rates - "We're seeing 5% failures on the generate endpoint"
Correlate quality metrics - "Low citation counts after the embedder upgrade"
Set up alerts - "Page if search latency > 2s for 5 minutes"

What you can do with these metrics:

Export to Prometheus/Datadog for dashboards
Set up PagerDuty alerts on SLO violations
Track before/after performance for changes
Debug production incidents with structured data

Q2) Can we level up the config documentation? e.g. Adding examples, constraints, and gotchas to your docstrings.

Your config fields have descriptions, but developers will have questions like:

"What's a good value for confidence_threshold?"
"What's the performance impact of enabling query rewriting?"
"Help, I'm getting no results, what did I misconfigure?"

packages/nvidia_nat_rag_lib/pyproject.toml

packages/nvidia_nat_rag_lib/src/nat/plugins/rag_lib/client.py

packages/nvidia_nat_rag_lib/tests/test_tools.py

packages/nvidia_nat_rag_lib/src/nat/plugins/rag_lib/client.py

Signed-off-by: Eric Evans <194135482+ericevans-nv@users.noreply.github.com>

Signed-off-by: Will Killian <wkillian@nvidia.com>

Signed-off-by: Eric Evans <194135482+ericevans-nv@users.noreply.github.com>

…ag-lib-mode-integration

coderabbitai

Actionable comments posted: 3

🤖 Fix all issues with AI agents

In `@examples/RAG/simple_rag/README.md`:
- Around line 493-510: Replace possessive forms for inanimate labels in the
example console snippet: change "Agent's thoughts" to "Agent thought" or "Agent
notes", and change "Tool's input" / "Tool's response" to "Tool input" / "Tool
response" (or "Tool request" / "Tool response") in the block showing the
cuda_search__search interaction and related lines so the documentation avoids
possessive ’s for inanimate objects.

In `@packages/nvidia_nat_rag_lib/src/nat/plugins/rag_lib/client.py`:
- Line 96: Update the ValueError message that currently reads 'base_url is
required for LLM config specified in in NVIDIA RAG Config.' to remove the
duplicated "in" so it reads 'base_url is required for LLM config specified in
NVIDIA RAG Config.'; locate the raise ValueError(...) statement in client.py
(the raise ValueError call shown) and replace the string accordingly.
- Line 113: Fix the duplicated word "in" in the ValueError message in client.py
by updating the error string raised (the raise ValueError(...) statement) to
read "base_url is required for embedder config specified in NVIDIA RAG Config."
so it matches the corrected phrasing used elsewhere (e.g., similar message at
line 96).

♻️ Duplicate comments (2)

examples/RAG/simple_rag/README.md (1)
372-375: Use lowercase “toolkit” in body text and follow first‑use naming.

Body text should use “NeMo Agent toolkit” (lowercase t). If this is the first mention in this README, it should be “NVIDIA NeMo Agent toolkit.” As written, it uses “NeMo Agent Toolkit” in body text. Based on learnings, use “toolkit” in body text and reserve “Toolkit” for headings. Based on learnings, avoid the NAT acronym and follow the toolkit naming conventions in documentation.
✏️ Suggested edit
-The NVIDIA RAG Library (`nvidia_rag_lib`) integrates the [NVIDIA RAG Blueprint](https://github.com/NVIDIA-AI-Blueprints/rag) pipeline into NeMo Agent Toolkit.
+The NVIDIA RAG Library (`nvidia_rag_lib`) integrates the [NVIDIA RAG Blueprint](https://github.com/NVIDIA-AI-Blueprints/rag) pipeline into NeMo Agent toolkit.
packages/nvidia_nat_rag_lib/src/nat/plugins/rag_lib/client.py (1)
15-16: Add a module docstring.

Public modules should include a concise Google-style docstring after the SPDX header. This was flagged in a previous review but appears unaddressed.
✏️ Suggested fix
 # limitations under the License.
 
+"""RAG library client integration for NeMo Agent Toolkit."""
+
 import logging

🧹 Nitpick comments (3)

packages/nvidia_nat_rag_lib/tests/test_rag_lib_function.py (1)
198-205: Remove unused mock_builder and retriever_ref parameters.

These parameters are injected but never used in test_search, test_generate, and test_health. The tests create NvidiaRAGConfig directly from the config dictionaries rather than resolving through mock_builder, and retriever_ref is parametrized but the test hardcodes "http://localhost:19530".

Either use these parameters to exercise the builder resolution path, or remove them to reduce test complexity and eliminate false parametrization overhead.
♻️ Option 1: Remove unused parameters (simpler)
-    `@pytest.mark.parametrize`("retriever_ref", list(RETRIEVER_CONFIGS.keys()))
     async def test_search(
         self,
-        mock_builder: MagicMock,
         create_collection,
         llm_ref: str,
         embedder_ref: str,
-        retriever_ref: str,
     ) -> None:
Apply similar changes to test_generate and test_health.
packages/nvidia_nat_rag_lib/src/nat/plugins/rag_lib/client.py (2)
156-158: Use logger.error() instead of logger.exception() when re-raising.

Per coding guidelines, when re-raising exceptions with bare raise, use logger.error() to avoid duplicate stack trace output (the caller will also log the exception).
✏️ Suggested fix
         except Exception:
-            logger.exception("RAG search failed")
+            logger.error("RAG search failed")
             raise
195-197: Use logger.error() instead of logger.exception() when re-raising.

Same issue as the search function - per coding guidelines, use logger.error() to avoid duplicate stack traces when re-raising.
✏️ Suggested fix
         except Exception:
-            logger.exception("RAG generate failed")
+            logger.error("RAG generate failed")
             raise

examples/RAG/simple_rag/README.md

packages/nvidia_nat_rag_lib/src/nat/plugins/rag_lib/client.py

willkill07 · 2026-01-22T22:07:44Z

packages/nvidia_nat_rag_lib/src/nat/plugins/rag_lib/client.py

+                    if parsed.citations and parsed.citations.results:
+                        final_citations = parsed.citations
+                except (ValueError, TypeError, KeyError) as e:
+                    logger.debug("Failed to parse RAG response chunk: %s - %s", type(e).__name__, e)


Shouldn't we fail here?

No. Individual chunk parse failures shouldn't abort the entire stream. The outer try/except handles actual operation failures, this inner handler gracefully skips malformed chunks while accumulating successful ones.

packages/nvidia_nat_rag_lib/src/nat/plugins/rag_lib/config.py

packages/nvidia_nat_rag_lib/src/nat/plugins/rag_lib/models.py

packages/nvidia_nat_rag_lib/tests/test_models.py

Signed-off-by: Eric Evans <194135482+ericevans-nv@users.noreply.github.com>

…ag-lib-mode-integration

Signed-off-by: Eric Evans <194135482+ericevans-nv@users.noreply.github.com>

ericevans-nv added 14 commits October 9, 2025 15:28

Adding RAG lib integration stubs.

ed9de49

Signed-off-by: Eric Evans <194135482+ericevans-nv@users.noreply.github.com>

Merge branch 'develop' of github.com:NVIDIA/NeMo-Agent-Toolkit into r…

2ea545b

…ag-lib-mode-integration

Merge branch 'develop' of github.com:NVIDIA/NeMo-Agent-Toolkit into r…

a76d50e

…ag-lib-mode-integration

Merge branch 'develop' of github.com:NVIDIA/NeMo-Agent-Toolkit into r…

7f77bc8

…ag-lib-mode-integration

Merge branch 'develop' of github.com:NVIDIA/NeMo-Agent-Toolkit into r…

fe095df

…ag-lib-mode-integration

Merge branch 'develop' of github.com:NVIDIA/NeMo-Agent-Toolkit into r…

0c05f40

…ag-lib-mode-integration

Merge branch 'develop' of github.com:NVIDIA/NeMo-Agent-Toolkit into r…

22c9e10

…ag-lib-mode-integration

Merge branch 'develop' of github.com:NVIDIA/NeMo-Agent-Toolkit into r…

ca34fda

…ag-lib-mode-integration

Merge branch 'develop' of github.com:NVIDIA/NeMo-Agent-Toolkit into r…

56353ee

…ag-lib-mode-integration

Pushing package structure for testing.

229028b

Signed-off-by: Eric Evans <194135482+ericevans-nv@users.noreply.github.com>

Merge branch 'develop' of github.com:NVIDIA/NeMo-Agent-Toolkit into r…

13a2dae

…ag-lib-mode-integration

Merge branch 'develop' of github.com:NVIDIA/NeMo-Agent-Toolkit into r…

0146a2f

…ag-lib-mode-integration

Add rag_lib subpackage to support NVIDIA RAG Blueprint library with p…

b3a6508

…roper testing Signed-off-by: Eric Evans <194135482+ericevans-nv@users.noreply.github.com>

Add rag library mode search and generate tools.

1978f2f

Signed-off-by: Eric Evans <194135482+ericevans-nv@users.noreply.github.com>

ericevans-nv self-assigned this Jan 20, 2026

ericevans-nv requested a review from a team as a code owner January 20, 2026 21:57

ericevans-nv added the DO NOT MERGE PR should not be merged; see PR for details label Jan 20, 2026

coderabbitai bot reviewed Jan 20, 2026

View reviewed changes

willkill07 reviewed Jan 20, 2026

View reviewed changes

examples/RAG/simple_rag/configs/rag_library_mode_config.yml Outdated Show resolved Hide resolved

willkill07 reviewed Jan 20, 2026

View reviewed changes

ericevans-nv changed the title ~~Rag lib mode integration~~ RAG Library Mode integration Jan 21, 2026

ericevans-nv added 2 commits January 21, 2026 08:55

Merge branch 'develop' of github.com:NVIDIA/NeMo-Agent-Toolkit into r…

5329557

…ag-lib-mode-integration

Add Advanced RAG example using nvidia_rag_lib package

5676a6d

Signed-off-by: Eric Evans <194135482+ericevans-nv@users.noreply.github.com>

ericevans-nv added feature request New feature or request non-breaking Non-breaking change labels Jan 22, 2026

Merge branch 'develop' of github.com:NVIDIA/NeMo-Agent-Toolkit into r…

c16d36f

…ag-lib-mode-integration

coderabbitai bot reviewed Jan 22, 2026

View reviewed changes

Improve nvidia_rag_lib optional dependency handling and test quality

ddf8d34

Signed-off-by: Eric Evans <194135482+ericevans-nv@users.noreply.github.com>

coderabbitai bot reviewed Jan 22, 2026

View reviewed changes

examples/RAG/simple_rag/README.md Outdated Show resolved Hide resolved

Refine README and update copyright dates

a6edc40

Signed-off-by: Eric Evans <194135482+ericevans-nv@users.noreply.github.com>

coderabbitai bot reviewed Jan 22, 2026

View reviewed changes

packages/nvidia_nat_rag_lib/src/nat/plugins/rag_lib/client.py Show resolved Hide resolved

willkill07 reviewed Jan 22, 2026

View reviewed changes

mnajafian-nv reviewed Jan 22, 2026

View reviewed changes

ericevans-nv and others added 4 commits January 22, 2026 10:29

Pushing temp wheel file

e1e3b64

Signed-off-by: Eric Evans <194135482+ericevans-nv@users.noreply.github.com>

enh(rag): improve config interface; change import deferment

760ba4c

Signed-off-by: Will Killian <wkillian@nvidia.com>

propagate exceptions, require collection_names, cleanup tests

5b98ec5

Signed-off-by: Eric Evans <194135482+ericevans-nv@users.noreply.github.com>

Merge branch 'develop' of github.com:NVIDIA/NeMo-Agent-Toolkit into r…

abcd328

…ag-lib-mode-integration

coderabbitai bot reviewed Jan 22, 2026

View reviewed changes

examples/RAG/simple_rag/README.md Show resolved Hide resolved

packages/nvidia_nat_rag_lib/src/nat/plugins/rag_lib/client.py Outdated Show resolved Hide resolved

packages/nvidia_nat_rag_lib/src/nat/plugins/rag_lib/client.py Outdated Show resolved Hide resolved

willkill07 reviewed Jan 22, 2026

View reviewed changes

ericevans-nv added 3 commits January 22, 2026 18:03

adding docstrings

27855e1

Signed-off-by: Eric Evans <194135482+ericevans-nv@users.noreply.github.com>

Merge branch 'develop' of github.com:NVIDIA/NeMo-Agent-Toolkit into r…

04db967

…ag-lib-mode-integration

Add reranking to Vale

0bd20be

Signed-off-by: Eric Evans <194135482+ericevans-nv@users.noreply.github.com>

RAG Library Mode integration #1440

Are you sure you want to change the base?

RAG Library Mode integration #1440

Uh oh!

Conversation

ericevans-nv commented Jan 20, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Description

What's Included

By Submitting this PR I confirm:

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Jan 20, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

willkill07 left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

mnajafian-nv left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

ericevans-nv commented Jan 20, 2026 •

edited

Loading

coderabbitai bot commented Jan 20, 2026 •

edited

Loading