Skip to content

(service) default results_dir and lancedb_uri to the current working …#1998

Open
edknv wants to merge 2 commits intoNVIDIA:mainfrom
edknv:fix/service-bundled-yaml-portable-paths
Open

(service) default results_dir and lancedb_uri to the current working …#1998
edknv wants to merge 2 commits intoNVIDIA:mainfrom
edknv:fix/service-bundled-yaml-portable-paths

Conversation

@edknv
Copy link
Copy Markdown
Collaborator

@edknv edknv commented May 8, 2026

…directory

Description

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.
  • If adjusting docker-compose.yaml environment variables have you ensured those are mimicked in the Helm values.yaml file.

@edknv edknv requested review from a team as code owners May 8, 2026 01:22
@edknv edknv requested review from jdye64, jperez999 and nkmcalli and removed request for nkmcalli May 8, 2026 01:22
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented May 8, 2026

Greptile Summary

This PR replaces developer-specific absolute paths (/home/local/jdyer/...) in the bundled retriever-service.yaml with portable relative defaults ("retriever_results" and "lancedb"), so retriever service start works out of the box without any configuration. It also exposes --results-dir and --lancedb-uri as explicit CLI override flags.

  • config.py / retriever-service.yaml: lancedb_uri default changed from /var/lib/nemo-retriever/lancedb (absolute container path) to "lancedb" (relative to CWD); results_dir default was already relative but the bundled YAML previously held the developer's home directory path — both are now portable.
  • cli.py: Adds --results-dir and --lancedb-uri as first-class override flags, wired through the existing overrides dict to load_config.
  • tests/test_service_bundled_config.py: New test file with two tests: one verifying the bundled YAML parses cleanly, and one asserting neither writable-state default is an absolute path.

Confidence Score: 5/5

Safe to merge; the change removes hardcoded developer paths and makes zero-config local startup work correctly.

All changed code paths are straightforward: the bundled YAML and Pydantic model defaults now both use portable relative paths, the new CLI flags wire cleanly through the existing overrides dict, and the new tests cover the intended invariants. No behavioral regressions are introduced for existing users who supply explicit paths via YAML or CLI.

No files require special attention.

Important Files Changed

Filename Overview
nemo_retriever/src/nemo_retriever/service/cli.py Adds --results-dir and --lancedb-uri CLI options; wired correctly through the overrides dict to load_config.
nemo_retriever/src/nemo_retriever/service/config.py lancedb_uri default changed from absolute container path to relative "lancedb"; comment added explaining CWD-relative semantics.
nemo_retriever/src/nemo_retriever/service/retriever-service.yaml Replaces hard-coded developer home-directory paths with portable relative defaults; adds explanatory comments about Helm override.
nemo_retriever/tests/test_service_bundled_config.py New test file; test_bundled_yaml_uses_cwd_relative_paths can pass trivially when the bundled YAML is absent (falls through to Pydantic defaults which are already relative).

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["retriever service start\n(CLI)"] --> B{Config path\nprovided?}
    B -- "--config <path>" --> C[Load explicit YAML]
    B -- "None" --> D{./retriever-service.yaml\nexists in CWD?}
    D -- Yes --> E[Load local YAML]
    D -- No --> F{Bundled package\nYAML exists?}
    F -- Yes --> G[Load bundled YAML\ndefaults: relative paths]
    F -- No --> H[Pydantic defaults\nall relative paths]
    C --> I[Apply CLI overrides]
    E --> I
    G --> I
    H --> I
    I --> J{--results-dir?}
    J -- set --> K["processing.results_dir\n= CLI value"]
    J -- unset --> L["processing.results_dir\n= 'retriever_results'"]
    I --> M{--lancedb-uri?}
    M -- set --> N["vector_store.lancedb_uri\n= CLI value"]
    M -- unset --> O["vector_store.lancedb_uri\n= 'lancedb'"]
    K --> P[ServiceConfig]
    L --> P
    N --> P
    O --> P
Loading

Reviews (2): Last reviewed commit: "lint" | Re-trigger Greptile

Comment thread nemo_retriever/tests/test_service_bundled_config.py Outdated
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant