docs(skills): refactor d2s script by miyoungc · Pull Request #4549 · NVIDIA/NemoClaw

miyoungc · 2026-05-29T21:42:56Z

Summary

Refactors scripts/docs-to-skills.py to a simpler two-strategy generator (grouped and individual) and regenerates .agents/skills/nemoclaw-user-* output to match. The old smart strategy, 11,500-character spill logic, and *-details.md deferral files are removed; sibling pages now land in references/ unchanged, and procedure pages inline in full when they lead a group.

Related Issue

None.

Changes

Generator (scripts/docs-to-skills.py)

Replace smart default with grouped (directory groups) and keep individual (one skill per procedure page; concept/reference buckets).
Grouped lead selection: highest-priority procedure page (how_to, get_started, or tutorial) becomes SKILL.md body; all siblings go to references/.
Reference-only hubs: groups with no procedure page (for example overview, reference, configure-security) emit a thin SKILL.md with frontmatter + ## References only; full content stays in references/.
Remove MAX_SKILL_MD_CHARS, section splitting/deferral, and generated *-details.md spill files.
Fix skill description ordering to follow the partitioned lead page.
Fix markdownlint MD012 in generated hub skills (collapse_consecutive_blank_lines, append_markdown_section).
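
The grouped lead selection described above can be sketched as follows. This is a minimal illustration, not the generator's actual code: the `Page` dataclass and `pick_lead` function are hypothetical names, and the sketch assumes that a smaller `priority` number ranks higher, which this PR does not state.

```python
from dataclasses import dataclass
from typing import Optional

# Content types the PR description treats as procedure pages.
PROCEDURE_CONTENT_TYPES = {"how_to", "get_started", "tutorial"}


@dataclass
class Page:
    """Hypothetical stand-in for the generator's page object."""

    name: str
    content_type: str
    priority: int = 100  # assumption: a smaller number ranks higher


def pick_lead(pages: list[Page]) -> tuple[Optional[Page], list[Page]]:
    """Return (lead, references) for one directory group."""
    procedures = [p for p in pages if p.content_type in PROCEDURE_CONTENT_TYPES]
    if not procedures:
        # Reference-only hub: no lead, every page goes to references/.
        return None, list(pages)
    # min() keeps the first page on ties, so input order breaks ties.
    lead = min(procedures, key=lambda p: p.priority)
    return lead, [p for p in pages if p is not lead]
```

A group with no procedure page returns `None` as the lead, which corresponds to the thin reference-only `SKILL.md` case.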

Docs source tweaks

docs/about/overview.mdx: add skill.priority: 10 for overview ordering in generated metadata.
docs/resources/agent-skills.mdx: set content.type to how_to so the single-page group inlines correctly.

Regenerated user skills

Inline full procedure content where it leads: get-started (quickstart + provider options), configure-inference, manage-policy, manage-sandboxes, deploy-remote, monitor-sandbox, agent-skills.
Convert heavy groups to reference hubs: overview, reference, configure-security.
Delete obsolete spill files: quickstart-details.md, use-local-inference-details.md, customize-network-policy-details.md, lifecycle-details.md, agent-skills.md.

Docs pipeline docs

Update docs/CONTRIBUTING.md grouped-strategy description.

Type of Change

Code change (feature, bug fix, or refactor)
Code change with doc updates
Doc only (prose changes, no code sample modifications)
Doc only (includes code sample changes)

Design notes (reviewer follow-ups)

Reference-only hub skills are intentional. Concept/reference directory groups defer all page bodies to references/ so agents load detail on demand (progressive disclosure). Routing text lives in SKILL.md frontmatter + References bullets; full task coverage remains in sibling reference files.
resolve_includes() path containment is pre-existing MyST-only behavior; this PR does not expand its surface. Follow-up hardening can be a separate change if we want docs-root allowlisting.

Verification

npx prek run --all-files passes (via CI checks)
npm test passes (via CI unit-vitest-linux)
Tests added or updated for new or changed behavior
No secrets, API keys, or credentials committed
Docs updated for user-facing behavior changes
npm run docs builds without warnings (doc changes only; Fern preview posted)
Doc pages follow the style guide (doc changes only)
New doc pages include SPDX header and frontmatter (new pages only)

Local commands run:

python3 scripts/docs-to-skills.py docs/ .agents/skills/ --prefix nemoclaw-user --doc-platform fern-mdx --dry-run
python3 scripts/docs-to-skills.py docs/ .agents/skills/ --prefix nemoclaw-user --doc-platform fern-mdx

Signed-off-by: Miyoung Choi <miyoungc@nvidia.com>

copy-pr-bot · 2026-05-29T21:42:59Z

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

coderabbitai · 2026-05-29T21:43:02Z

📝 Walkthrough

This PR refactors the docs-to-skills generation pipeline from a smart strategy to a grouped default, consolidating detailed reference files into main skill documentation. The generator now uses two strategies: grouped (one primary procedure per directory, siblings in references/) and individual (each procedure gets its own skill). Source documentation metadata is updated, and user skill files are regenerated with inlined content.

Changes

Docs-to-Skills Generator and Skill Consolidation

Layer / File(s)	Summary
Generator Refactoring: Partitioning and Content Building `scripts/docs-to-skills.py`	`partition_skill_pages` and `partition_grouped_skill_pages` replace the old multi-list logic; new markdown helpers (`collapse_consecutive_blank_lines`, `append_markdown_section`) post-process output. `generate_skill` now accepts a `strategy` parameter, branches on strategy choice, and centers `SKILL.md` construction on a single primary page with reference-link rewriting and section filtering (skipping configured H2s, collecting related-topics separately). References directory cleanup removes stale files when `ref_files` is empty or updates existing reference files with only pages routed via `reference_pages`.
Generator CLI and Strategy Implementation `scripts/docs-to-skills.py`	Added `group_individual` and refactored `partition_grouped_skill_pages` to implement grouped (highest-priority procedure as lead) and individual (one skill per procedure) strategies. Removed the `smart` strategy. Updated CLI `--strategy` argument default to `grouped`, epilog text, and the `generate_skill` call to pass `strategy=args.strategy`.
Source Documentation Metadata and Script Documentation `docs/about/overview.mdx`, `docs/resources/agent-skills.mdx`, `docs/CONTRIBUTING.md`	Added `skill` frontmatter priority `10` to overview page; changed agent-skills content type from `concept` to `how_to`. Updated `CONTRIBUTING.md` skill-source table mappings for multiple skills and rewrote strategy documentation to describe grouped/individual approaches and procedure-page precedence logic.
Onboarding Skills Consolidation: Agent Skills and Get-Started `.agents/skills/nemoclaw-user-agent-skills/SKILL.md`, `.agents/skills/nemoclaw-user-get-started/SKILL.md`	Inlined full agent-skills overview (skill discovery, availability table, example questions, assistant compatibility); inlined wizard onboarding flow with provider-by-provider setup steps (NVIDIA, OpenAI, Anthropic, Gemini, Local Ollama, Model Router) and non-interactive scripted example. Removed references to external detail files.
Infrastructure Skills Consolidation: Inference Config and Sandbox Management `.agents/skills/nemoclaw-user-configure-inference/SKILL.md`, `.agents/skills/nemoclaw-user-manage-sandboxes/SKILL.md`	Inlined local-inference setup (authenticated reverse proxy for non-WSL Ollama, non-interactive environment setup, API path selection, vLLM model overrides, NIM setup, timeout/verification). Inlined sandbox rebuild behavior (preserved/non-preserved state, preflight credential recovery). Updated trigger keywords and removed external reference loads.
User Skills Overview and Policy/Security Updates `.agents/skills/nemoclaw-user-manage-policy/SKILL.md`, `.agents/skills/nemoclaw-user-configure-security/SKILL.md`, `.agents/skills/nemoclaw-user-overview/SKILL.md`, `.agents/skills/nemoclaw-user-reference/SKILL.md`	Inlined custom preset removal guidance into policy skill; reordered references in security and overview skills to prioritize user-facing docs; updated overview description from ecosystem to user-focused; updated reference skill heading. Removed references to deleted detail files.
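
For context on the MD012 post-processing mentioned in the table, a blank-line collapsing helper can be sketched as below. The function name `collapse_blank_lines` is hypothetical and the body is an assumption about the general technique, not a copy of the generator's `collapse_consecutive_blank_lines`.

```python
def collapse_blank_lines(markdown: str) -> str:
    """Reduce each run of blank lines to a single blank line."""
    out = []
    previous_blank = False
    for line in markdown.splitlines():
        blank = not line.strip()
        if blank and previous_blank:
            # Skip the second and later blank lines in a run.
            continue
        out.append(line)
        previous_blank = blank
    # Always end with exactly one trailing newline.
    return "\n".join(out) + "\n"
```

This sketch also collapses blank lines inside fenced code blocks, so a production helper would likely need to track fence state before it is applied to pages that contain code samples.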

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

NVIDIA/NemoClaw#4405: Prior refactor of the docs-to-skills script affecting SKILL.md/references generation for nemoclaw-user-configure-inference and related skills.
NVIDIA/NemoClaw#4539: Overlapping changes to nemoclaw-user-configure-inference documentation; adds GPU Memory Cleanup to reference file that this PR consolidates and removes.
NVIDIA/NemoClaw#2445: Earlier modifications to the same scripts/docs-to-skills.py generator logic for generate_skill signature and reference link helpers.

Suggested labels

enhancement: skill, documentation

Suggested reviewers

cv
ericksoa
jyaunches

🐰 A garden of docs once grew so wide,
Now gathered close, neatly unified;
One page to lead, the rest stand by,
The generator hops with graceful fly!
Grouped strategies bloom—a tidy design,
Where references and skills in harmony align. 🌱✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 76.92% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The PR title 'docs(skills): refactor d2s script' clearly identifies the refactored docs-to-skills script as the primary change and accurately reflects the substantial refactoring visible across skill documentation and script logic.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.

github-actions · 2026-05-29T21:43:19Z

🌿 Preview your docs: https://nvidia-preview-pr-4549.docs.buildwithfern.com/nemoclaw

github-actions · 2026-05-29T21:44:27Z

E2E Advisor Recommendation

Required E2E: None
Optional E2E: docs-validation-e2e, skill-agent-e2e

Workflow run

Full advisor summary

E2E Recommendation Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required E2E

None. No merge-blocking E2E is required because the PR changes documentation, generated user skill markdown, and docs-to-skills release tooling only. It does not modify installer/onboarding code, sandbox lifecycle, credentials, security boundaries, network policy enforcement, inference routing, deployment logic, or OpenClaw runtime behavior.

Optional E2E

docs-validation-e2e (low): Optional release-prep confidence for documentation changes: validates installed CLI/docs parity and local documentation links. It is not merge-blocking here because the PR does not change runtime behavior.
skill-agent-e2e (medium): Optional confidence for the adjacent real-assistant skill surface: verifies a sandboxed OpenClaw agent can consume an injected skill. It does not specifically validate the changed generated user skill content, so it should not be required.

New E2E recommendations

generated-agent-skills (medium): Existing skill-agent E2E uses a synthetic fixture skill, and nightly docs-validation does not scan .agents/skills by default. This PR deletes and rewrites generated user skill reference files, so dedicated generated-skill integrity coverage would catch broken skill links or stale generated output.
- Suggested test: Add a generated user skills validation that runs docs-to-skills in dry-run/idempotence mode and check-docs.sh --only-links --local-only --with-skills for .agents/skills/nemoclaw-user-* files.
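
The idempotence part of the suggested test could be approached with a directory snapshot comparison, sketched here. The helper names `snapshot_tree` and `is_idempotent` are hypothetical, and the `generate` callable stands in for whatever invokes the generator; none of this exists in the repository as far as this thread shows.

```python
import hashlib
from pathlib import Path
from typing import Callable


def snapshot_tree(root: Path) -> dict[str, str]:
    """Map each file's path relative to root to the SHA-256 of its bytes."""
    return {
        path.relative_to(root).as_posix(): hashlib.sha256(path.read_bytes()).hexdigest()
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def is_idempotent(generate: Callable[[], None], out_dir: Path) -> bool:
    """Run generate twice and report whether the second run changed anything."""
    generate()
    first = snapshot_tree(out_dir)
    generate()
    return snapshot_tree(out_dir) == first
```

Comparing content hashes rather than modification times means a rerun that rewrites identical bytes still counts as idempotent, while a deleted or added file shows up as a key difference.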

github-actions · 2026-05-29T21:44:28Z

E2E Scenario Advisor Recommendation

Required scenario E2E: None
Optional scenario E2E: None

Workflow run

Full scenario advisor summary

E2E Scenario Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required scenario E2E

None. Changes are limited to documentation, generated user-agent skill markdown, and the docs-to-skills generation helper. They do not modify scenario E2E workflows, scenario metadata, expected-state contracts, suite definitions, runtime/runner code, onboarding/install helpers used by scenarios, or suite scripts under test/e2e-scenario/.

Optional scenario E2E

None.

Relevant changed files

None.

github-actions · 2026-05-29T21:47:11Z

PR Review Advisor

Findings: 0 needs attention, 5 worth checking, 0 nice ideas
Since last review: 0 prior items resolved, 3 still apply, 1 new item found

Review findings

🛠️ Needs attention

None.

🔎 Worth checking

Source-of-truth review needed: scripts/docs-to-skills.py resolve_includes(): The advisor marked localized patch analysis as needs_followup.
- Recommendation: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
- Evidence: `resolve_includes()` returns a placeholder for absent/unreadable files but reads any existing resolved path without a docs-root allowlist.
Docs include resolution can inline files outside the docs tree (scripts/docs-to-skills.py:724): `resolve_includes()` resolves the include target relative to the source page and reads any existing file, but it does not reject absolute paths or require the target to remain under `docs_dir`. A malicious or accidental MyST include such as `../../...` could inline repository or local files into generated user skills during release prep, which is a credential/source disclosure risk on a trusted generated-guidance surface.
- Recommendation: Pass the docs root into include resolution, reject absolute paths and resolved targets outside the docs tree before `read_text()`, and add a negative regression test proving traversal and absolute include paths are not read or emitted.
- Evidence: `resolved = (source_dir / raw_path).resolve()` is followed by `resolved.is_file()` and `resolved.read_text(encoding="utf-8")`; unlike `rewrite_doc_paths()`, this path has no `relative_to(docs_dir)` containment check.
Source-of-truth review still needed for resolve_includes() fallback behavior (scripts/docs-to-skills.py:724): `resolve_includes()` keeps a localized fallback that emits a placeholder for absent or unreadable include files, but the code and tests do not document the invalid state boundary, why invalid include references cannot be rejected or fixed at source, what regression test protects the behavior, or when the workaround should be removed.
- Recommendation: Document the intended invalid include states and source boundary, prefer making invalid include paths impossible at parse/generation time, and add fixture tests for both allowed includes and missing/unreadable/out-of-root includes.
- Evidence: `resolve_includes()` returns `> *Content included from ...*` for missing or unreadable files while still reading any existing resolved path; no nearby source-of-truth explanation or regression test was found.
Docs-to-skills refactor lacks visible generator-level regression tests (scripts/docs-to-skills.py:1570): This PR changes the core generator contract: grouped-vs-individual behavior, primary-page selection, frontmatter description source, reference file cleanup, local link rewriting, and inlining of detail pages. Without focused tests, regressions in generated skill shape or safety-sensitive include handling would be hard to catch.
- Recommendation: Add generator-level tests that call `generate_skill()`, `partition_grouped_skill_pages()`, and `resolve_includes()` on fixtures. Cover non-input-order priority selection, title/body/description alignment, stale reference deletion, local reference link rewriting, concept-only groups, and include traversal rejection.
- Evidence: No `test/*docs-to-skills*` or equivalent focused generator tests were found in the checkout, while `generate_skill()` and related grouping logic were substantially refactored.
Reference-only generated skills may no longer answer top-level user questions directly (.agents/skills/nemoclaw-user-overview/SKILL.md:10): The new grouped strategy makes groups with no procedure page reference-only. In the generated output, `nemoclaw-user-overview` and `nemoclaw-user-reference` now contain only a title plus reference links, with the actual overview/reference content deferred to `references/`. That may be intended progressive disclosure, but it weakens the top-level skill response for prompts such as 'what is NemoClaw?' or CLI reference lookups and appears in tension with the contributing guide's statement that generated skills identically cover the same tasks as their source pages.
- Recommendation: Confirm this reference-only shape is intentional for concept/reference-only groups, or inline the highest-priority page for those groups as well. Add a golden/fixture test for concept-only and reference-only grouped skills so this behavior is explicit.
- Evidence: `partition_grouped_skill_pages()` returns `None` when no procedure page exists, and generated `nemoclaw-user-overview/SKILL.md` and `nemoclaw-user-reference/SKILL.md` show only `## References` after the title.
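
The containment check recommended for `resolve_includes()` in the findings above might look roughly like this sketch. The function name `resolve_contained` and its signature are assumptions for illustration; only the `(source_dir / raw_path).resolve()` expression and the `relative_to(docs_dir)` idea come from the findings.

```python
from pathlib import Path
from typing import Optional


def resolve_contained(source_dir: Path, raw_path: str, docs_dir: Path) -> Optional[Path]:
    """Return the resolved include target, or None if it must be rejected."""
    if Path(raw_path).is_absolute():
        # Reject absolute include paths outright.
        return None
    docs_root = docs_dir.resolve()
    resolved = (source_dir / raw_path).resolve()
    try:
        # relative_to() raises ValueError when resolved is outside docs_root.
        resolved.relative_to(docs_root)
    except ValueError:
        return None
    return resolved
```

A caller would read the file only when the result is not `None`, so traversal such as `../../...` never reaches `read_text()`.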

🌱 Nice ideas

None.

Workflow run details

This is an automated advisory review. A human maintainer must make the final merge decision.

coderabbitai

Actionable comments posted: 3

🧹 Nitpick comments (3)

.agents/skills/nemoclaw-user-overview/SKILL.md (1)
14-14: Split into one sentence per line.

This bullet contains three sentences on a single line, which makes diffs harder to review. Each sentence should be on its own line in the source Markdown.

As per coding guidelines: One sentence per line in source makes diffs readable.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.agents/skills/nemoclaw-user-overview/SKILL.md at line 14, Edit the bullet
starting with "Load [references/overview.md](references/overview.md)" in
SKILL.md to place each sentence on its own line: break the current single-line
bullet into three separate lines so the sentence about when to use the Ecosystem
page, the sentence about internal mechanics/How It Works, and the sentence
explaining what NemoClaw covers (onboarding, lifecycle management, OpenClaw
operations, capabilities and purpose) are each on their own line; keep the
original wording but only adjust line breaks.
.agents/skills/nemoclaw-user-manage-policy/SKILL.md (1)
284-284: Use active voice.

"Custom presets applied with --from-file or --from-dir are recorded in the NemoClaw sandbox registry" uses passive voice. Consider: "NemoClaw records custom presets applied with --from-file or --from-dir in the sandbox registry."

Similarly, "can be removed" and "does not need to be kept" are passive constructions.

As per coding guidelines: Active voice is required for all documentation.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.agents/skills/nemoclaw-user-manage-policy/SKILL.md at line 284, Rewrite the
passive sentences to active voice: change "Custom presets applied with
`--from-file` or `--from-dir` are recorded in the NemoClaw sandbox registry
alongside their full YAML content, so they can be removed by name — the original
file does not need to be kept on disk" to an active form such as "NemoClaw
records custom presets applied with `--from-file` or `--from-dir` in the sandbox
registry along with their full YAML content, so you can remove them by name and
do not need to keep the original file on disk." Update the line containing
`--from-file`, `--from-dir`, "NemoClaw sandbox registry", and references to
"removed by name" / "does not need to be kept" to use active voice consistently.
.agents/skills/nemoclaw-user-configure-security/SKILL.md (1)
16-16: Split into one sentence per line.

This bullet contains multiple sentences on a single line. Each sentence should be on its own line in the source Markdown for better diff readability.

As per coding guidelines: One sentence per line in source makes diffs readable.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.agents/skills/nemoclaw-user-configure-security/SKILL.md at line 16, The
bullet in .agents/skills/nemoclaw-user-configure-security/SKILL.md contains
multiple sentences on one line; split that single bullet into separate lines so
each sentence is on its own line (e.g., break the sentence that starts "Lists
OpenClaw security controls..." into separate lines for each sentence),
preserving the same wording and the reference **Load
[references/openclaw-controls.md](references/openclaw-controls.md)** and the
list of controls (prompt injection detection, tool access control, rate
limiting, environment variable policy, audit framework, supply chain scanning,
messaging access policy, context visibility, and safe regex) so diffs show one
sentence per line.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@docs/CONTRIBUTING.md`:
- Line 99: Replace the passive, inaccurate sentence "Sibling pages are written
unchanged to `references/`." with an active, accurate statement noting that the
generator rewrites and normalizes files before output; update the line to
something like: state that the generator rewrites and normalizes reference
markdown and then writes it to `references/`, removing the word "unchanged" and
using active voice so the documentation reflects the actual behavior of the
generator.

In `@scripts/docs-to-skills.py`:
- Around line 1629-1635: The code builds reference filenames using only
page.path.stem which causes collisions across directories; change the ref_name
creation in the loop (the block using reference_pages, _page_rel(page),
skill_md_local_links and reference_local_links) to derive a unique filename/path
from the page's relative path (include parent directory segments or use the
page.path relative-to-root with a .md suffix) instead of stem-only, and update
both assignments (skill_md_local_links[rel] and reference_local_links[rel])
accordingly; apply the same fix to the similar block around the other occurrence
(lines shown 1755-1766).
- Around line 1853-1855: The current assignment groups[page.path.stem] = [page]
can clobber other procedure pages with identical filenames (e.g., multiple
index.mdx); change the key to a collision-safe unique identifier instead of
page.path.stem (for example use page.path.as_posix() or
page.path.parent.joinpath(page.path.stem).as_posix(), or a page.slug/URL if
available) so each procedure group is unique; update the branch that checks
page.content_type in PROCEDURE_CONTENT_TYPES to use the new key when creating
the group and keep the rest of grouping logic unchanged.
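
One way to derive collision-safe reference filenames, as the two inline comments on `scripts/docs-to-skills.py` suggest, is to build the name from the page's path relative to the docs root. The function `reference_name` and the example paths below are hypothetical; this is a sketch of the idea, not the fix that was applied.

```python
from pathlib import Path


def reference_name(page_path: Path, docs_dir: Path) -> str:
    """Build a flat reference filename that includes parent directories."""
    rel = page_path.relative_to(docs_dir).with_suffix("")
    # Two pages named index.mdx in different directories share the stem
    # "index"; joining every path segment keeps their names distinct.
    return "-".join(rel.parts) + ".md"
```

Joining segments with `-` can still collide when directory or file names themselves contain hyphens (for example `a-b/c` and `a/b-c`), so a stricter variant might keep the directory structure under `references/` instead of flattening it.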

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: c77f483e-8e4c-4710-ad65-6b0abc62594b

📥 Commits

Reviewing files that changed from the base of the PR and between ce4a5c3 and 26f78cd.

📒 Files selected for processing (17)

.agents/skills/nemoclaw-user-agent-skills/SKILL.md
.agents/skills/nemoclaw-user-agent-skills/references/agent-skills.md
.agents/skills/nemoclaw-user-configure-inference/SKILL.md
.agents/skills/nemoclaw-user-configure-inference/references/use-local-inference-details.md
.agents/skills/nemoclaw-user-configure-security/SKILL.md
.agents/skills/nemoclaw-user-get-started/SKILL.md
.agents/skills/nemoclaw-user-get-started/references/quickstart-details.md
.agents/skills/nemoclaw-user-manage-policy/SKILL.md
.agents/skills/nemoclaw-user-manage-policy/references/customize-network-policy-details.md
.agents/skills/nemoclaw-user-manage-sandboxes/SKILL.md
.agents/skills/nemoclaw-user-manage-sandboxes/references/lifecycle-details.md
.agents/skills/nemoclaw-user-overview/SKILL.md
.agents/skills/nemoclaw-user-reference/SKILL.md
docs/CONTRIBUTING.md
docs/about/overview.mdx
docs/resources/agent-skills.mdx
scripts/docs-to-skills.py

💤 Files with no reviewable changes (5)

.agents/skills/nemoclaw-user-get-started/references/quickstart-details.md
.agents/skills/nemoclaw-user-manage-policy/references/customize-network-policy-details.md
.agents/skills/nemoclaw-user-agent-skills/references/agent-skills.md
.agents/skills/nemoclaw-user-configure-inference/references/use-local-inference-details.md
.agents/skills/nemoclaw-user-manage-sandboxes/references/lifecycle-details.md

coderabbitai · 2026-05-29T22:43:35Z

-Sibling procedure pages, concept pages, and reference pages go into a `references/` subdirectory for progressive disclosure, keeping `SKILL.md` concise while preserving access to the full docs.
+The script reads YAML frontmatter from each doc page to determine its content type (`how_to`, `concept`, `reference`, `get_started`), then groups pages into skills using the `grouped` strategy by default.
+Within each directory group, the highest-priority procedure page (`how_to`, `get_started`, or `tutorial`) becomes the full body of `SKILL.md`.
+Sibling pages are written unchanged to `references/`.


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Do not describe reference pages as “unchanged.”

The generator rewrites and normalizes reference markdown before writing to references/, so “written unchanged” is inaccurate. Please also rewrite this sentence in active voice.

✏️ Proposed wording

- Sibling pages are written unchanged to `references/`.
+ The generator writes sibling pages to `references/` after markdown cleanup and link rewriting.

As per coding guidelines, "Active voice required. Flag passive constructions."


🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/CONTRIBUTING.md` at line 99, Replace the passive, inaccurate sentence "Sibling pages are written unchanged to `references/`." with an active, accurate statement noting that the generator rewrites and normalizes files before output; update the line to something like: state that the generator rewrites and normalizes reference markdown and then writes it to `references/`, removing the word "unchanged" and using active voice so the documentation reflects the actual behavior of the generator.
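For context, the grouped lead selection that the revised CONTRIBUTING text describes can be sketched roughly as follows. This is a minimal illustration, not the code in `scripts/docs-to-skills.py`: the `DocPage` dataclass, the `split_lead_and_references` helper, the example paths, and the rule that a lower `priority` value sorts first are all assumptions made for the example.

```python
from dataclasses import dataclass
from pathlib import Path

# Content types the PR description treats as procedure pages.
PROCEDURE_CONTENT_TYPES = ("how_to", "get_started", "tutorial")


@dataclass
class DocPage:
    """Hypothetical stand-in for the generator's page record."""

    path: Path
    content_type: str
    priority: int = 100  # assumption: lower values sort first


def split_lead_and_references(pages):
    """Return (lead, references); lead is None for a reference-only hub."""
    procedures = [p for p in pages if p.content_type in PROCEDURE_CONTENT_TYPES]
    if not procedures:
        # No procedure page: every page stays in references/.
        return None, list(pages)
    # Tie-break on the path so the choice is deterministic.
    lead = min(procedures, key=lambda p: (p.priority, p.path.as_posix()))
    return lead, [p for p in pages if p is not lead]


# Hypothetical directory group.
group = [
    DocPage(Path("docs/get-started/quickstart.mdx"), "get_started", priority=10),
    DocPage(Path("docs/get-started/provider-options.mdx"), "how_to", priority=20),
    DocPage(Path("docs/get-started/concepts.mdx"), "concept"),
]
lead, refs = split_lead_and_references(group)
print(lead.path.stem)               # quickstart
print([p.path.stem for p in refs])  # ['provider-options', 'concepts']
```

In this sketch a group with no procedure page returns `None` as the lead, which corresponds to the thin reference-only hub case in the PR summary.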

coderabbitai · 2026-05-29T22:43:35Z

+    for page in reference_pages:
        rel = _page_rel(page)
        if rel is None:
            continue
        ref_name = page.path.stem + ".md"
        skill_md_local_links[rel] = f"references/{ref_name}"
        reference_local_links[rel] = ref_name


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Avoid stem-only reference filenames in aggregated groups.

ref_name = page.path.stem + ".md" can collide in individual mode when concept/reference pages from different directories share a stem, causing last-write-wins overwrites and bad cross-links.

🔧 Proposed fix

 def generate_skill(
@@
-    skill_md_local_links: dict[str, str] = {}
-    reference_local_links: dict[str, str] = {}
+    skill_md_local_links: dict[str, str] = {}
+    reference_local_links: dict[str, str] = {}
+
+    def _reference_filename(page: DocPage) -> str:
+        if strategy == "individual":
+            rel = _page_rel(page)
+            source = Path(rel).with_suffix("").as_posix() if rel else page.path.with_suffix("").as_posix()
+            return source.strip("/").replace("/", "__") + ".md"
+        return page.path.stem + ".md"
@@
     for page in reference_pages:
@@
-        ref_name = page.path.stem + ".md"
+        ref_name = _reference_filename(page)
         skill_md_local_links[rel] = f"references/{ref_name}"
         reference_local_links[rel] = ref_name
@@
     for ref_page in reference_pages:
-        ref_name = ref_page.path.stem + ".md"
+        ref_name = _reference_filename(ref_page)

Also applies to: 1755-1766

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@scripts/docs-to-skills.py` around lines 1629 - 1635, The code builds reference filenames using only page.path.stem which causes collisions across directories; change the ref_name creation in the loop (the block using reference_pages, _page_rel(page), skill_md_local_links and reference_local_links) to derive a unique filename/path from the page's relative path (include parent directory segments or use the page.path relative-to-root with a .md suffix) instead of stem-only, and update both assignments (skill_md_local_links[rel] and reference_local_links[rel]) accordingly; apply the same fix to the similar block around the other occurrence (lines shown 1755-1766).
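The collision this comment describes can be reproduced in isolation. The sketch below compares stem-only naming with the path-derived naming from the proposed fix; the two helper names and the example paths are hypothetical, and only the string transformation mirrors the suggestion.

```python
from pathlib import PurePosixPath


def stem_filename(rel: str) -> str:
    """Stem-only naming, as in the hunk under review."""
    return PurePosixPath(rel).stem + ".md"


def path_filename(rel: str) -> str:
    """Path-derived naming, following the shape of the proposed fix."""
    source = PurePosixPath(rel).with_suffix("").as_posix()
    return source.strip("/").replace("/", "__") + ".md"


# Hypothetical docs-relative paths that share the stem "overview".
pages = ["about/overview.mdx", "reference/overview.mdx"]

print([stem_filename(p) for p in pages])
# ['overview.md', 'overview.md']
print([path_filename(p) for p in pages])
# ['about__overview.md', 'reference__overview.md']
```

With stem-only names both pages map to `overview.md`, so the second write would replace the first. The path-derived names stay distinct, at the cost of longer filenames in `references/`.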

coderabbitai · 2026-05-29T22:43:35Z

+        if page.content_type in PROCEDURE_CONTENT_TYPES:
+            groups[page.path.stem] = [page]
+        elif page.content_type == "concept":


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Use collision-safe keys for individual procedure groups.

groups[page.path.stem] can silently overwrite a previous procedure page when filenames repeat (for example, multiple index.mdx pages), dropping skills from generation.

🔧 Proposed fix

 def group_individual(pages: list[DocPage]) -> dict[str, list[DocPage]]:
@@
     for page in pages:
         if page.content_type in PROCEDURE_CONTENT_TYPES:
-            groups[page.path.stem] = [page]
+            base_key = re.sub(
+                r"[^a-z0-9-]",
+                "-",
+                page.path.with_suffix("").as_posix().lower(),
+            ).strip("-")
+            key = base_key
+            n = 2
+            while key in groups:
+                key = f"{base_key}-{n}"
+                n += 1
+            groups[key] = [page]

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@scripts/docs-to-skills.py` around lines 1853 - 1855, The current assignment groups[page.path.stem] = [page] can clobber other procedure pages with identical filenames (e.g., multiple index.mdx); change the key to a collision-safe unique identifier instead of page.path.stem (for example use page.path.as_posix() or page.path.parent.joinpath(page.path.stem).as_posix(), or a page.slug/URL if available) so each procedure group is unique; update the branch that checks page.content_type in PROCEDURE_CONTENT_TYPES to use the new key when creating the group and keep the rest of grouping logic unchanged.
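A standalone sketch of the key derivation in the proposed fix shows how repeated filenames end up with distinct group keys. The `unique_group_key` helper and the example paths are hypothetical; the slug regex and the numeric-suffix loop follow the suggestion above.

```python
import re
from pathlib import PurePosixPath


def unique_group_key(rel: str, groups: dict) -> str:
    """Slugify the suffix-free path, then add -2, -3, ... on collision."""
    source = PurePosixPath(rel).with_suffix("").as_posix().lower()
    base_key = re.sub(r"[^a-z0-9-]", "-", source).strip("-")
    key = base_key
    n = 2
    while key in groups:
        key = f"{base_key}-{n}"
        n += 1
    return key


# Hypothetical procedure pages; two share the stem "index" and two
# slugify to the same base key.
groups = {}
for rel in ["deploy/index.mdx", "monitor/index.mdx", "deploy/index.md"]:
    groups[unique_group_key(rel, groups)] = [rel]

print(list(groups))
# ['deploy-index', 'monitor-index', 'deploy-index-2']
```

Keying on `page.path.stem` alone would have collapsed all three pages into a single `index` group and kept only the last one.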

docs: d2s refactoring start

52fd455

miyoungc changed the title ~~docs: d2s refactoring start~~ docs: refactor d2s script May 29, 2026

miyoungc changed the title ~~docs: refactor d2s script~~ docs(skills): refactor d2s script May 29, 2026

miyoungc added 2 commits May 29, 2026 15:22

fix: refactor more

5864b8c

chore: rm unnecessary test

26f78cd

miyoungc marked this pull request as ready for review May 29, 2026 22:34

miyoungc added the labels "documentation" (Improvements or additions to documentation) and "enhancement: skill" (Improvements to NemoCall repository hygiene or user functionality with skills.) May 29, 2026

coderabbitai Bot reviewed May 29, 2026


cv approved these changes May 29, 2026


cv merged commit 7aa998c into main May 29, 2026
39 checks passed

cv deleted the d2k-refactoring branch May 29, 2026 23:34

