docs(skills): refactor d2s script#4549

Merged: cv merged 3 commits into main from d2k-refactoring on May 29, 2026.

Conversation

@miyoungc miyoungc commented May 29, 2026

Summary

Refactors scripts/docs-to-skills.py to a simpler two-strategy generator (grouped and individual) and regenerates .agents/skills/nemoclaw-user-* output to match. The old smart strategy, 11,500-character spill logic, and *-details.md deferral files are removed; sibling pages now land in references/ unchanged, and procedure pages inline in full when they lead a group.

Related Issue

None.

Changes

Generator (scripts/docs-to-skills.py)

  • Replace smart default with grouped (directory groups) and keep individual (one skill per procedure page; concept/reference buckets).
  • Grouped lead selection: highest-priority procedure page (how_to, get_started, or tutorial) becomes SKILL.md body; all siblings go to references/.
  • Reference-only hubs: groups with no procedure page (for example overview, reference, configure-security) emit a thin SKILL.md with frontmatter + ## References only; full content stays in references/.
  • Remove MAX_SKILL_MD_CHARS, section splitting/deferral, and generated *-details.md spill files.
  • Fix skill description ordering to follow the partitioned lead page.
  • Fix markdownlint MD012 in generated hub skills (collapse_consecutive_blank_lines, append_markdown_section).
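The grouped lead selection described above can be sketched as follows. This is a minimal illustration, not the code in `scripts/docs-to-skills.py`: the `Page` record, the `pick_lead` name, the sample page names, and the convention that a smaller `priority` number means higher priority are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional

# Content types the PR description treats as procedure pages.
PROCEDURE_CONTENT_TYPES = {"how_to", "get_started", "tutorial"}


@dataclass(frozen=True)
class Page:
    """Hypothetical stand-in for the generator's page record."""

    name: str
    content_type: str
    priority: int = 100  # assumption: smaller number = higher priority


def pick_lead(pages: list[Page]) -> tuple[Optional[Page], list[Page]]:
    """Return (lead, references) for one directory group.

    The lead is None when the group has no procedure page, which
    corresponds to a reference-only hub.
    """
    procedures = [p for p in pages if p.content_type in PROCEDURE_CONTENT_TYPES]
    if not procedures:
        return None, list(pages)
    # min() keeps the first page on ties, so input order breaks ties.
    lead = min(procedures, key=lambda p: p.priority)
    references = [p for p in pages if p is not lead]
    return lead, references


# Hypothetical group: one concept page and two procedure pages.
group = [
    Page("inference-options", "concept"),
    Page("use-local-inference", "how_to", priority=20),
    Page("switch-models", "how_to", priority=30),
]
lead, refs = pick_lead(group)
print(lead.name, [p.name for p in refs])
```

Returning `None` for the lead, rather than promoting a concept page, mirrors the reference-only hub behavior the description lists for `overview`, `reference`, and `configure-security`.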

Docs source tweaks

  • docs/about/overview.mdx: add skill.priority: 10 for overview ordering in generated metadata.
  • docs/resources/agent-skills.mdx: set content.type to how_to so the single-page group inlines correctly.

Regenerated user skills

  • Inline full procedure content where it leads: get-started (quickstart + provider options), configure-inference, manage-policy, manage-sandboxes, deploy-remote, monitor-sandbox, agent-skills.
  • Convert heavy groups to reference hubs: overview, reference, configure-security.
  • Delete obsolete spill files: quickstart-details.md, use-local-inference-details.md, customize-network-policy-details.md, lifecycle-details.md, agent-skills.md.

Docs pipeline docs

  • Update docs/CONTRIBUTING.md grouped-strategy description.

Type of Change

  • Code change (feature, bug fix, or refactor)
  • Code change with doc updates
  • Doc only (prose changes, no code sample modifications)
  • Doc only (includes code sample changes)

Design notes (reviewer follow-ups)

  • Reference-only hub skills are intentional. Concept/reference directory groups defer all page bodies to references/ so agents load detail on demand (progressive disclosure). Routing text lives in SKILL.md frontmatter + References bullets; full task coverage remains in sibling reference files.
  • resolve_includes() path containment is pre-existing MyST-only behavior; this PR does not expand its surface. Follow-up hardening can be a separate change if we want docs-root allowlisting.
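The reference-only hub shape from the first note can be pictured with a short sketch. The function name, the `name`/`description` frontmatter keys, and the sample values are assumptions for illustration; only the "frontmatter + `## References`" layout and the `Load [references/...]` bullet form come from this PR.

```python
def render_hub_skill(
    name: str, title: str, description: str, references: dict[str, str]
) -> str:
    """Render a thin SKILL.md: frontmatter, a title, and a References list."""
    lines = [
        "---",
        f"name: {name}",  # assumed frontmatter key
        f"description: {description}",  # assumed frontmatter key
        "---",
        "",
        f"# {title}",
        "",
        "## References",
        "",
    ]
    for filename, summary in references.items():
        link = f"[references/{filename}](references/{filename})"
        lines.append(f"- **Load {link}** {summary}")
    return "\n".join(lines) + "\n"


print(
    render_hub_skill(
        "nemoclaw-user-overview",
        "Overview",
        "Explains what NemoClaw covers.",
        {"overview.md": "when the user asks what NemoClaw is."},
    )
)
```

The sketch does no YAML escaping, so a description containing a colon would need quoting in a real generator.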

Verification

  • npx prek run --all-files passes (via CI checks)
  • npm test passes (via CI unit-vitest-linux)
  • Tests added or updated for new or changed behavior
  • No secrets, API keys, or credentials committed
  • Docs updated for user-facing behavior changes
  • npm run docs builds without warnings (doc changes only; Fern preview posted)
  • Doc pages follow the style guide (doc changes only)
  • New doc pages include SPDX header and frontmatter (new pages only)

Local commands run:

python3 scripts/docs-to-skills.py docs/ .agents/skills/ --prefix nemoclaw-user --doc-platform fern-mdx --dry-run
python3 scripts/docs-to-skills.py docs/ .agents/skills/ --prefix nemoclaw-user --doc-platform fern-mdx

Signed-off-by: Miyoung Choi <miyoungc@nvidia.com>

copy-pr-bot Bot commented May 29, 2026

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

coderabbitai Bot commented May 29, 2026

📝 Walkthrough

This PR refactors the docs-to-skills generation pipeline from a smart strategy to a grouped default, consolidating detailed reference files into main skill documentation. The generator now uses two strategies: grouped (one primary procedure per directory, siblings in references/) and individual (each procedure gets its own skill). Source documentation metadata is updated, and user skill files are regenerated with inlined content.

Changes

Docs-to-Skills Generator and Skill Consolidation

| Layer | File(s) | Summary |
| --- | --- | --- |
| Generator Refactoring: Partitioning and Content Building | scripts/docs-to-skills.py | partition_skill_pages and partition_grouped_skill_pages replace the old multi-list logic; new markdown helpers (collapse_consecutive_blank_lines, append_markdown_section) post-process output. generate_skill now accepts a strategy parameter, branches on strategy choice, and centers SKILL.md construction on a single primary page with reference-link rewriting and section filtering (skipping configured H2s, collecting related-topics separately). References directory cleanup removes stale files when ref_files is empty or updates existing reference files with only pages routed via reference_pages. |
| Generator CLI and Strategy Implementation | scripts/docs-to-skills.py | Added group_individual and refactored partition_grouped_skill_pages to implement grouped (highest-priority procedure as lead) and individual (one skill per procedure) strategies. Removed the smart strategy. Updated CLI --strategy argument default to grouped, epilog text, and the generate_skill call to pass strategy=args.strategy. |
| Source Documentation Metadata and Script Documentation | docs/about/overview.mdx, docs/resources/agent-skills.mdx, docs/CONTRIBUTING.md | Added skill frontmatter priority 10 to overview page; changed agent-skills content type from concept to how_to. Updated CONTRIBUTING.md skill-source table mappings for multiple skills and rewrote strategy documentation to describe grouped/individual approaches and procedure-page precedence logic. |
| Onboarding Skills Consolidation: Agent Skills and Get-Started | .agents/skills/nemoclaw-user-agent-skills/SKILL.md, .agents/skills/nemoclaw-user-get-started/SKILL.md | Inlined full agent-skills overview (skill discovery, availability table, example questions, assistant compatibility); inlined wizard onboarding flow with provider-by-provider setup steps (NVIDIA, OpenAI, Anthropic, Gemini, Local Ollama, Model Router) and non-interactive scripted example. Removed references to external detail files. |
| Infrastructure Skills Consolidation: Inference Config and Sandbox Management | .agents/skills/nemoclaw-user-configure-inference/SKILL.md, .agents/skills/nemoclaw-user-manage-sandboxes/SKILL.md | Inlined local-inference setup (authenticated reverse proxy for non-WSL Ollama, non-interactive environment setup, API path selection, vLLM model overrides, NIM setup, timeout/verification). Inlined sandbox rebuild behavior (preserved/non-preserved state, preflight credential recovery). Updated trigger keywords and removed external reference loads. |
| User Skills Overview and Policy/Security Updates | .agents/skills/nemoclaw-user-manage-policy/SKILL.md, .agents/skills/nemoclaw-user-configure-security/SKILL.md, .agents/skills/nemoclaw-user-overview/SKILL.md, .agents/skills/nemoclaw-user-reference/SKILL.md | Inlined custom preset removal guidance into policy skill; reordered references in security and overview skills to prioritize user-facing docs; updated overview description from ecosystem to user-focused; updated reference skill heading. Removed references to deleted detail files. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • NVIDIA/NemoClaw#4405: Prior refactor of the docs-to-skills script affecting SKILL.md/references generation for nemoclaw-user-configure-inference and related skills.
  • NVIDIA/NemoClaw#4539: Overlapping changes to nemoclaw-user-configure-inference documentation; adds GPU Memory Cleanup to reference file that this PR consolidates and removes.
  • NVIDIA/NemoClaw#2445: Earlier modifications to the same scripts/docs-to-skills.py generator logic for generate_skill signature and reference link helpers.

Suggested labels

enhancement: skill, documentation

Suggested reviewers

  • cv
  • ericksoa
  • jyaunches

🐰 A garden of docs once grew so wide,
Now gathered close, neatly unified;
One page to lead, the rest stand by,
The generator hops with graceful fly!
Grouped strategies bloom—a tidy design,
Where references and skills in harmony align. 🌱✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 76.92% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The PR title 'docs(skills): refactor d2s script' clearly identifies the refactored docs-to-skills script as the primary change and accurately reflects the substantial refactoring visible across skill documentation and script logic. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
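For context on the docstring coverage warning, a figure like this can be approximated locally with the standard `ast` module. The sketch below is an assumption about how such a metric might be computed; it is not CodeRabbit's algorithm, and it counts only functions and classes.

```python
import ast


def docstring_coverage(source: str) -> float:
    """Return the percentage of functions and classes that have a docstring."""
    tree = ast.parse(source)
    nodes = [
        node
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef))
    ]
    if not nodes:
        return 100.0
    documented = sum(1 for node in nodes if ast.get_docstring(node) is not None)
    return 100.0 * documented / len(nodes)


SAMPLE = '''
def documented():
    """Has a docstring."""

def undocumented():
    return 1
'''
print(f"{docstring_coverage(SAMPLE):.2f}%")  # prints 50.00%
```

As a rough reading of the reported number, 76.92% is consistent with, for example, 10 of 13 items documented; under that assumption, one more docstring would give 11 of 13 (about 84.6%) and clear the 80% threshold.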

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

github-actions Bot commented May 29, 2026

E2E Advisor Recommendation

Required E2E: None
Optional E2E: docs-validation-e2e, skill-agent-e2e

Workflow run

Full advisor summary

E2E Recommendation Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required E2E

  • None. No merge-blocking E2E is required because the PR changes documentation, generated user skill markdown, and docs-to-skills release tooling only. It does not modify installer/onboarding code, sandbox lifecycle, credentials, security boundaries, network policy enforcement, inference routing, deployment logic, or OpenClaw runtime behavior.

Optional E2E

  • docs-validation-e2e (low): Optional release-prep confidence for documentation changes: validates installed CLI/docs parity and local documentation links. It is not merge-blocking here because the PR does not change runtime behavior.
  • skill-agent-e2e (medium): Optional confidence for the adjacent real-assistant skill surface: verifies a sandboxed OpenClaw agent can consume an injected skill. It does not specifically validate the changed generated user skill content, so it should not be required.

New E2E recommendations

  • generated-agent-skills (medium): Existing skill-agent E2E uses a synthetic fixture skill, and nightly docs-validation does not scan .agents/skills by default. This PR deletes and rewrites generated user skill reference files, so dedicated generated-skill integrity coverage would catch broken skill links or stale generated output.
    • Suggested test: Add a generated user skills validation that runs docs-to-skills in dry-run/idempotence mode and check-docs.sh --only-links --local-only --with-skills for .agents/skills/nemoclaw-user-* files.

github-actions Bot commented May 29, 2026

E2E Scenario Advisor Recommendation

Required scenario E2E: None
Optional scenario E2E: None

Workflow run

Full scenario advisor summary

E2E Scenario Advisor

Base: origin/main
Head: HEAD
Confidence: high

Required scenario E2E

  • None. Changes are limited to documentation, generated user-agent skill markdown, and the docs-to-skills generation helper. They do not modify scenario E2E workflows, scenario metadata, expected-state contracts, suite definitions, runtime/runner code, onboarding/install helpers used by scenarios, or suite scripts under test/e2e-scenario/.

Optional scenario E2E

  • None.

Relevant changed files

  • None.

github-actions Bot commented May 29, 2026

PR Review Advisor

Findings: 0 needs attention, 5 worth checking, 0 nice ideas
Since last review: 0 prior items resolved, 3 still apply, 1 new item found

Review findings

🛠️ Needs attention

  • None.

🔎 Worth checking

  • Source-of-truth review needed: scripts/docs-to-skills.py resolve_includes(): The advisor marked localized patch analysis as needs_followup.
    • Recommendation: Identify the invalid state, source boundary, source-fix constraint, regression test, and removal condition before merging the localized behavior.
    • Evidence: `resolve_includes()` returns a placeholder for absent/unreadable files but reads any existing resolved path without a docs-root allowlist.
  • Docs include resolution can inline files outside the docs tree (scripts/docs-to-skills.py:724): `resolve_includes()` resolves the include target relative to the source page and reads any existing file, but it does not reject absolute paths or require the target to remain under `docs_dir`. A malicious or accidental MyST include such as `../../...` could inline repository or local files into generated user skills during release prep, which is a credential/source disclosure risk on a trusted generated-guidance surface.
    • Recommendation: Pass the docs root into include resolution, reject absolute paths and resolved targets outside the docs tree before `read_text()`, and add a negative regression test proving traversal and absolute include paths are not read or emitted.
    • Evidence: `resolved = (source_dir / raw_path).resolve()` is followed by `resolved.is_file()` and `resolved.read_text(encoding="utf-8")`; unlike `rewrite_doc_paths()`, this path has no `relative_to(docs_dir)` containment check.
  • Source-of-truth review still needed for resolve_includes() fallback behavior (scripts/docs-to-skills.py:724): `resolve_includes()` keeps a localized fallback that emits a placeholder for absent or unreadable include files, but the code and tests do not document the invalid state boundary, why invalid include references cannot be rejected or fixed at source, what regression test protects the behavior, or when the workaround should be removed.
    • Recommendation: Document the intended invalid include states and source boundary, prefer making invalid include paths impossible at parse/generation time, and add fixture tests for both allowed includes and missing/unreadable/out-of-root includes.
    • Evidence: `resolve_includes()` returns `> *Content included from ...*` for missing or unreadable files while still reading any existing resolved path; no nearby source-of-truth explanation or regression test was found.
  • Docs-to-skills refactor lacks visible generator-level regression tests (scripts/docs-to-skills.py:1570): This PR changes the core generator contract: grouped-vs-individual behavior, primary-page selection, frontmatter description source, reference file cleanup, local link rewriting, and inlining of detail pages. Without focused tests, regressions in generated skill shape or safety-sensitive include handling would be hard to catch.
    • Recommendation: Add generator-level tests that call `generate_skill()`, `partition_grouped_skill_pages()`, and `resolve_includes()` on fixtures. Cover non-input-order priority selection, title/body/description alignment, stale reference deletion, local reference link rewriting, concept-only groups, and include traversal rejection.
    • Evidence: No `test/*docs-to-skills*` or equivalent focused generator tests were found in the checkout, while `generate_skill()` and related grouping logic were substantially refactored.
  • Reference-only generated skills may no longer answer top-level user questions directly (.agents/skills/nemoclaw-user-overview/SKILL.md:10): The new grouped strategy makes groups with no procedure page reference-only. In the generated output, `nemoclaw-user-overview` and `nemoclaw-user-reference` now contain only a title plus reference links, with the actual overview/reference content deferred to `references/`. That may be intended progressive disclosure, but it weakens the top-level skill response for prompts such as 'what is NemoClaw?' or CLI reference lookups and appears in tension with the contributing guide's statement that generated skills identically cover the same tasks as their source pages.
    • Recommendation: Confirm this reference-only shape is intentional for concept/reference-only groups, or inline the highest-priority page for those groups as well. Add a golden/fixture test for concept-only and reference-only grouped skills so this behavior is explicit.
    • Evidence: `partition_grouped_skill_pages()` returns `None` when no procedure page exists, and generated `nemoclaw-user-overview/SKILL.md` and `nemoclaw-user-reference/SKILL.md` show only `## References` after the title.
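The containment check recommended for `resolve_includes()` in the findings above could look roughly like the sketch below. The helper name, the exception class, and the parameter layout are hypothetical; the sketch only shows the `resolve()` plus `relative_to(docs_dir)` pattern the advisor contrasts with `rewrite_doc_paths()`.

```python
from pathlib import Path


class IncludeOutsideDocsError(ValueError):
    """Raised when an include target would escape the docs tree."""


def resolve_include_path(docs_dir: Path, source_dir: Path, raw_path: str) -> Path:
    """Resolve an include target and require it to stay under docs_dir."""
    candidate = Path(raw_path)
    if candidate.is_absolute():
        raise IncludeOutsideDocsError(f"absolute include rejected: {raw_path}")
    resolved = (source_dir / candidate).resolve()
    try:
        # relative_to() raises ValueError when resolved is not under docs_dir.
        resolved.relative_to(docs_dir.resolve())
    except ValueError:
        raise IncludeOutsideDocsError(f"include escapes docs tree: {raw_path}") from None
    return resolved


docs = Path.cwd() / "docs"
print(resolve_include_path(docs, docs / "guide", "../shared/snippet.md"))
```

A caller would run this check before `is_file()` and `read_text()`, so a rejected path is never opened. Raising a dedicated exception, instead of emitting the placeholder text, is one way to make invalid include references fail at generation time; whether that is preferable to the existing fallback is a maintainer decision.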

🌱 Nice ideas

  • None.

Workflow run details

This is an automated advisory review. A human maintainer must make the final merge decision.

@miyoungc miyoungc changed the title docs: d2s refactoring start docs: refactor d2s script May 29, 2026
@miyoungc miyoungc changed the title docs: refactor d2s script docs(skills): refactor d2s script May 29, 2026
@miyoungc miyoungc marked this pull request as ready for review May 29, 2026 22:34
@miyoungc miyoungc added the documentation (Improvements or additions to documentation) and enhancement: skill (Improvements to NemoClaw repository hygiene or user functionality with skills) labels May 29, 2026
@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 3

🧹 Nitpick comments (3)
.agents/skills/nemoclaw-user-overview/SKILL.md (1)

14-14: Split into one sentence per line.

This bullet contains three sentences on a single line, which makes diffs harder to review. Each sentence should be on its own line in the source Markdown.

As per coding guidelines: One sentence per line in source makes diffs readable.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.agents/skills/nemoclaw-user-overview/SKILL.md at line 14, Edit the bullet
starting with "Load [references/overview.md](references/overview.md)" in
SKILL.md to place each sentence on its own line: break the current single-line
bullet into three separate lines so the sentence about when to use the Ecosystem
page, the sentence about internal mechanics/How It Works, and the sentence
explaining what NemoClaw covers (onboarding, lifecycle management, OpenClaw
operations, capabilities and purpose) are each on their own line; keep the
original wording but only adjust line breaks.
.agents/skills/nemoclaw-user-manage-policy/SKILL.md (1)

284-284: Use active voice.

"Custom presets applied with --from-file or --from-dir are recorded in the NemoClaw sandbox registry" uses passive voice. Consider: "NemoClaw records custom presets applied with --from-file or --from-dir in the sandbox registry."

Similarly, "can be removed" and "does not need to be kept" are passive constructions.

As per coding guidelines: Active voice is required for all documentation.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.agents/skills/nemoclaw-user-manage-policy/SKILL.md at line 284, Rewrite the
passive sentences to active voice: change "Custom presets applied with
`--from-file` or `--from-dir` are recorded in the NemoClaw sandbox registry
alongside their full YAML content, so they can be removed by name — the original
file does not need to be kept on disk" to an active form such as "NemoClaw
records custom presets applied with `--from-file` or `--from-dir` in the sandbox
registry along with their full YAML content, so you can remove them by name and
do not need to keep the original file on disk." Update the line containing
`--from-file`, `--from-dir`, "NemoClaw sandbox registry", and references to
"removed by name" / "does not need to be kept" to use active voice consistently.
.agents/skills/nemoclaw-user-configure-security/SKILL.md (1)

16-16: Split into one sentence per line.

This bullet contains multiple sentences on a single line. Each sentence should be on its own line in the source Markdown for better diff readability.

As per coding guidelines: One sentence per line in source makes diffs readable.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In @.agents/skills/nemoclaw-user-configure-security/SKILL.md at line 16, The
bullet in .agents/skills/nemoclaw-user-configure-security/SKILL.md contains
multiple sentences on one line; split that single bullet into separate lines so
each sentence is on its own line (e.g., break the sentence that starts "Lists
OpenClaw security controls..." into separate lines for each sentence),
preserving the same wording and the reference **Load
[references/openclaw-controls.md](references/openclaw-controls.md)** and the
list of controls (prompt injection detection, tool access control, rate
limiting, environment variable policy, audit framework, supply chain scanning,
messaging access policy, context visibility, and safe regex) so diffs show one
sentence per line.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@docs/CONTRIBUTING.md`:
- Line 99: Replace the passive, inaccurate sentence "Sibling pages are written
unchanged to `references/`." with an active, accurate statement noting that the
generator rewrites and normalizes files before output; update the line to
something like: state that the generator rewrites and normalizes reference
markdown and then writes it to `references/`, removing the word "unchanged" and
using active voice so the documentation reflects the actual behavior of the
generator.

In `@scripts/docs-to-skills.py`:
- Around line 1629-1635: The code builds reference filenames using only
page.path.stem which causes collisions across directories; change the ref_name
creation in the loop (the block using reference_pages, _page_rel(page),
skill_md_local_links and reference_local_links) to derive a unique filename/path
from the page's relative path (include parent directory segments or use the
page.path relative-to-root with a .md suffix) instead of stem-only, and update
both assignments (skill_md_local_links[rel] and reference_local_links[rel])
accordingly; apply the same fix to the similar block around the other occurrence
(lines shown 1755-1766).
- Around line 1853-1855: The current assignment groups[page.path.stem] = [page]
can clobber other procedure pages with identical filenames (e.g., multiple
index.mdx); change the key to a collision-safe unique identifier instead of
page.path.stem (for example use page.path.as_posix() or
page.path.parent.joinpath(page.path.stem).as_posix(), or a page.slug/URL if
available) so each procedure group is unique; update the branch that checks
page.content_type in PROCEDURE_CONTENT_TYPES to use the new key when creating
the group and keep the rest of grouping logic unchanged.

---

Nitpick comments:
In @.agents/skills/nemoclaw-user-configure-security/SKILL.md:
- Line 16: The bullet in
.agents/skills/nemoclaw-user-configure-security/SKILL.md contains multiple
sentences on one line; split that single bullet into separate lines so each
sentence is on its own line (e.g., break the sentence that starts "Lists
OpenClaw security controls..." into separate lines for each sentence),
preserving the same wording and the reference **Load
[references/openclaw-controls.md](references/openclaw-controls.md)** and the
list of controls (prompt injection detection, tool access control, rate
limiting, environment variable policy, audit framework, supply chain scanning,
messaging access policy, context visibility, and safe regex) so diffs show one
sentence per line.

In @.agents/skills/nemoclaw-user-manage-policy/SKILL.md:
- Line 284: Rewrite the passive sentences to active voice: change "Custom
presets applied with `--from-file` or `--from-dir` are recorded in the NemoClaw
sandbox registry alongside their full YAML content, so they can be removed by
name — the original file does not need to be kept on disk" to an active form
such as "NemoClaw records custom presets applied with `--from-file` or
`--from-dir` in the sandbox registry along with their full YAML content, so you
can remove them by name and do not need to keep the original file on disk."
Update the line containing `--from-file`, `--from-dir`, "NemoClaw sandbox
registry", and references to "removed by name" / "does not need to be kept" to
use active voice consistently.

In @.agents/skills/nemoclaw-user-overview/SKILL.md:
- Line 14: Edit the bullet starting with "Load
[references/overview.md](references/overview.md)" in SKILL.md to place each
sentence on its own line: break the current single-line bullet into three
separate lines so the sentence about when to use the Ecosystem page, the
sentence about internal mechanics/How It Works, and the sentence explaining what
NemoClaw covers (onboarding, lifecycle management, OpenClaw operations,
capabilities and purpose) are each on their own line; keep the original wording
but only adjust line breaks.
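The stem-collision issue raised in the inline comments above can be illustrated with a small sketch. The `ref_name_from_rel` helper and the sample page paths are hypothetical; the point is only that a name derived from the docs-relative path stays distinct where a stem-only name does not.

```python
from pathlib import PurePosixPath


def ref_name_from_rel(rel: str) -> str:
    """Flatten a docs-relative path into a references/ filename."""
    path = PurePosixPath(rel).with_suffix("")
    return "-".join(path.parts) + ".md"


# Two hypothetical pages that share the filename index.mdx.
pages = ["inference/index.mdx", "security/index.mdx"]

stem_names = {PurePosixPath(p).stem + ".md" for p in pages}
rel_names = {ref_name_from_rel(p) for p in pages}
print(len(stem_names), len(rel_names))  # prints 1 2
```

Joining path segments with `-` can still collide in contrived cases (for example `a-b/c` and `a/b-c`), so a real fix might keep the directory structure under `references/` or use a different separator.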
ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: c77f483e-8e4c-4710-ad65-6b0abc62594b

📥 Commits

Reviewing files that changed from the base of the PR and between ce4a5c3 and 26f78cd.

📒 Files selected for processing (17)
  • .agents/skills/nemoclaw-user-agent-skills/SKILL.md
  • .agents/skills/nemoclaw-user-agent-skills/references/agent-skills.md
  • .agents/skills/nemoclaw-user-configure-inference/SKILL.md
  • .agents/skills/nemoclaw-user-configure-inference/references/use-local-inference-details.md
  • .agents/skills/nemoclaw-user-configure-security/SKILL.md
  • .agents/skills/nemoclaw-user-get-started/SKILL.md
  • .agents/skills/nemoclaw-user-get-started/references/quickstart-details.md
  • .agents/skills/nemoclaw-user-manage-policy/SKILL.md
  • .agents/skills/nemoclaw-user-manage-policy/references/customize-network-policy-details.md
  • .agents/skills/nemoclaw-user-manage-sandboxes/SKILL.md
  • .agents/skills/nemoclaw-user-manage-sandboxes/references/lifecycle-details.md
  • .agents/skills/nemoclaw-user-overview/SKILL.md
  • .agents/skills/nemoclaw-user-reference/SKILL.md
  • docs/CONTRIBUTING.md
  • docs/about/overview.mdx
  • docs/resources/agent-skills.mdx
  • scripts/docs-to-skills.py
💤 Files with no reviewable changes (5)
  • .agents/skills/nemoclaw-user-get-started/references/quickstart-details.md
  • .agents/skills/nemoclaw-user-manage-policy/references/customize-network-policy-details.md
  • .agents/skills/nemoclaw-user-agent-skills/references/agent-skills.md
  • .agents/skills/nemoclaw-user-configure-inference/references/use-local-inference-details.md
  • .agents/skills/nemoclaw-user-manage-sandboxes/references/lifecycle-details.md

Comment thread docs/CONTRIBUTING.md
Sibling procedure pages, concept pages, and reference pages go into a `references/` subdirectory for progressive disclosure, keeping `SKILL.md` concise while preserving access to the full docs.
The script reads YAML frontmatter from each doc page to determine its content type (`how_to`, `concept`, `reference`, `get_started`), then groups pages into skills using the `grouped` strategy by default.
Within each directory group, the highest-priority procedure page (`how_to`, `get_started`, or `tutorial`) becomes the full body of `SKILL.md`.
Sibling pages are written unchanged to `references/`.

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Do not describe reference pages as “unchanged.”

The generator rewrites and normalizes reference markdown before writing to references/, so “written unchanged” is inaccurate. Please also rewrite this sentence in active voice.

✏️ Proposed wording
- Sibling pages are written unchanged to `references/`.
+ The generator writes sibling pages to `references/` after markdown cleanup and link rewriting.

As per coding guidelines, "Active voice required. Flag passive constructions."

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@docs/CONTRIBUTING.md` at line 99, replace the passive, inaccurate sentence
"Sibling pages are written unchanged to `references/`." with an active, accurate
statement: the generator rewrites and normalizes reference markdown and then
writes it to `references/`. Remove the word "unchanged" so the documentation
reflects the generator's actual behavior.

Comment thread scripts/docs-to-skills.py
Comment on lines +1629 to 1635
    for page in reference_pages:
        rel = _page_rel(page)
        if rel is None:
            continue
        ref_name = page.path.stem + ".md"
        skill_md_local_links[rel] = f"references/{ref_name}"
        reference_local_links[rel] = ref_name

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Avoid stem-only reference filenames in aggregated groups.

ref_name = page.path.stem + ".md" can collide in individual mode when concept/reference pages from different directories share a stem, causing last-write-wins overwrites and bad cross-links.

🔧 Proposed fix
 def generate_skill(
@@
-    skill_md_local_links: dict[str, str] = {}
-    reference_local_links: dict[str, str] = {}
+    skill_md_local_links: dict[str, str] = {}
+    reference_local_links: dict[str, str] = {}
+
+    def _reference_filename(page: DocPage) -> str:
+        if strategy == "individual":
+            rel = _page_rel(page)
+            source = Path(rel).with_suffix("").as_posix() if rel else page.path.with_suffix("").as_posix()
+            return source.strip("/").replace("/", "__") + ".md"
+        return page.path.stem + ".md"
@@
     for page in reference_pages:
@@
-        ref_name = page.path.stem + ".md"
+        ref_name = _reference_filename(page)
         skill_md_local_links[rel] = f"references/{ref_name}"
         reference_local_links[rel] = ref_name
@@
     for ref_page in reference_pages:
-        ref_name = ref_page.path.stem + ".md"
+        ref_name = _reference_filename(ref_page)

Also applies to: 1755-1766

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@scripts/docs-to-skills.py` around lines 1629 - 1635, The code builds
reference filenames using only page.path.stem which causes collisions across
directories; change the ref_name creation in the loop (the block using
reference_pages, _page_rel(page), skill_md_local_links and
reference_local_links) to derive a unique filename/path from the page's relative
path (include parent directory segments or use the page.path relative-to-root
with a .md suffix) instead of stem-only, and update both assignments
(skill_md_local_links[rel] and reference_local_links[rel]) accordingly; apply
the same fix to the similar block around the other occurrence (lines shown
1755-1766).
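
The collision is easy to reproduce in isolation. The following is a minimal sketch, not the generator's code: `stem_filename` and `path_filename` are hypothetical helpers, and the two page paths are illustrative assumptions chosen so that both pages share the stem `overview`.

```python
from pathlib import PurePosixPath


def stem_filename(rel: str) -> str:
    """Stem-only naming, as in the reviewed line (hypothetical helper)."""
    return PurePosixPath(rel).stem + ".md"


def path_filename(rel: str) -> str:
    """Path-derived naming in the spirit of the proposed fix (hypothetical helper)."""
    source = PurePosixPath(rel).with_suffix("").as_posix()
    return source.strip("/").replace("/", "__") + ".md"


# Illustrative doc paths (assumed): two pages in different directories, same stem.
pages = ["about/overview.mdx", "reference/overview.mdx"]

stem_names = [stem_filename(p) for p in pages]
path_names = [path_filename(p) for p in pages]

# Stem-only names collapse to one file, so the last write would win.
print(stem_names)
# Path-derived names keep the directory segment and stay distinct.
print(path_names)
```

Joining directory segments with `__` keeps every reference file directly under `references/`, so links of the form `references/<name>.md` stay flat, at the cost of longer filenames.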

Comment thread scripts/docs-to-skills.py
Comment on lines +1853 to +1855
        if page.content_type in PROCEDURE_CONTENT_TYPES:
            groups[page.path.stem] = [page]
        elif page.content_type == "concept":

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Use collision-safe keys for individual procedure groups.

groups[page.path.stem] can silently overwrite a previous procedure page when filenames repeat (for example, multiple index.mdx pages), dropping skills from generation.

🔧 Proposed fix
 def group_individual(pages: list[DocPage]) -> dict[str, list[DocPage]]:
@@
     for page in pages:
         if page.content_type in PROCEDURE_CONTENT_TYPES:
-            groups[page.path.stem] = [page]
+            base_key = re.sub(
+                r"[^a-z0-9-]",
+                "-",
+                page.path.with_suffix("").as_posix().lower(),
+            ).strip("-")
+            key = base_key
+            n = 2
+            while key in groups:
+                key = f"{base_key}-{n}"
+                n += 1
+            groups[key] = [page]
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@scripts/docs-to-skills.py` around lines 1853 - 1855, The current assignment
groups[page.path.stem] = [page] can clobber other procedure pages with identical
filenames (e.g., multiple index.mdx); change the key to a collision-safe unique
identifier instead of page.path.stem (for example use page.path.as_posix() or
page.path.parent.joinpath(page.path.stem).as_posix(), or a page.slug/URL if
available) so each procedure group is unique; update the branch that checks
page.content_type in PROCEDURE_CONTENT_TYPES to use the new key when creating
the group and keep the rest of grouping logic unchanged.
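
A minimal sketch of the collision-safe key idea follows, assuming pages are identified by their path relative to the docs root. `unique_group_key` and the example paths are hypothetical and are not part of `scripts/docs-to-skills.py`; the slug rule mirrors the regular expression in the proposed fix above.

```python
import re
from pathlib import PurePosixPath


def unique_group_key(rel_path: str, groups: dict[str, list[str]]) -> str:
    """Derive a slug from the full relative path and add a numeric
    suffix if that slug is already taken (hypothetical helper)."""
    base = re.sub(
        r"[^a-z0-9-]",
        "-",
        PurePosixPath(rel_path).with_suffix("").as_posix().lower(),
    ).strip("-")
    key = base
    n = 2
    while key in groups:
        key = f"{base}-{n}"
        n += 1
    return key


# Illustrative paths (assumed): two index pages, plus a file whose slug
# happens to equal the slug of an earlier page.
groups: dict[str, list[str]] = {}
for rel in ["get-started/index.mdx", "deploy/index.mdx", "deploy_index.mdx"]:
    groups[unique_group_key(rel, groups)] = [rel]

print(list(groups))
```

With stem-only keys, the two `index.mdx` pages would both map to `index` and the second would replace the first. The path-derived slug separates them, and the numeric suffix covers the rarer case where two different paths normalize to the same slug.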

@cv cv merged commit 7aa998c into main May 29, 2026
39 checks passed
@cv cv deleted the d2k-refactoring branch May 29, 2026 23:34

Labels

  • documentation: Improvements or additions to documentation
  • enhancement: skill: Improvements to NemoClaw repository hygiene or user functionality with skills.
