98 changes: 37 additions & 61 deletions .agents/skills/datadesigner-docs/SKILL.md
@@ -12,34 +12,31 @@ description: >

Unified skill for adding, updating, moving, and removing pages on the NeMo Data Designer Fern docs site.

Production URL: **`docs.nvidia.com/nemo/datadesigner`** (see `instances` in [`fern/docs.yml`](../../../fern/docs.yml)). Source of truth for everything user-facing is `fern/`.
Current URL: **`datadesigner.docs.buildwithfern.com/nemo/datadesigner`** (see `instances` in [`fern/docs.yml`](../../../fern/docs.yml)). Source of truth for everything user-facing is `fern/`.

## Scope Rule

**ALL doc edits happen under `fern/`.** The legacy `docs/` directory is the original MkDocs source - `docs/notebook_source/*.py` (jupytext) and `docs/devnotes/*.md` are still canonical for notebooks and dev notes (they feed Fern), but **do not add new top-level prose pages under `docs/`**. Concept pages, recipes, plugins, code reference - all of these live under `fern/versions/v0.5.8/pages/`.

## Versioning Model: Floating Latest

DataDesigner uses a **floating-latest pointer** (matches NeMo Curator). One canonical MDX tree backs both the "Latest" tab and the current frozen train.
DataDesigner currently has Fern-native version entries backed by one shared migrated MDX tree. `latest` is a real rolling nav file; `v0.5.9` and `v0.5.8` are release nav files. All currently reuse `v0.5.8/pages/`.

```
fern/versions/
├── latest.yml ← Unix symlink → v0.5.8.yml
├── v0.5.8.yml ← real nav file (paths point at ./v0.5.8/pages/...)
└── v0.5.8/pages/ ← single canonical MDX tree
├── latest.yml ← rolling nav file (reuses ./v0.5.8/pages/...)
├── v0.5.9.yml ← release nav file (reuses ./v0.5.8/pages/...)
├── v0.5.8.yml ← release nav file (reuses ./v0.5.8/pages/...)
└── v0.5.8/pages/ ← shared migrated MDX tree
```

`docs.yml` registers both `slug: latest` and `slug: v0.5.8`. When you edit a page, **you only edit the v0.5.8 copy** - the symlink means `latest` automatically tracks. No mirror step.
`docs.yml` registers `slug: latest`, `slug: v0.5.9`, and `slug: v0.5.8`. When you edit shared docs, edit `v0.5.8/pages/`. Add version-specific page copies only when content diverges.

When a new release ships (e.g. v0.5.9):
Dev Notes are rolling release: `latest.yml` can include posts from `main` that are not in the release nav yet. Frozen release navs (`v0.5.9.yml`, `v0.5.8.yml`) should include only posts available at that release point.

1. `cp -R fern/versions/v0.5.8 fern/versions/v0.5.9`
2. `cp fern/versions/v0.5.8.yml fern/versions/v0.5.9.yml`
3. `sed -i '' 's|./v0.5.8/pages/|./v0.5.9/pages/|g' fern/versions/v0.5.9.yml`
4. `ln -sf v0.5.9.yml fern/versions/latest.yml` (re-target the symlink)
5. Add a `v0.5.9` entry to `docs.yml`'s `versions:` list and update the `latest` entry's `display-name`.
Released versions older than `v0.5.8` remain on the legacy MkDocs archive at `https://nvidia-nemo.github.io/DataDesigner/<version>/`. `docs.yml` redirects `/nemo/datadesigner/v<version>/...` to those archives for versions without a real Fern tree.

Old `v0.5.8` then renders unchanged from its frozen tree at `/v0.5.8/...`. The published site at `/latest/...` follows the symlink to `v0.5.9`.
For future Fern-native releases, add a version YAML that reuses shared pages by default. Copy only pages that need version-specific content.

## Layout at a Glance

@@ -59,20 +56,21 @@ fern/
│   ├── TrajectoryViewer.tsx ← multi-turn tool-call traces (research dev notes)
│   ├── BadgeLinks.tsx ← header shields (license, github, etc.)
│   ├── Tag.tsx, CustomCard.tsx, CustomFooter.tsx
│   ├── notebooks/ ← per-tutorial *.json + *.ts (MDX import target)
│   ├── notebooks/ ← gitignored per-tutorial *.json + *.ts output
│   └── devnotes/ ← .authors.yml, authors-data.ts, per-post trajectory data
├── scripts/
│   └── ipynb-to-fern-json.py ← .ipynb → fern/components/notebooks/*.{json,ts}
├── code-reference/ ← gitignored; populated by `fern docs md generate`
└── versions/
    ├── latest.yml -> v0.5.8.yml
    ├── latest.yml ← rolling navigation tree
    ├── v0.5.9.yml ← release navigation tree, reuses v0.5.8/pages/
    ├── v0.5.8.yml ← navigation tree
    └── v0.5.8/pages/ ← MDX content
    └── v0.5.8/pages/ ← shared MDX content
```

## URL Routing Rules

Fern's URL is computed from the **section/page titles in `v0.5.8.yml`**, not the file path:
Fern's URL is computed from the **section/page titles in the active version YAML**, not the file path:

```
File system Published URL
@@ -88,7 +86,7 @@ Rules:
- **Page title → kebab-case slug**: `page: Text-to-SQL for Nemotron Super` → `text-to-sql-for-nemotron-super` (the filename `text-to-sql.mdx` is irrelevant for routing).
- **Subdirectories in the file path are dropped** - `devnotes/posts/foo.mdx` becomes `/dev-notes/<page-title>` (no `/posts/`).

When in doubt, recompute by walking the page's position in `v0.5.8.yml` and slugifying each title.
When in doubt, recompute by walking the page's position in the active version YAML and slugifying each title.
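
The slug rule can be approximated in a few lines of shell. This is a rough sketch, not Fern's actual implementation: `slugify` is a hypothetical helper, and it assumes that lowercasing and collapsing every run of non-alphanumeric characters into one hyphen matches what Fern does. That holds for the example title in the rules, but titles with unusual punctuation may slugify differently, so confirm against the local preview.

```bash
#!/usr/bin/env bash
# Hypothetical helper: approximate a nav title's URL slug.
# Assumption: lowercase, replace each run of non-alphanumerics with a single
# hyphen, then trim leading and trailing hyphens.
slugify() {
  printf '%s\n' "$1" \
    | tr '[:upper:]' '[:lower:]' \
    | sed -E 's/[^a-z0-9]+/-/g; s/^-+//; s/-+$//'
}

slugify "Text-to-SQL for Nemotron Super"   # text-to-sql-for-nemotron-super
slugify "Dev Notes"                        # dev-notes
```

Joining the section slug and the page slug with `/` gives the path to compare against a card `href`.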

## Operations

@@ -116,7 +114,7 @@ When in doubt, recompute by walking the page's position in `v0.5.8.yml` and slug
```

4. The URL becomes `/<section-slug>/<page-title-slug>`. Update any cross-references in other MDX accordingly.
5. **Do not edit `latest.yml`** - it's the symlink and auto-tracks.
5. If the page is a rolling Dev Note that should appear before the next release, add it to `latest.yml` only.

### Update a Page

@@ -261,12 +259,14 @@ layout: overview # optional - only on landing pages
---
```

Do **not** add `position:` (we use explicit nav order in `v0.5.8.yml`), `date:`, or `authors:` to frontmatter - Fern's runtime treats `authors:` as a JSX scope variable and explodes when a component tries to reference it. For dev notes, see "Dev Notes" below.
Do **not** add `position:` (we use explicit nav order in the version YAML), `date:`, or `authors:` to frontmatter - Fern's runtime treats `authors:` as a JSX scope variable and explodes when a component tries to reference it. For dev notes, see "Dev Notes" below.

## Dev Notes (Blog Posts)

Dev notes live under `fern/versions/v0.5.8/pages/devnotes/posts/`. They use the dev-notes kit components: **`<Authors>`, `<MetricsTable>`, `<TrajectoryViewer>`, `<ExpandableCode>`, `<CustomCard>`** (sources in `fern/components/`, CSS in `fern/styles/`).

For rolling posts on `main`, add the page and card to `latest.yml` first. Add the same nav entry to a release YAML only when that post is part of that release.

### Authors Byline

Author registry: `fern/components/devnotes/.authors.yml` (source of truth) + `fern/components/devnotes/authors-data.ts` (typed copy that `Authors.tsx` imports). Edit both together.
@@ -364,26 +364,27 @@ docs/notebook_source/*.py (jupytext format - canonical source, edit
│ make convert-execute-notebooks # jupytext --execute (needs NVIDIA_API_KEY)
▼
docs/notebooks/*.ipynb (executed; outputs captured)
│ make generate-colab-notebooks # injects "Open in Colab" badge
▼
docs/colab_notebooks/*.ipynb (committed; for "Open in Colab" links)
│ make generate-fern-notebooks # auto-prefers docs/notebooks/ when present
│
│ make generate-fern-notebooks # per-file prefers executed docs/notebooks/
│ # otherwise converts notebook_source/*.py directly
▼
fern/components/notebooks/*.{json,ts}
fern/components/notebooks/*.{json,ts} (gitignored; generated before preview/publish)
```

The `.ts` is what the wrapper MDX imports - Fern's bundler doesn't follow `.json` imports cleanly.
`docs/colab_notebooks/*.ipynb` is a separate committed output for "Open in Colab" links. It is generated by `make generate-colab-notebooks`, but it is not a Fern docs build input.

The `.ts` is what the wrapper MDX imports. Fern's bundler doesn't follow `.json` imports cleanly.

### Make Targets

| Command | When |
|---------|------|
| `make generate-fern-notebooks` | Notebook prose changed, no need to re-execute. Auto-detects `docs/notebooks/` (executed) vs `docs/colab_notebooks/` (un-executed snapshots). |
| `make generate-fern-notebooks` | Notebook prose changed, no need to re-execute. Per file, prefers `docs/notebooks/` (executed) and falls back to converting `docs/notebook_source/*.py` directly. |
| `make generate-fern-notebooks-with-outputs` | Notebook code changed, want fresh outputs. Needs `NVIDIA_API_KEY` (and `OPENROUTER_API_KEY` for image notebooks 5–6). |
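
The per-file preference described for `make generate-fern-notebooks` can be pictured as the sketch below. It is illustrative only: `pick_input` is a hypothetical function, not the Makefile's real recipe, and it assumes the executed notebook and its jupytext source share a base name.

```bash
#!/usr/bin/env bash
# Hypothetical helper: choose the converter input for one notebook.
# Prefers the executed .ipynb; otherwise falls back to the jupytext source.
# Assumption: both files share the same base name.
pick_input() {
  local name="$1" root="${2:-.}"
  if [ -f "$root/docs/notebooks/$name.ipynb" ]; then
    echo "$root/docs/notebooks/$name.ipynb"
  else
    echo "$root/docs/notebook_source/$name.py"
  fi
}
```

Because the choice is made per file, a checkout with only some executed notebooks still yields a full set of pages: those notebooks render with outputs and the rest render without.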

Both targets pin to `DOCS_PYTHON ?= 3.13` because `pyarrow` lacks Python 3.14 wheels. Override via `DOCS_PYTHON=3.12 make ...`.
Install notebook docs dependencies first with `make install-dev-notebooks`. Docs setup pins to `DOCS_PYTHON_VERSION ?= 3.13` because `pyarrow` lacks Python 3.14 wheels. Override via `DOCS_PYTHON_VERSION=3.12 make ...`.

The `convert-execute-notebooks` step loops per-file with `|| failed=...`, so one notebook missing an API key won't kill the others - failures are reported at the end and the chain continues with whatever succeeded.
The `convert-execute-notebooks` step loops per file so one notebook missing an API key does not prevent later notebooks from running. Any failure is reported after the loop and the make target exits non-zero.
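
A minimal sketch of that "run every file, fail at the end" pattern, assuming a shell loop similar to the one in the Makefile. `run_all` and the command passed to it are hypothetical names, and the real recipe may differ in detail.

```bash
#!/usr/bin/env bash
# Hypothetical helper: run a command once per file, remember failures,
# and return non-zero only after every file has been attempted.
run_all() {
  local cmd="$1"; shift
  local failed="" f
  for f in "$@"; do
    # Record the failure but keep looping so later files still run.
    "$cmd" "$f" || failed="$failed $f"
  done
  if [ -n "$failed" ]; then
    echo "failed:$failed" >&2
    return 1
  fi
}
```

A call such as `run_all execute_notebook docs/notebook_source/*.py` (with `execute_notebook` standing in for the real jupytext invocation) attempts every notebook and still fails the target if any of them failed.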

### Wrapper Page

@@ -408,13 +409,13 @@ The converter (`fern/scripts/ipynb-to-fern-json.py`) **auto-strips the leading C

## Python API Reference (`libraries:`)

`docs.yml` declares a `libraries:` block pointing at `packages/data-designer-config/src/data_designer/config`. Generated output lands at `fern/code-reference/data-designer/` - **gitignored**. To populate locally:
`docs.yml` declares a `libraries:` block pointing at `packages/data-designer-config/src/data_designer/config`. Local generation uses `py2fern` against that same source. Generated output lands at `fern/code-reference/data-designer/` - **gitignored**. To populate locally:

```bash
cd fern && fern docs md generate
make generate-fern-api-reference
```

No FERN_TOKEN required. Re-run when the upstream Python source changes.
This does not require Fern auth. Re-run when the upstream Python source changes. If you need to compare with Fern's native generator, use `make generate-fern-api-reference-native` with Fern auth.

The generated tree is wired into the nav via `versions/v0.5.8.yml`'s "Code Reference > Python API" folder entry (`folder: ../code-reference/data-designer`). The nav also includes prose pages under "Topic Overviews" β€” those are conceptual landings that link to the auto-generated reference.

@@ -466,7 +467,7 @@ fern docs dev # localhost:3000 hot-reload preview
To generate the API reference for local preview:

```bash
cd fern && fern docs md generate # populates fern/code-reference/ (gitignored)
make generate-fern-api-reference # py2fern; populates fern/code-reference/ (gitignored)
```

If the "Python API" sidebar folder is empty, you forgot this step.
@@ -484,32 +485,7 @@ When the team adds a Fern preview workflow (modeled after Gym's `fern-docs-previ

## Cutting a New Version Train

When a release ships (e.g. `v0.5.9`):

1. Copy `fern/versions/v0.5.8/` → `fern/versions/v0.5.9/` (frozen snapshot).
2. Copy `fern/versions/v0.5.8.yml` → `fern/versions/v0.5.9.yml` and rewrite `./v0.5.8/pages/` → `./v0.5.9/pages/` paths.
3. Re-target the symlink: `ln -sf v0.5.9.yml fern/versions/latest.yml`.
4. Add to `fern/docs.yml`:

```yaml
versions:
  - display-name: "Latest · v0.5.9"
    path: versions/latest.yml
    slug: latest
    availability: stable
  - display-name: "v0.5.9"
    path: versions/v0.5.9.yml
    slug: v0.5.9
    availability: stable
  - display-name: "v0.5.8"
    path: versions/v0.5.8.yml
    slug: v0.5.8
    availability: stable
```

5. Add v0.5.9-specific redirect entries (`/nemo/datadesigner/v0.5.9/:path*/index.html` → `/nemo/datadesigner/v0.5.9/:path*` etc.).

The `v0.5.8` tree continues to render at frozen URLs.
Do not copy page trees by hand. Add a new version YAML that reuses shared pages by default; copy only pages that need version-specific content. If that becomes tedious, add a build-time materialization script before `fern generate --docs`.
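
One way to start a new version YAML is to copy only the nav file of the previous release, so the new version keeps pointing at the shared pages tree. The sketch below is an assumption about workflow, not something this skill prescribes: `new_version_nav` is a hypothetical helper and `v0.5.10` is a made-up version used for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical helper: create <to>.yml from <from>.yml inside a versions dir.
# It copies the nav file only and never touches any pages/ tree.
new_version_nav() {
  local versions_dir="$1" from="$2" to="$3"
  local src="$versions_dir/$from.yml" dst="$versions_dir/$to.yml"
  [ -f "$src" ] || { echo "missing $src" >&2; return 1; }
  [ ! -e "$dst" ] || { echo "$dst already exists" >&2; return 1; }
  cp "$src" "$dst"
}

# Example call (v0.5.10 is illustrative):
#   new_version_nav fern/versions v0.5.9 v0.5.10
```

The new nav file still has to be registered under `versions:` in `docs.yml` with its own `slug`. The helper refuses to overwrite an existing nav file.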

## Debugging

@@ -523,8 +499,8 @@ The `v0.5.8` tree continues to render at frozen URLs.
| "Something went wrong!" runtime error | A custom component is throwing - check `<Authors ids={authors} />` (use literal array) or `<ExpandableCode>` (currently broken in SSR) |
| Notebook page renders raw `<a href=colab...>` HTML | `.ts` was generated before the colab-strip improvement; re-run `make generate-fern-notebooks` |
| Notebook page has no cell outputs | Ran without `NVIDIA_API_KEY` or `convert-execute-notebooks` failed; run `make generate-fern-notebooks-with-outputs` |
| `URLError: [SSL: CERTIFICATE_VERIFY_FAILED]` during notebook execution | `DOCS_CERTS` not propagated; ensure you're invoking via the make target, not raw `uv run` |
| `Failed to build pyarrow==X` from source | `DOCS_PYTHON` resolved to 3.14+; override with `DOCS_PYTHON=3.13 make ...` (or just rely on the default) |
| `URLError: [SSL: CERTIFICATE_VERIFY_FAILED]` during notebook execution | `DOCS_CERTS` not propagated; ensure you're invoking via the make target, not raw Python |
| `Failed to build pyarrow==X` from source | `DOCS_PYTHON_VERSION` resolved to 3.14+; override with `DOCS_PYTHON_VERSION=3.13 make ...` (or just rely on the default) |
| Cards on landing all link to the same wrong URL | `href` not matching Fern's slugified-title rule - recompute as `/<section-slug>/<page-title-slug>` |
| Image broken in preview, file exists at `fern/assets/...` | Reference uses relative `../assets/...` - change to absolute `/assets/...` (relative paths break across version slugs) |

65 changes: 65 additions & 0 deletions .github/workflows/build-fern-docs.yml
@@ -0,0 +1,65 @@
name: Build Fern docs

on:
  workflow_dispatch:
    inputs:
      use_cache:
        description: "Use cached notebooks for unchanged sources"
        type: boolean
        default: true
  release:
    types:
      - published

permissions: {}

jobs:
  build-notebooks:
    uses: ./.github/workflows/build-notebooks.yml
    permissions:
      actions: read
      contents: write
    with:
      use_cache: ${{ github.event_name == 'workflow_dispatch' && inputs.use_cache || false }}
    secrets: inherit

  publish:
    needs: build-notebooks
    runs-on: ubuntu-latest
    permissions:
      contents: read
    env:
      FERN_TOKEN: ${{ secrets.DOCS_FERN_TOKEN }}
    steps:
      - name: Checkout repository
        uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6

      - name: Install uv
        uses: astral-sh/setup-uv@08807647e7069bb48b6ef5acd8ec9567f424441b # v8.1.0
        with:
          version: "0.9.5"

      - name: Set up Python
        run: uv python install 3.13

      - name: Install docs dependencies
        run: uv sync --python 3.13 --all-packages --group docs --group notebooks

      - name: Download executed notebooks
        uses: actions/download-artifact@3e5f45b2cfb9172054b4087a40e8e0b5a5461e7c # v8.0.1
        with:
          name: notebooks
          path: docs/notebooks

      - name: Check Fern docs
        run: make check-fern-docs

      - name: Publish Fern docs
        run: |
          if [ -z "$FERN_TOKEN" ]; then
            echo "::error::DOCS_FERN_TOKEN secret is required to publish Fern docs."
            exit 1
          fi

          cd fern
          npx -y fern-api@$(jq -r .version fern.config.json) generate --docs --no-prompt
2 changes: 2 additions & 0 deletions .github/workflows/build-notebooks.yml
@@ -34,6 +34,8 @@ jobs:
          version: "0.9.5"
      - name: Set up Python
        run: uv python install 3.11
      - name: Install notebook dependencies
        run: uv sync --all-packages --group notebooks --group docs
      - name: Restore notebook cache
        if: inputs.use_cache
        id: cache
34 changes: 33 additions & 1 deletion .github/workflows/docs-preview.yml
@@ -5,11 +5,16 @@ on:
    types: [opened, synchronize, reopened]
    paths:
      - "docs/**"
      - "fern/**"
      - "mkdocs.yml"
      - ".github/workflows/docs-preview.yml"

permissions: {}

concurrency:
  group: docs-preview-${{ github.event.pull_request.number }}
  cancel-in-progress: true

jobs:
  build-and-deploy:
    if: github.actor != 'dependabot[bot]'
@@ -78,7 +83,30 @@ jobs:
      - name: Build docs
        run: uv run mkdocs build

      - name: Check Fern docs
        run: make check-fern-docs

      - name: Skip hosted previews for fork PRs
        if: github.event.pull_request.head.repo.full_name != github.repository
        run: echo "::notice::Skipping hosted docs previews because this PR comes from a fork."

      - name: Deploy Fern preview
        if: github.event.pull_request.head.repo.full_name == github.repository
        id: fern-preview
        env:
          FERN_TOKEN: ${{ secrets.DOCS_FERN_TOKEN }}
        run: |
          if [ -z "$FERN_TOKEN" ]; then
            echo "::error::DOCS_FERN_TOKEN secret is required to publish Fern preview docs."
            exit 1
          fi

          cd fern
          npx -y fern-api@$(jq -r .version fern.config.json) generate --docs --preview --id pr-${{ github.event.pull_request.number }} --force --no-prompt
          echo "url=https://nvidia-preview-pr-${{ github.event.pull_request.number }}.docs.buildwithfern.com/nemo/datadesigner" >> "$GITHUB_OUTPUT"
Comment on lines +93 to +106

P1 Fern failure cascades to block MkDocs deploy and PR comment

`Deploy Fern preview` sits before `Deploy to Cloudflare Pages` with no `continue-on-error` guard. GitHub Actions skips all subsequent steps when any step fails (the implicit `success()` condition applies even to steps with explicit event-based `if:` guards). A Fern API error, timeout, or token issue will therefore prevent the MkDocs site from being deployed *and* prevent the PR comment from being posted, silently dropping the MkDocs preview URL from the PR.

Adding `continue-on-error: true` to the Fern step lets the Cloudflare deploy and comment steps always run for same-repo PRs regardless of Fern's outcome.


      - name: Deploy to Cloudflare Pages
        if: github.event.pull_request.head.repo.full_name == github.repository
        id: deploy
        uses: cloudflare/wrangler-action@9acf94ace14e7dc412b076f2c5c20b8ce93c79cd # v3
        with:
@@ -87,6 +115,7 @@ jobs:
          command: pages deploy site/ --project-name=dd-docs-preview --branch=pr-${{ github.event.pull_request.number }}

      - name: Find existing comment
        if: github.event.pull_request.head.repo.full_name == github.repository
        uses: peter-evans/find-comment@b30e6a3c0ed37e7c023ccd3f1db5c6c0b0c23aad # v4
        id: find-comment
        with:
@@ -95,13 +124,16 @@ jobs:
          body-includes: "<!-- docs-preview -->"

      - name: Post or update PR comment
        if: github.event.pull_request.head.repo.full_name == github.repository
        uses: peter-evans/create-or-update-comment@e8674b075228eee787fea43ef493e45ece1004c9 # v5
        with:
          comment-id: ${{ steps.find-comment.outputs.comment-id }}
          issue-number: ${{ github.event.pull_request.number }}
          edit-mode: replace
          body: |
            <!-- docs-preview -->
            **Docs preview:** ${{ steps.deploy.outputs.deployment-url }}
            **MkDocs preview:** ${{ steps.deploy.outputs.deployment-url }}

            **Fern preview:** ${{ steps.fern-preview.outputs.url }}

            > Notebook tutorials are placeholder-only in previews.