feat: add hand-crafted llms.txt with build-time validation #784

Open
crtahlin wants to merge 4 commits into master from fix/llms-plugin-config
@crtahlin
Collaborator

@crtahlin crtahlin commented Mar 20, 2026

Summary

Closes ethersphere/DevRel#838.

  • static/llms.txt — Hand-crafted, llmstxt.org-compliant index file with 75 curated doc links organised into 8 task-oriented sections (install, configure, upload, host, build, concepts, reference, optional). Includes intro context (chunks, DISC, xBZZ, ports) and top-level resource links (API ref, OpenAPI spec, bee-js, Book of Swarm, whitepaper).
  • scripts/validate-llms-txt.mjs — Prebuild validation script that cross-checks llms.txt URLs against actual doc files. Reports stale links and missing coverage as warnings (exit 0, non-blocking).
  • package.json — Chains the validation script into the existing prebuild hook.
  • README.md — Documents the llms.txt workflow and what to do when validation warnings appear.
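The cross-check described for scripts/validate-llms-txt.mjs could look roughly like the sketch below. It is not the script from this PR: the function names, the `https://docs.example.org` origin, and the rule that `docs/foo/bar.md` is served at `/docs/foo/bar` are all assumptions made for illustration (real Docusaurus routes can also depend on front matter such as `slug`). The functions take plain strings and arrays so the logic can be read without any file-system code.

```javascript
// Sketch only: names, site origin, and route mapping are assumptions,
// not the contents of scripts/validate-llms-txt.mjs.

// Collect every markdown link target, e.g. "[Install](https://...)".
function extractLinks(llmsText) {
  const pattern = /\[[^\]]*\]\(([^)\s]+)\)/g;
  return Array.from(llmsText.matchAll(pattern), (match) => match[1]);
}

// Assumed mapping: "docs/bee/install.md" -> "/docs/bee/install",
// "docs/bee/index.md" -> "/docs/bee".
function routeFromDocFile(filePath) {
  return '/' + filePath.replace(/\.mdx?$/, '').replace(/\/index$/, '');
}

// Returns the route for links into the docs tree, or null for anything else
// (external links, API reference, and so on).
function routeFromUrl(url, siteOrigin) {
  if (!url.startsWith(siteOrigin)) return null;
  const path = url.slice(siteOrigin.length).split('#')[0].replace(/\/$/, '');
  return path.startsWith('/docs') ? path : null;
}

// stale:   linked from llms.txt, but no doc file maps to that route
// missing: doc file exists, but llms.txt never links to it
function findDrift(llmsText, docFiles, siteOrigin) {
  const linked = new Set(
    extractLinks(llmsText)
      .map((url) => routeFromUrl(url, siteOrigin))
      .filter((route) => route !== null)
  );
  const existing = new Set(docFiles.map(routeFromDocFile));
  return {
    stale: [...linked].filter((route) => !existing.has(route)).sort(),
    missing: [...existing].filter((route) => !linked.has(route)).sort(),
  };
}
```

In a real prebuild script, `llmsText` would come from reading static/llms.txt and `docFiles` from walking the docs directory. Pages that are left out on purpose (the intro pages and stub mentioned in the test plan) would need an allow-list so they do not show up as `missing` on every build.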

Also includes the plugin config fix from b62582c that reconfigures docusaurus-plugin-llms to only generate llms-full.txt.

How it works

npm run build
  → prebuild: fetch-awesome-swarm.mjs && validate-llms-txt.mjs
  → docusaurus build (generates llms-full.txt via plugin)
  → static/llms.txt copied to build/ as-is

Every build warns if llms.txt has drifted from the actual docs, without blocking.
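The "warn but never block" behaviour can be sketched as follows, assuming the check yields an object with `stale` and `missing` arrays of doc routes. The function names and message wording are hypothetical and are not taken from the script in this PR.

```javascript
// Sketch only: function names and message wording are assumptions.

// Turn a drift result into human-readable warning lines.
function formatWarnings(drift) {
  const lines = [];
  for (const route of drift.stale) {
    lines.push(`WARNING: llms.txt links to ${route}, but no matching doc file exists`);
  }
  for (const route of drift.missing) {
    lines.push(`WARNING: ${route} exists in the docs but is not listed in llms.txt`);
  }
  return lines;
}

// Log every warning plus a summary line, then report success.
function runValidation(drift, log = console.warn) {
  const warnings = formatWarnings(drift);
  warnings.forEach((line) => log(line));
  log(`llms.txt validation: ${warnings.length} warning(s)`);
  // Always 0, so drift shows up in the build log without failing the build.
  return 0;
}
```

A script would typically finish with `process.exitCode = runValidation(drift)`. Returning a non-zero code when `warnings.length > 0` would turn the same check into a blocking one, which is a trade-off between catching drift early and breaking builds over a hand-maintained file.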

Test plan

  • npm run build succeeds, validation script runs with only expected warnings (4 intro pages + 1 stub)
  • With npx docusaurus serve running, /llms.txt returns the hand-crafted file and /llms-full.txt returns the plugin output
  • Spot-check 5–10 URLs from llms.txt to confirm they resolve (not 404)
  • Temporarily rename a doc file, run build, confirm the script warns about the stale link
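The URL spot-check in the test plan could be automated along these lines. This is a sketch under assumptions: the function names are hypothetical, the sample is taken at even intervals rather than at random, and the `fetch` implementation is passed in so the logic can be exercised without network access (Node 18+ provides a global `fetch`).

```javascript
// Sketch only: function names and sampling strategy are assumptions.

// Pick `count` evenly spaced URLs so the sample spans the whole file.
function pickSample(urls, count) {
  if (urls.length <= count) return [...urls];
  const step = urls.length / count;
  return Array.from({ length: count }, (_, i) => urls[Math.floor(i * step)]);
}

// Request each URL and collect the ones that fail or return a non-2xx status.
async function spotCheck(urls, fetchImpl = globalThis.fetch) {
  const failures = [];
  for (const url of urls) {
    try {
      const response = await fetchImpl(url);
      if (!response.ok) failures.push({ url, status: response.status });
    } catch (error) {
      // Network-level failure (DNS, timeout, refused connection)
      failures.push({ url, status: null, error: String(error) });
    }
  }
  return failures;
}
```

For example, `spotCheck(pickSample(allUrls, 10))` would check ten of the links, where `allUrls` is the list of URLs taken from llms.txt. An empty result means every sampled URL resolved.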

🤖 Generated with AI

Fix incorrect parameter names (anchor_01/anchor_02 → anchor1/anchor2),
wrong response field reference (Time → durationSeconds), broken anchor
link, unnecessary sudo, and replace 120-line failure example with a
concise successful one. Add specific hardware guidance in a warning
admonition.

The plugin config referenced non-existent paths (docs/api/**, docs/learn/**)
causing it to silently fall back to unfiltered defaults.

Changes:
- Disable llms.txt generation (will be replaced by hand-crafted static file)
- Fix include ordering to follow logical reading order
- Enable excludeImports and removeDuplicateHeadings for cleaner output
- Set proper title and description for the llmstxt.org standard header
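As a rough illustration of the changes listed above, the plugin options might look like the sketch below. The option names are the ones I understand docusaurus-plugin-llms to expose and should be checked against the installed version. The title, description, and `includeOrder` globs are placeholders, not the values committed in this PR.

```javascript
// Sketch only: title, description, and includeOrder globs are placeholders,
// not the values from this PR. Verify option names against the plugin README.
const llmsPluginOptions = {
  generateLLMsTxt: false,        // llms.txt is served from static/llms.txt instead
  generateLLMsFullTxt: true,     // keep the plugin-generated llms-full.txt
  excludeImports: true,          // drop MDX import statements from the output
  removeDuplicateHeadings: true, // avoid repeating a page title as its first heading
  title: 'Example docs title',
  description: 'Example one-line description for the llmstxt.org header.',
  // Globs must match paths that exist in the docs tree; the commit message
  // above reports that patterns matching nothing fall back silently.
  includeOrder: ['first-section/**', 'second-section/**'],
};

// Entry for the `plugins` array in docusaurus.config.js
const plugins = [['docusaurus-plugin-llms', llmsPluginOptions]];
```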

Refs: ethersphere/DevRel#839

Create static/llms.txt, a curated, llmstxt.org-compliant index that
helps AI agents quickly find relevant Swarm documentation. Organised
into 8 task-oriented sections (install, configure, upload, host, build,
concepts, reference, optional) covering 75 doc pages.

Add scripts/validate-llms-txt.mjs that runs during prebuild to detect
stale links and missing coverage, keeping llms.txt in sync as docs
evolve.

@netlify

netlify bot commented Mar 20, 2026

Deploy Preview for test-twitter-preview-testing-3 ready!

Name Link
🔨 Latest commit 865e735
🔍 Latest deploy log https://app.netlify.com/projects/test-twitter-preview-testing-3/deploys/69bd44f52f702200096f3222
😎 Deploy Preview https://deploy-preview-784--test-twitter-preview-testing-3.netlify.app

@darkobas2
Contributor

And who is going to check the logs and fix the problems?

@crtahlin
Collaborator Author

Do you have any suggestions for a channel we could send alert messages to, if that is possible? In any case, one of the files is written by hand, so the checks only alert when they find discrepancies. It can and should be reviewed for obsolete content in the same way as the other files in the "human" docs.

Update the OpenAPI spec URL to the local copy that will be served from
the docs site (see #827) instead of the GitHub raw URL.