For a high-level overview of AI integration, see the AI Integration Guide.
MarkUpsideDown uses a Cloudflare Worker for four features:
- Document Import: Convert PDF, Office docs, images, etc. to Markdown via Workers AI `AI.toMarkdown()`
- Rendered Fetch: Fetch JavaScript-rendered pages as Markdown via Browser Rendering
- Structured Extraction: Extract structured JSON data from web pages using AI (Browser Rendering + Workers AI LLM)
- Website Crawl: Crawl an entire website and save all pages as Markdown files via the Browser Rendering `/crawl` API
Each user deploys their own Worker instance.
On first launch, MarkUpsideDown opens the Settings panel with a Setup with Cloudflare button. This automates the entire process:
1. Checks that `wrangler` is installed globally
2. Runs `wrangler login` (opens browser for Cloudflare OAuth)
3. Creates Cloudflare resources: KV namespace (cache), R2 bucket (publish), Queue (batch conversion), Vectorize index (semantic search). Each is optional and created in parallel; failures are non-fatal
4. Deploys the Worker with a randomized URL (e.g. `markupsidedown-a3f8k2xp7m9qb.example.workers.dev`) to prevent third-party URL guessing
5. (Optional) Configures secrets for Render JS via OAuth, creating a scoped API token automatically
6. Verifies the deployment
After step 4, Document Import is ready to use. Step 5 (secrets) is only needed for Render JS (fetching JavaScript-rendered pages). If OAuth token creation fails, you can add secrets later via wrangler secret put.
Resources created in step 3 enable additional features (see Feature Status). If a resource fails to create, the Worker still deploys without that binding — features degrade gracefully.
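As a rough illustration of the randomized URL in step 4, a name of that shape could be derived from random bytes as sketched below. This is a hypothetical helper, not the app's actual code: the function name `workerNameFromBytes`, the lowercase alphanumeric alphabet, and the suffix length are assumptions inferred only from the example URL.

```typescript
// Hypothetical sketch: derive a hard-to-guess Worker name from random bytes.
// The alphabet is an assumption based on the example URL in this guide.
const ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789";

function workerNameFromBytes(bytes: Uint8Array, prefix: string = "markupsidedown"): string {
  let suffix = "";
  for (let i = 0; i < bytes.length; i++) {
    // Modulo mapping has a slight bias (256 is not a multiple of 36);
    // acceptable for a sketch, but worth removing in production code.
    suffix += ALPHABET.charAt(bytes[i] % ALPHABET.length);
  }
  return prefix + "-" + suffix;
}
```

In a browser or Worker context, the bytes could come from `crypto.getRandomValues(new Uint8Array(13))`, which yields a 13-character suffix like the one in the example.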
If you have multiple Cloudflare accounts, you'll be prompted to select one.
Prerequisites:
- A Cloudflare account (free tier works for document import)
- `wrangler` CLI installed globally: `npm install -g wrangler`
Create a single API token that covers deployment, document import, and rendered fetch:
- Go to API Tokens → Create Token
- Use the "Edit Cloudflare Workers" template → Use template
- Add these additional permissions:
  - Account → Workers AI → Read
  - Account → Browser Rendering → Edit
- Account Resources: select your account
- Create the token and save it
| Scope | Permission | Purpose |
|---|---|---|
| Account > Workers Scripts | Edit | wrangler deploy |
| Account > Workers AI | Read | AI.toMarkdown() (document import) |
| Account > Browser Rendering | Edit | /render and /crawl endpoints |
export CLOUDFLARE_API_TOKEN="your-token-here"
cd worker && wrangler deploy
The secrets are required for the /render and /crawl endpoints (Rendered Fetch and Website Crawl). If you only need Document Import, you can skip this step; /convert uses the AI binding directly.
Get your Account ID from Cloudflare Dashboard → Workers & Pages → right sidebar.
cd worker
wrangler secret put CLOUDFLARE_ACCOUNT_ID # paste your Account ID
wrangler secret put CLOUDFLARE_API_TOKEN # paste the same API token
- Open Settings in the toolbar (or wait for the first-launch prompt)
- Paste your Worker URL (e.g. `https://markupsidedown-XXXXXX.example.workers.dev`)
- Click Test to verify the connection
- Check Feature Status to see which capabilities are ready
- Click Save
The URL is saved in localStorage and persists across sessions.
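A minimal sketch of how such persistence might look is shown below. The key name `workerUrl` and both function names are hypothetical; the app's real storage key is not documented here. The store is passed in as a parameter so the same code works with `window.localStorage` or an in-memory stand-in.

```typescript
// Minimal subset of the Web Storage API used below; window.localStorage satisfies it.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// Hypothetical key name (an assumption, not the app's documented key).
const WORKER_URL_KEY = "workerUrl";

function saveWorkerUrl(store: KeyValueStore, url: string): void {
  // Trim whitespace and trailing slashes so endpoint paths can be appended directly.
  store.setItem(WORKER_URL_KEY, url.trim().replace(/\/+$/, ""));
}

function loadWorkerUrl(store: KeyValueStore): string | null {
  // Returns null when no URL has been saved yet.
  return store.getItem(WORKER_URL_KEY);
}
```

In the app's webview this would be called as `saveWorkerUrl(window.localStorage, url)`.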
| Scenario | Use |
|---|---|
| Static pages, blogs, docs | Fetch (fast, free) |
| SPAs (React, Vue, Angular) | Render |
| Pages behind JS-based loading | Render |
| Dynamic dashboards | Render |
| Entire documentation site | Crawl |
| Blog or wiki archival | Crawl |
| Building a local Markdown corpus for AI/RAG | Crawl |
The Render pipeline strips boilerplate (nav, header, footer, cookie banners, ads) using HTMLRewriter before converting to Markdown, producing cleaner output than raw HTML conversion.
The Crawl feature uses the Browser Rendering /crawl REST API to discover and convert pages starting from a URL. Results are saved as organized .md files under a local directory (e.g. domain/path.md).
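The `domain/path.md` layout could be produced by a mapping along these lines. This is only a sketch: the function name `pagePath`, the `index` fallback for the site root, and the stripping of `.html`/`.htm` are assumptions, and the app's actual naming rules may differ.

```typescript
// Hypothetical mapping from a crawled page URL to a local Markdown path.
function pagePath(pageUrl: string): string {
  const url = new URL(pageUrl);
  // Drop leading/trailing slashes, then a trailing ".html" or ".htm" extension.
  let path = url.pathname.replace(/^\/+|\/+$/g, "").replace(/\.html?$/i, "");
  // The site root has no path segment, so fall back to "index" (an assumption).
  if (path === "") {
    path = "index";
  }
  return url.hostname + "/" + path + ".md";
}
```

For example, `pagePath("https://example.com/docs/intro.html")` yields `example.com/docs/intro.md`.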
| Plan | Browser Time | Rate Limit | Cost |
|---|---|---|---|
| Free | 10 min/day | 6 req/min | Free |
| Paid | 10 hrs/month | 600 req/min | $5/month |
The free tier is sufficient for occasional use. Render responses are cached for 1 hour. Crawl with render: true (default) is billed as Browser Rendering hours; render: false runs on Workers and is free during beta. The app defaults to a 50-page limit to prevent accidental cost overrun. See Browser Rendering Pricing.
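To stay under a per-minute quota such as the free tier's 6 req/min, a caller could throttle itself before sending Render requests. The sliding-window limiter below is a hypothetical client-side sketch, not something the app or Worker is documented to do; `createRateLimiter` and its timestamp parameter are illustrative names.

```typescript
// Hypothetical client-side limiter: allows at most maxPerMinute calls
// within any rolling 60-second window.
function createRateLimiter(maxPerMinute: number): (nowMs: number) => boolean {
  let timestamps: number[] = [];
  return (nowMs: number): boolean => {
    // Keep only requests made during the last 60 seconds.
    timestamps = timestamps.filter((t) => nowMs - t < 60000);
    if (timestamps.length >= maxPerMinute) {
      return false; // Quota used up; the caller should wait and retry.
    }
    timestamps.push(nowMs);
    return true;
  };
}
```

Passing the current time as an argument (for example `Date.now()`) keeps the limiter deterministic and easy to test.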
| Format | Cost |
|---|---|
| PDF, DOCX, XLSX, HTML, CSV, XML | Free (no AI Neurons) |
| Images (JPG, PNG, WebP, SVG) | AI Neurons (OCR) |
The app shows a confirmation dialog before processing images.
| Category | Extensions |
|---|---|
| Documents | .pdf, .docx, .xlsx |
| Web/data | .html, .htm, .csv, .xml |
| Images | .jpg, .jpeg, .png, .webp, .svg |
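The format tables above can be expressed as a small lookup, which also shows when a confirmation step would apply. This is an illustrative sketch only; the type and function names are hypothetical and not taken from the app's source.

```typescript
type ImportCategory = "document" | "web-data" | "image";

// Extension-to-category table, copied from the supported formats listed above.
const EXTENSION_CATEGORIES: Record<string, ImportCategory> = {
  ".pdf": "document", ".docx": "document", ".xlsx": "document",
  ".html": "web-data", ".htm": "web-data", ".csv": "web-data", ".xml": "web-data",
  ".jpg": "image", ".jpeg": "image", ".png": "image", ".webp": "image", ".svg": "image",
};

// Returns the category for a file name, or null if the extension is unsupported.
function categorize(fileName: string): ImportCategory | null {
  const dot = fileName.lastIndexOf(".");
  if (dot < 0) {
    return null;
  }
  const ext = fileName.slice(dot).toLowerCase();
  return EXTENSION_CATEGORIES[ext] ?? null;
}

// Images consume AI Neurons (OCR), so they warrant a confirmation step.
function needsConfirmation(fileName: string): boolean {
  return categorize(fileName) === "image";
}
```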
Ensure CLOUDFLARE_API_TOKEN is set and the token has Workers Scripts: Edit permission. See API Token.
Set the secrets as described in Set Worker Secrets. This is required for Render and Crawl features.
- 403: Check that your API token has the `Browser Rendering - Edit` permission
- 429: Rate limit exceeded (free tier: 6 req/min)
- Timeout: Complex pages may take longer; try standard fetch instead
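A caller could turn these cases into user-facing hints roughly as follows. The function name `renderErrorHint` is hypothetical, and treating HTTP 408 and 504 as the timeout signal is an assumption; the Worker may report timeouts differently.

```typescript
// Hypothetical helper mapping a failed Render/Crawl response to a troubleshooting hint.
function renderErrorHint(status: number): string {
  switch (status) {
    case 403:
      return "Check that the API token has the Browser Rendering Edit permission.";
    case 429:
      return "Rate limit exceeded (free tier: 6 req/min); wait and retry.";
    case 408: // Assumed timeout codes; not confirmed by the Worker's documentation.
    case 504:
      return "Timed out; complex pages may take longer, so try standard fetch instead.";
    default:
      return "Unexpected status " + status + ".";
  }
}
```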
If the app can't auto-detect your API token, it will prompt you to paste one manually. Create a token with the permissions listed in API Token.
If you skip this step, the Worker is still deployed and Document Import works — only Render and Crawl require the secrets.
The app shows an "Update available" badge in Settings when your deployed Worker is older than the version bundled with the app. To update:
cd worker && wrangler deploy
No app-side changes are needed: the URL stays the same and secrets persist across deploys. Click Test in Settings to verify the new version.
When to update: After installing a new version of MarkUpsideDown, check Settings → Worker Status. If it shows "Update available", redeploy the Worker. New MCP tools (e.g., extract_json) may require Worker endpoints that don't exist in older versions.
The Worker exposes its version via GET /health:
{ "status": "ok", "version": 6, "capabilities": { "fetch": true, "convert": true, "render": true, "json": true, "crawl": true, "cache": true, "batch": true, "publish": true, "search": true } }
The capabilities object shows which features are available:
- `render`, `json`, `crawl`: require Worker secrets (`CLOUDFLARE_ACCOUNT_ID` and `CLOUDFLARE_API_TOKEN`)
- `cache`: requires KV namespace binding
- `batch`: requires Queue + KV bindings
- `publish`: requires R2 bucket binding
- `search`: requires Vectorize index binding
If capabilities show false, the corresponding resource was not created during setup. Re-run setup or create resources manually.
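A client could interpret the `/health` payload along these lines. The interface mirrors the JSON shape shown above; the helper names and the idea of comparing against a bundled version number are illustrative assumptions rather than the app's actual implementation.

```typescript
// Shape of the GET /health response, based on the example payload above.
interface HealthResponse {
  status: string;
  version: number;
  capabilities: Record<string, boolean>;
}

// Names of capabilities the Worker reports as unavailable (false).
function missingCapabilities(health: HealthResponse): string[] {
  return Object.keys(health.capabilities).filter((name) => !health.capabilities[name]);
}

// True when the deployed Worker is older than the version bundled with the app.
function updateAvailable(health: HealthResponse, bundledVersion: number): boolean {
  return health.version < bundledVersion;
}
```

The payload itself could be obtained with `fetch(workerUrl + "/health")` followed by `response.json()`.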
The Worker includes permissive CORS headers (*). To restrict origins, edit CORS_HEADERS in worker/src/index.ts.
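One way to restrict origins is to echo back only allowlisted origins instead of `*`. The sketch below is hypothetical: `ALLOWED_ORIGINS`, `corsHeadersFor`, and the method and header lists are assumptions, and the real `CORS_HEADERS` constant in worker/src/index.ts may be structured differently.

```typescript
// Hypothetical allowlist; replace with the origins that should reach your Worker.
const ALLOWED_ORIGINS = ["https://app.example.com"];

// Builds CORS headers for a request, granting access only to allowlisted origins.
function corsHeadersFor(requestOrigin: string | null): Record<string, string> {
  const headers: Record<string, string> = {
    // Assumed method and header lists; adjust to what the Worker actually accepts.
    "Access-Control-Allow-Methods": "GET, POST, OPTIONS",
    "Access-Control-Allow-Headers": "Content-Type",
    // Responses now differ per origin, so caches must key on the Origin header.
    "Vary": "Origin",
  };
  if (requestOrigin !== null && ALLOWED_ORIGINS.indexOf(requestOrigin) !== -1) {
    headers["Access-Control-Allow-Origin"] = requestOrigin;
  }
  return headers;
}
```

In a Worker's fetch handler, the origin would typically be read with `request.headers.get("Origin")`, which returns `null` when the header is absent.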