crawl4ai version
0.8.6
Expected Behavior
MCP scrape tools (`crawl4ai_md`, `crawl4ai_html`, etc.) should accept the same `wait_until`, `delay_before_return_html`, and `cache_mode` parameters as the REST API (`POST /crawl`) and CLI (`crwl -c`). This would allow MCP-based agents to reliably scrape JavaScript-heavy pages by waiting for content to fully load.
Example desired usage:
```json
{
  "url": "https://example.com/dynamic-content",
  "wait_until": "networkidle",
  "delay_before_return_html": 2
}
```
Environment

| Component | Version |
| --- | --- |
| Crawl4AI | 0.8.6 |
| OS | Debian GNU/Linux 12 (bookworm) |
| Python | 3.12.13 |
| Image | unclecode/crawl4ai:latest |
Suggested Fix
- Expose `crawler_config` parameters on the MCP tool schemas: map `wait_until`, `delay_before_return_html`, `cache_mode`, etc. to the existing REST/CLI options
- Document MCP defaults vs. REST/CLI behavior
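The suggested mapping can be sketched as a small translation layer. This is an illustrative sketch only, not Crawl4AI's actual code: the function name is made up, while the parameter names and the `wait_until` values mirror the existing REST/CLI options.

```python
# Hypothetical sketch: translate optional MCP tool arguments into the
# crawler_config dict that the existing REST handler already consumes.
# Accepted wait_until values follow the ones listed in this issue.
VALID_WAIT_UNTIL = {"load", "domcontentloaded", "networkidle", "commit"}

def mcp_args_to_crawler_config(wait_until=None,
                               delay_before_return_html=None,
                               cache_mode=None):
    """Build a crawler_config dict from optional MCP tool arguments."""
    config = {}
    if wait_until is not None:
        if wait_until not in VALID_WAIT_UNTIL:
            raise ValueError(f"invalid wait_until: {wait_until!r}")
        config["wait_until"] = wait_until
    if delay_before_return_html is not None:
        config["delay_before_return_html"] = float(delay_before_return_html)
    if cache_mode is not None:
        config["cache_mode"] = cache_mode
    return config
```

Keeping the translation in one place would let every MCP scrape tool share the same schema additions and validation.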
Acceptance Criteria
- MCP scrape tools accept a `wait_until` parameter (`load`, `domcontentloaded`, `networkidle`, `commit`)
- MCP scrape tools accept a `delay_before_return_html` parameter
Current Behavior
MCP tools return immediately after initial DOM load, without waiting for dynamic content. No parameters are exposed to control wait behavior.
- REST API with `crawler_config.wait_until: "networkidle"` → ✅ Complete rendered content
- CLI with `-c 'wait_until=networkidle'` → ✅ Complete rendered content
- MCP tools → ❌ Incomplete content (dynamic elements missing)
Current workaround: Bypass MCP entirely and use `POST /crawl` directly.
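For agents scripted in Python, the workaround can be done with the stdlib alone. This is a hedged sketch: the endpoint and request body mirror the curl command in the repro steps below, while the helper names are illustrative.

```python
# Sketch of the current workaround: skip MCP and call POST /crawl directly.
import json
import urllib.request

def build_crawl_request(url, server="http://localhost:11235"):
    """Prepare a POST /crawl request that waits for network idle."""
    body = json.dumps({
        "urls": [url],
        "crawler_config": {"wait_until": "networkidle", "cache_mode": "bypass"},
    }).encode("utf-8")
    return urllib.request.Request(
        f"{server}/crawl",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def crawl_rendered(url, server="http://localhost:11235"):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(build_crawl_request(url, server)) as resp:
        return json.load(resp)
```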
Is this reproducible?
Yes
Inputs Causing the Bug
Any JavaScript-heavy / AJAX-driven page where content loads after initial page load.
Steps to Reproduce
1. Start Crawl4AI with MCP enabled at `/mcp/sse`
2. Configure any MCP client (e.g., OpenCode) with the Crawl4AI MCP server
3. Call MCP scrape tool: `crawl4ai_md(url="https://example.com/dynamic-content")`
4. Observe: Content is incomplete (dynamic elements not rendered)
5. Compare with working REST call:
   ```bash
   curl -s http://localhost:11235/crawl \
     -H 'Content-Type: application/json' \
     -d '{
       "urls": ["https://example.com/dynamic-content"],
       "crawler_config": {
         "wait_until": "networkidle",
         "cache_mode": "bypass"
       }
     }'
   ```
6. Observe: REST returns complete rendered content
Code snippets
OS
Debian GNU/Linux 12 (bookworm)
Python version
3.12.13
Browser
No response
Browser version
No response
Error logs & Screenshots (if applicable)
No response