feat add fetch_url_tool so AI chat can read direct URLs#7328
krushnarout wants to merge 5 commits into
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Greptile Summary
This PR adds
Confidence Score: 2/5
Not safe to merge as-is: any authenticated user can point the tool at cloud metadata services or internal network addresses and read the response directly in their AI chat. The new tool makes outbound HTTP calls to arbitrary user-controlled URLs without filtering link-local or private-network destinations. On a cloud-hosted backend this directly exposes IAM credential endpoints and internal services to any user who can send a message to the AI chat. backend/utils/retrieval/tools/web_tools.py needs a hostname/IP block-list and a response-size guard before this ships.
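The hostname/IP block-list requested above could look roughly like the following. This is a minimal sketch using only the standard library, not the code in this PR; the names `is_public_ip` and `is_url_allowed` are hypothetical, and the `resolve` parameter exists only so the resolver can be swapped out in tests.

```python
import ipaddress
import socket
from urllib.parse import urlsplit


def is_public_ip(address: str) -> bool:
    """Return True only if the address is not private, loopback, link-local, etc."""
    try:
        ip = ipaddress.ip_address(address)
    except ValueError:
        return False  # unparseable address: refuse rather than guess
    return not (
        ip.is_private
        or ip.is_loopback
        or ip.is_link_local  # covers 169.254.0.0/16, incl. cloud metadata endpoints
        or ip.is_multicast
        or ip.is_reserved
        or ip.is_unspecified
    )


def is_url_allowed(url: str, resolve=socket.getaddrinfo) -> bool:
    """Hypothetical pre-flight check: scheme must be http(s) and every
    address the hostname resolves to must be public."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return False
    try:
        port = parts.port or (443 if parts.scheme == "https" else 80)
        infos = resolve(parts.hostname, port, type=socket.SOCK_STREAM)
    except (OSError, ValueError, UnicodeError):
        return False  # bad port, failed lookup, or invalid hostname
    # info[4][0] is the resolved IP string in each getaddrinfo() result
    return bool(infos) and all(is_public_ip(info[4][0]) for info in infos)
```

A check like this only covers the first hop. Because the request follows redirects, each redirect target would need the same validation, and a hostname can resolve differently between the check and the actual connection, so this should be read as a starting point rather than complete SSRF protection.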
| Filename | Overview |
|---|---|
| backend/utils/retrieval/tools/web_tools.py | New tool that fetches arbitrary user-supplied URLs with no SSRF protection, no response-size cap before decode, and shared use of the webhook connection pool. |
| backend/utils/retrieval/agentic.py | Adds fetch_url_tool to CORE_TOOLS and the display-name map; change is mechanical and correct. |
| backend/utils/retrieval/tools/__init__.py | Exports fetch_url_tool from the new web_tools module; straightforward import/export addition. |
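For the missing response-size cap noted in the table, one option is to read the body in chunks and stop once a byte budget is reached, instead of decoding the whole response. The sketch below is an illustration under assumptions: `read_capped` and `MAX_BODY_BYTES` are hypothetical names, the 1 MB value is arbitrary, and the chunks are assumed to come from a streaming HTTP response (for example, an iterator of byte chunks exposed by the client library).

```python
from typing import Iterable

MAX_BODY_BYTES = 1_000_000  # hypothetical cap, not a value taken from the PR


def read_capped(chunks: Iterable[bytes], max_bytes: int = MAX_BODY_BYTES) -> bytes:
    """Accumulate chunks until max_bytes is reached, then stop reading."""
    buf = bytearray()
    for chunk in chunks:
        # Take only as much of this chunk as still fits in the budget.
        buf.extend(chunk[: max_bytes - len(buf)])
        if len(buf) >= max_bytes:
            break  # stop pulling from the source once the cap is hit
    return bytes(buf)
```

Because the loop breaks as soon as the budget is used up, a very large or endless response is never fully pulled into memory before truncation.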
Sequence Diagram
sequenceDiagram
participant User as User (Mobile Chat)
participant Claude as Claude LLM
participant Tool as fetch_url_tool
participant Client as get_webhook_client()
participant Ext as External URL
User->>Claude: "Summarize https://example.com/article"
Claude->>Tool: fetch_url_tool(url)
Tool->>Tool: Validate scheme (http/https only)
Tool->>Client: "GET url (timeout=15s, follow_redirects=True)"
Client->>Ext: HTTP GET request
Ext-->>Client: HTTP response (status, content-type, body)
Client-->>Tool: response object
Tool->>Tool: Check status code
Tool->>Tool: Check content-type
Tool->>Tool: _html_to_text(response.text) — full body in memory
Tool->>Tool: Truncate to 8000 chars
Tool-->>Claude: Content from url
Claude-->>User: Summary of page content
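The last two tool steps in the diagram (HTML to text, then truncation to 8000 chars) can be sketched with the standard library alone. This is not the PR's `_html_to_text`, whose implementation is not shown here; `html_to_text` and `_TextExtractor` below are hypothetical stand-ins that assume script and style content should be dropped.

```python
from html.parser import HTMLParser

MAX_CHARS = 8000  # truncation limit named in the PR summary


class _TextExtractor(HTMLParser):
    """Collects text nodes, ignoring anything inside script/style/noscript."""

    SKIP = {"script", "style", "noscript"}

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0
        self.parts: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


def html_to_text(html: str, max_chars: int = MAX_CHARS) -> str:
    """Hypothetical helper: strip tags, collapse whitespace, truncate."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    # Joining with spaces keeps block elements apart, at the cost of
    # occasionally splitting a word that spans inline tags.
    text = " ".join(" ".join(parser.parts).split())
    return text[:max_chars]
```

Note that truncating after conversion, as in the diagram, still requires the full body in memory first, which is why a byte-level cap before decoding is a separate concern.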
Reviews (1): Last reviewed commit: "feat wire fetch_url_tool into CORE_TOOLS..."
…x content-type check Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Summary
- Adds fetch_url_tool, which fetches a given URL and strips its HTML, returning up to 8000 chars of readable text
- Wires it into CORE_TOOLS in agentic.py so Claude uses it when a user shares a link

Test plan
- web_search still works for non-URL queries

🤖 Generated with Claude Code