BotSight is a codebase that makes websites friendly to agentic browsers. It crawls a webpage, validates the scrape, extracts structured data, generates metadata and snippets, and ships the result as an npm module or snippet that site owners can embed. The goal is to make pages visible and actionable for AI agents and LLM crawlers.
This monorepo contains the following packages:
- botsight-core: Core scraping, validation, and extraction logic with FireCrawl integration for enhanced data extraction.
- botsight-cli: CLI tool to run BotSight locally.
- botsight-npm: Reusable npm package snippet for websites.
- botsight-dashboard: (Planned) Analytics and simulation dashboard.
- botsight-server: Backend server with telemetry, configuration, and simulation endpoints.
- botsight-snippet: Client-side snippet for easy website integration.
- Node.js >= 16
- pnpm
- PostgreSQL database
- Redis server
pnpm install
pnpm build

# Initialize configuration
botsight init
# Crawl a website
botsight crawl https://example.com
# Validate a website
botsight validate https://example.com
# Generate snippet from JSON data
botsight generate data.json

npm install botsight-snippet

import { injectBotSight } from "botsight-snippet";
injectBotSight({
  name: "Client Site",
  url: "https://client.com",
  offers: [...]
});

# Navigate to server directory
cd server
# Set environment variables
export DATABASE_URL=postgresql://user:password@localhost:5432/botsight_db
export REDIS_URL=redis://localhost:6379
# Run database migrations
npm run migrate
# Seed initial data
npm run seed
# Start the server
npm start
# In separate terminals, start worker and agent sync
npm run worker
npm run sync-agents

Add this to your website's HTML:
<script
src="https://your-server.com/botsight.iife.js"
data-site-id="your-site-id"
async>
</script>

├── botsight-core/           # Core logic with FireCrawl integration
│   ├── crawler.ts           # Page crawling (static + dynamic + FireCrawl)
│   ├── validator.ts         # Scrape validation
│   ├── extractor.ts         # Structured data extraction
│   └── snippet-generator.ts # Snippet generation
├── botsight-cli/            # CLI interface
│   └── cli.ts               # Command definitions
├── botsight-npm/            # Embeddable snippet
│   └── index.ts             # Browser-compatible JS
├── botsight-server/         # Backend server
│   ├── src/
│   │   ├── routes/          # API endpoints
│   │   ├── workers/         # Playwright simulation worker
│   │   ├── jobs/            # Agent sync job
│   │   └── db/              # Database connection
│   └── db/ddl/              # Database schema
├── botsight-snippet/        # Client-side snippet
│   └── src/
└── botsight-dashboard/      # (Future) Analytics dashboard

- Enhanced Page Crawling: Static + dynamic rendering with Playwright and FireCrawl integration
- Scrape Validation: Confidence scoring and missing element detection
- Structured Data Extraction: JSON-LD, OpenGraph, Twitter Cards, etc.
- Snippet Generation: Agent-readable structured data with BotSight-specific metadata
- NPM Package: One-line integration for site owners
- CLI Tool: Easy local testing and onboarding
- Auto Agent Detection: Automatically detect new AI agents via user-agent database
- Site-owner Analytics: Track which agents visited and what they extracted
- Dashboard Simulation: See how GPT/Perplexity and other agents view your page
- Production Ready: Security, privacy, and performance optimized
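
The scrape validation feature listed above (confidence scoring plus missing element detection) could be sketched roughly as follows. This is a minimal illustration, not the actual `validator.ts` API: the `ScrapeResult` fields, the weights, and the `validateScrape` function are all assumptions chosen for the example.

```typescript
// Hypothetical shape of what a crawl might yield; not BotSight's real types.
interface ScrapeResult {
  title?: string;
  description?: string;
  jsonLd?: unknown[];
  openGraph?: Record<string, string>;
}

interface ValidationReport {
  confidence: number; // 0 (nothing found) to 1 (everything found)
  missing: string[];  // names of the fields that were absent or empty
}

// Assumed weights in percentage points; they sum to 100.
const FIELD_WEIGHTS: Record<keyof ScrapeResult, number> = {
  title: 30,
  description: 20,
  jsonLd: 30,
  openGraph: 20,
};

// Treats null, undefined, blank strings, empty arrays, and empty objects as absent.
function isPresent(value: unknown): boolean {
  if (value === null || value === undefined) return false;
  if (typeof value === "string") return value.trim().length > 0;
  if (Array.isArray(value)) return value.length > 0;
  if (typeof value === "object") return Object.keys(value as object).length > 0;
  return true;
}

// Sums the weights of the present fields and lists the missing ones.
function validateScrape(result: ScrapeResult): ValidationReport {
  let points = 0;
  const missing: string[] = [];
  const fields = Object.keys(FIELD_WEIGHTS) as (keyof ScrapeResult)[];
  for (const field of fields) {
    if (isPresent(result[field])) {
      points += FIELD_WEIGHTS[field];
    } else {
      missing.push(field);
    }
  }
  return { confidence: points / 100, missing };
}
```

With these assumed weights, a page that has a title and JSON-LD but no description or OpenGraph tags would score 0.6 and report `description` and `openGraph` as missing.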
POST /v1/telemetry
- Accepts telemetry data from client snippets
- Automatically matches user agents to known AI agents
- Stores visit data and extracted fields for analytics
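
Matching a user agent against known AI agents might look something like the sketch below. The `KnownAgent` type, the `matchAgent` function, and the hard-coded list are assumptions for illustration; the server described here syncs its agent list from a database, and its real matching logic may differ.

```typescript
// Hypothetical record for a known AI agent.
interface KnownAgent {
  name: string;
  pattern: RegExp; // case-insensitive token expected in the User-Agent header
}

// Illustrative entries only; a deployment would load these from the agents database.
const KNOWN_AGENTS: KnownAgent[] = [
  { name: "GPTBot", pattern: /GPTBot/i },
  { name: "PerplexityBot", pattern: /PerplexityBot/i },
  { name: "ClaudeBot", pattern: /ClaudeBot/i },
];

// Returns the name of the first matching agent, or null when none matches.
// A null result is where an unknown-agent candidate could be recorded for review.
function matchAgent(userAgent: string, agents: KnownAgent[] = KNOWN_AGENTS): string | null {
  for (const agent of agents) {
    if (agent.pattern.test(userAgent)) {
      return agent.name;
    }
  }
  return null;
}
```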
GET /v1/config/:siteId
- Returns site configuration for snippets
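
As a rough sketch, a client snippet could derive the configuration URL from its server origin and site ID like this. `buildConfigUrl` is a hypothetical helper, not part of the published snippet, and the response format is not shown because it is not documented here.

```typescript
// Builds the config endpoint URL for a site. The site ID is percent-encoded
// so that characters such as "/" cannot change the request path.
function buildConfigUrl(serverBase: string, siteId: string): string {
  const path = "/v1/config/" + encodeURIComponent(siteId);
  return new URL(path, serverBase).toString();
}
```

For example, `buildConfigUrl("https://your-server.com", "your-site-id")` yields `https://your-server.com/v1/config/your-site-id`.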
POST /v1/simulate
- Enqueues Playwright jobs to simulate how agents see pages
- Captures JSON-LD, meta tags, content, and screenshots
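
To give a feel for the capture step, the sketch below pulls JSON-LD blocks and meta tags out of an HTML string. It is a simplified, regex-based approximation under stated assumptions: the actual worker runs in Playwright and would more likely query the rendered DOM, and `captureFromHtml` and `CapturedData` are hypothetical names. The meta pattern also assumes the `property` or `name` attribute comes before `content`.

```typescript
// Hypothetical container for what a simulation run captures from markup.
interface CapturedData {
  jsonLd: unknown[];            // parsed JSON-LD blocks
  meta: Record<string, string>; // meta name/property -> content
}

function captureFromHtml(html: string): CapturedData {
  const jsonLd: unknown[] = [];
  const meta: Record<string, string> = {};

  // Collect the body of every <script type="application/ld+json"> element.
  const scriptRe = /<script[^>]*type=["']application\/ld\+json["'][^>]*>([\s\S]*?)<\/script>/gi;
  let match: RegExpExecArray | null;
  while ((match = scriptRe.exec(html)) !== null) {
    try {
      jsonLd.push(JSON.parse(match[1]));
    } catch (_err) {
      // Skip blocks that are not valid JSON instead of failing the whole capture.
    }
  }

  // Collect <meta property="..." content="..."> and <meta name="..." content="...">.
  const metaRe = /<meta\s+(?:property|name)=["']([^"']+)["']\s+content=["']([^"']*)["'][^>]*>/gi;
  while ((match = metaRe.exec(html)) !== null) {
    meta[match[1]] = match[2];
  }

  return { jsonLd, meta };
}
```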
- Automatic sync with remote agents database
- Unknown agent detection and candidate review system
- Comprehensive database schema for tracking visits, agents, and extracted data
- Example SQL queries for insights and reporting
- Dashboard options for real-time monitoring
- Performance and security optimized
- Quick Start Guide - Get up and running in under 30 minutes
- Setup Guide - Detailed deployment instructions
- Database Setup - Database configuration and initialization
- Database Operations - Monitoring and analysis queries
- Monitoring Dashboard - Dashboard setup options
- Troubleshooting - Common issues and solutions
MIT