A semantic chunking service for documents, GitHub repos, webpages, and Confluence. It creates semantically relevant chunks from your content that can be queried later.
- Multi-source ingestion: Markdown files, GitHub repositories, webpages, Confluence pages
- Website crawling: Ingest entire documentation sites via sitemap or link crawling
- Semantic chunking: Intelligent text chunking with code-awareness
- Code-aware: Extracts code blocks with context and language detection
- Per-library indexing: Each library/site gets its own Qdrant collection plus a central
librariesindex - Vector search: Powered by Qdrant for fast semantic search
- Fast embeddings: Uses FastEmbed / BAAI bge models for efficient embedding generation
- HTTP API + background worker: Queue-based ingestion API with an async worker that processes jobs in the background
- MCP server: First-class Model Context Protocol (MCP) server exposing DeepContext as a documentation tool
# Using Docker
docker compose up -d
# Or using Podman
podman compose up -duv sync# From a local markdown file
uv run deepcontext ingest ./docs/README.md
# From a GitHub URL
uv run deepcontext ingest https://github.com/vercel/next.js/blob/canary/contributing/core/developing.md
# From a webpage
uv run deepcontext ingest https://nextjs.org/docs/app/building-your-application/routing/middleware
# From a Confluence page (requires auth)
export CONFLUENCE_EMAIL=you@company.com
export CONFLUENCE_TOKEN=your_api_token
uv run deepcontext ingest https://company.atlassian.net/wiki/spaces/DOCS/pages/123456/Page-Title
# From an entire Confluence space
uv run deepcontext ingest-confluence https://company.atlassian.net/wiki DOCS --limit 50
# From an entire website (uses sitemap or crawls links)
uv run deepcontext ingest-website https://ui.shadcn.com/docs --max-pages 50uv run deepcontext search "middleware authentication"You can also run DeepContext as a long-lived HTTP service that exposes:
- REST API for ingestion, search, stats, and library metadata
- MCP server mounted at
/mcpfor streamable-http MCP clients
# Start the API + MCP service (FastAPI + Uvicorn)
uv run deepcontext serve --host 0.0.0.0 --port 8000
# Open interactive docs
# http://localhost:8000/docsFor stdio-based MCP clients (e.g. local tools), you can also run:
uv run deepcontext-mcpand configure your MCP client to talk to the deepcontext-mcp command.
Search Results for: how to redirect in middleware
────────────────────────────────────────────────────────────
### Example
Source: /path/to/middleware.md
Score: 0.6693
Example typescript code demonstrating implementation.
┌──────────────────────────────────────────────────────────┐
│ import { NextResponse } from 'next/server' │
│ import type { NextRequest } from 'next/server' │
│ │
│ export function middleware(request: NextRequest) { │
│ return NextResponse.redirect(new URL('/home', ...)) │
│ } │
│ │
│ export const config = { │
│ matcher: '/about/:path*', │
│ } │
└──────────────────────────────────────────────────────────┘
────────────────────────────────────────────────────────────
Ingest content from any supported source.
SOURCE can be:
- A local file path (e.g.,
./docs/README.md) - A GitHub URL (e.g.,
https://github.com/vercel/next.js/blob/canary/docs/...) - A Confluence page (e.g.,
https://company.atlassian.net/wiki/spaces/DOCS/pages/123456) - A webpage URL (e.g.,
https://example.com/docs)
Options:
--chunk-size: Maximum chunk size in tokens (default: 1024)--threshold: Similarity threshold for semantic chunking (default: 0.7)--code-aware/--no-code-aware: Use code-aware chunking (default: enabled)--collection: Qdrant collection name (default: deepcontext_chunks)--host: Qdrant host (default: localhost)--port: Qdrant port (default: 6333)
Ingest all documentation from a GitHub repository.
Options:
--branch: Branch to fetch from (default: auto-detect)--path: Path within the repo to start from--extensions: File extensions to fetch (default: .md,.mdx)--collection: Qdrant collection name (default: deepcontext_chunks)--host: Qdrant host (default: localhost)--port: Qdrant port (default: 6333)
Authentication: Set GITHUB_TOKEN environment variable for higher rate limits (5,000/hour vs 60/hour for unauthenticated requests).
# Create token at https://github.com/settings/tokens with 'public_repo' scope
export GITHUB_TOKEN=ghp_your_token_here
uv run deepcontext ingest-repo vercel/next.js --path docsIngest all pages from a Confluence space.
Requires environment variables:
CONFLUENCE_EMAIL: Your Atlassian account emailCONFLUENCE_TOKEN: Your Atlassian API token (create one here)
Options:
--limit: Maximum pages to fetch (default: 100)--collection: Qdrant collection name (default: deepcontext_chunks)--host: Qdrant host (default: localhost)--port: Qdrant port (default: 6333)
Example:
export CONFLUENCE_EMAIL=you@company.com
export CONFLUENCE_TOKEN=your_api_token
uv run deepcontext ingest-confluence https://company.atlassian.net/wiki ENGINEERINGIngest all pages from a website using sitemap or link crawling.
The command will:
- Try to find and parse sitemap.xml
- If no sitemap, crawl links from the base URL
- Filter URLs to only those under the base path
Options:
--max-pages: Maximum pages to fetch (default: 100)--pattern: URL pattern to filter pages (regex)--no-sitemap: Don't try to use sitemap.xml, crawl links instead--collection: Qdrant collection name (default: deepcontext_chunks)--host: Qdrant host (default: localhost)--port: Qdrant port (default: 6333)
Examples:
# Ingest shadcn docs (uses sitemap automatically)
uv run deepcontext ingest-website https://ui.shadcn.com/docs
# Ingest with URL pattern filter
uv run deepcontext ingest-website https://nextjs.org/docs --pattern "/docs/app/"
# Force crawling instead of sitemap
uv run deepcontext ingest-website https://example.com/docs --no-sitemapSearch for semantically similar chunks.
Options:
--limit: Maximum results (default: 5)--language: Filter by code language--source-type: Filter by source type (markdown, github, webpage, confluence)--collection: Qdrant collection name (default: deepcontext_chunks)--host: Qdrant host (default: localhost)--port: Qdrant port (default: 6333)
Show statistics about the vector store.
Options:
--collection: Qdrant collection name (default: deepcontext_chunks)--host: Qdrant host (default: localhost)--port: Qdrant port (default: 6333)
Clear all data from the vector store. Requires confirmation.
Options:
--collection: Qdrant collection name (default: deepcontext_chunks)--host: Qdrant host (default: localhost)--port: Qdrant port (default: 6333)
DeepContext also supports background ingestion via a simple job queue. This is useful when you want to queue many large ingestions and process them with a worker.
Queue a single source for ingestion (queued version of deepcontext ingest).
Options mirror deepcontext ingest:
--host,--port,--collection--chunk-size,--threshold,--code-aware/--no-code-aware
The command prints a job ID; use deepcontext jobs / deepcontext job to inspect it.
Queue a GitHub repository for ingestion (queued version of ingest-repo).
Options mirror deepcontext ingest-repo:
--branch,--path,--extensions,--host,--port,--collection
Queue a Confluence space for ingestion (queued version of ingest-confluence).
Options mirror deepcontext ingest-confluence:
--limit,--host,--port,--collection
Queue a website for ingestion (queued version of ingest-website).
Options mirror deepcontext ingest-website:
--max-pages,--pattern,--no-sitemap,--host,--port,--collection
List jobs in the queue with optional filters:
--status:pending,processing,completed,failed--limit: maximum number of jobs to show (default: 20)
Show details for a specific job, including status, counts, timestamps and any error message.
Run a background worker that polls the queue and processes ingestion jobs:
--once: process a single job and exit--poll-interval: seconds between polling for new jobs (default: 2.0)
This is useful alongside the HTTP API, which also uses the same queue internally.
Cancel/delete a job from the queue. If the job is currently processing, it is removed from the queue but may still complete.
Clear all completed and failed jobs from the queue.
Start the HTTP API server (FastAPI + Uvicorn) with the MCP HTTP server mounted at /mcp:
uv run deepcontext serve --host 0.0.0.0 --port 8000Options:
--host: Host to bind to (default:0.0.0.0)--port: Port to bind to (default:8000)--reload: Enable auto-reload for development
┌─────────────────┐ ┌─────────────────┐ ┌────────────────────────────┐
│ ContentFetcher │ ──▶ │ DocumentChunker │ ──▶ │ VectorStore (per-lib) │
│ │ │ /SemanticChunker│ │ + LibraryStore (meta) │
│ - Markdown │ │ - Code-aware │ │ │
│ - GitHub │ │ - Section-based │ │ Qdrant collections: │
│ - Webpages │ │ - Smart titles │ │ - one per library/site │
│ - Websites │ │ │ │ - one global `libraries` │
│ - Confluence │ │ │ │ │
└─────────────────┘ └─────────────────┘ └────────────────────────────┘
┌──────────────────────────────────────────────────────────┐
│ DeepContext Service │
│ - HTTP API (/ingest, /search, /libraries, /stats, …) │
│ - Background worker processing queued jobs │
│ - MCP server (streamable-http) mounted at `/mcp` │
└──────────────────────────────────────────────────────────┘
from deepcontext import ContentFetcher, VectorStore
from deepcontext.chunker import ChunkConfig, DocumentChunker
# Fetch content
with ContentFetcher() as fetcher:
doc = fetcher.fetch("https://github.com/vercel/next.js/blob/canary/docs/...")
# Chunk with code awareness
config = ChunkConfig(
max_chunk_size=2000, # Max characters per chunk
min_chunk_size=200, # Min characters for a chunk
overlap_size=100, # Overlap between chunks
)
chunker = DocumentChunker(config)
chunks = chunker.chunk_document(doc)
# Store and search
store = VectorStore()
store.index_chunks(chunks)
results = store.search("middleware authentication")
for result in results:
print(result.chunk.to_display_format())
print(f"Score: {result.score}")
store.close()from deepcontext import ContentFetcher
with ContentFetcher() as fetcher:
# Fetch entire GitHub repo
docs = fetcher.fetch_github_repo("vercel/next.js", path="docs")
# Fetch Confluence space
docs = fetcher.fetch_confluence_space(
base_url="https://company.atlassian.net/wiki",
space_key="DOCS",
limit=50,
)
# Fetch entire website
docs = fetcher.fetch_website(
base_url="https://ui.shadcn.com/docs",
max_pages=100,
url_pattern=r"/docs/", # Optional regex filter
)| Variable | Description |
|---|---|
GITHUB_TOKEN |
GitHub personal access token for higher API rate limits |
CONFLUENCE_EMAIL |
Atlassian account email for Confluence access |
CONFLUENCE_TOKEN |
Atlassian API token for Confluence access |
QDRANT_HOST |
Host for the Qdrant instance used by the HTTP API / MCP server (default: localhost) |
QDRANT_PORT |
Port for the Qdrant instance (default: 6333) |
DEFAULT_COLLECTION |
Default collection name for legacy CLI commands (default: deepcontext_chunks) |
You can also use a .env file in the project root - it will be loaded automatically.
- Python 3.12+
- Docker or Podman (for Qdrant)
MIT