Describe your product. Define your target market. The AI finds the leads for you.
OpenOutreach is a self-hosted, open-source LinkedIn automation tool for B2B lead generation. Unlike other tools, you don't need a list of profiles to contact: you describe your product and your target market, and the system autonomously discovers, qualifies, and contacts the right people.
How it works:
- You provide a product description and a campaign objective (e.g. "SaaS analytics platform" targeting "VP of Engineering at Series B startups")
- The AI generates LinkedIn search queries to discover candidate profiles
- A Bayesian ML model (Gaussian Process on profile embeddings) learns which profiles match your ideal customer, using an explore/exploit strategy to balance finding the best leads now against learning what makes a good lead
- Early on, an LLM classifies each profile; as the model learns, it auto-decides with increasing confidence, reducing LLM calls
- Qualified leads are automatically contacted with personalized connection requests and follow-up messages
The system gets smarter with every decision. It starts by exploring broadly, then progressively focuses on the highest-value profiles as it learns your ideal customer profile from its own classification history.
Why choose OpenOutreach?
- Autonomous lead discovery: No contact lists needed; AI finds your ideal customers
- Undetectable: Playwright + stealth plugins mimic real user behavior
- Self-hosted + full data ownership: Everything runs locally; browse your CRM in a web UI
- One-command setup: Dockerized deployment, interactive onboarding
- AI-powered messaging: LLM-generated personalized outreach (bring your own model)
Perfect for founders, sales teams, and agencies who want powerful automation without account bans or subscription lock-in.
| # | What | Example |
|---|---|---|
| 1 | A LinkedIn account | Your email + password |
| 2 | An LLM API key | OpenAI, Anthropic, or any OpenAI-compatible endpoint |
| 3 | A product description + target market | "We sell cloud cost optimization for DevOps teams at mid-market SaaS companies" |
That's it. No spreadsheets, no lead databases, no scraping setup.
Pre-built images are published to GitHub Container Registry on every push to master.

    docker run --pull always -it -p 5900:5900 --user "$(id -u):$(id -g)" -v ./assets:/app/assets ghcr.io/eracle/openoutreach:latest

The interactive onboarding walks you through the three inputs above on first run. Your data persists in the local assets/ directory across restarts; it is the same database used by python manage.py.
Connect a VNC client to localhost:5900 to watch the browser live.
For Docker Compose, build-from-source, and more options see the Docker Guide.
For contributors or if you prefer running directly on your machine.

    git clone https://github.com/eracle/OpenOutreach.git
    cd OpenOutreach
    # Install deps, Playwright browsers, run migrations, and bootstrap CRM
    make setup
    make run

The interactive onboarding will prompt for LinkedIn credentials, LLM API key, and campaign details on first run. Fully resumable: stop and restart anytime without losing progress.
OpenOutreach includes a full CRM web interface powered by DjangoCRM:

    # Create an admin account (first time only)
    python manage.py createsuperuser
    # Start the web server
    make admin

Then open:
- Django Admin: http://localhost:8000/admin/
- CRM UI: http://localhost:8000/crm/
| Feature | Description |
|---|---|
| Autonomous Lead Discovery | No contact lists needed: the LLM generates search queries from your product description and campaign objective. |
| Bayesian Active Learning | A Gaussian Process model on profile embeddings learns your ideal customer via explore/exploit, auto-qualifying with increasing accuracy. |
| Stealth Browser Automation | Playwright + stealth plugins mimic real user behavior for undetectable interactions. |
| Voyager API Scraping | Uses LinkedIn's internal API for accurate, structured profile data (no fragile HTML parsing). |
| Stateful Pipeline | Tracks profile states (NEW → PENDING → CONNECTED → COMPLETED) in a local DB; fully resumable. |
| Smart Rate Limiting | Configurable daily/weekly limits per action type; respects LinkedIn's own limits automatically. |
| Built-in CRM | Full data ownership via DjangoCRM with Django Admin UI: browse Leads, Contacts, Companies, and Deals. |
| One-Command Deployment | Dockerized setup with interactive onboarding and VNC browser view (localhost:5900). |
| AI-Powered Messaging | LLM-generated personalized connection and follow-up messages via Jinja2 templates. |
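The Stateful Pipeline row can be pictured as a small forward-only state machine. The following is a minimal sketch, not the project's actual code: the `ProfileState` enum, the `NEXT_STATE` table, and the `advance` helper are hypothetical names, and the events noted in the comments are inferred from the lane descriptions rather than confirmed.

```python
from enum import Enum


class ProfileState(Enum):
    NEW = "new"
    PENDING = "pending"
    CONNECTED = "connected"
    COMPLETED = "completed"


# Forward transitions in the order NEW -> PENDING -> CONNECTED -> COMPLETED.
# The triggering events in the comments are assumptions.
NEXT_STATE = {
    ProfileState.NEW: ProfileState.PENDING,          # connection request sent
    ProfileState.PENDING: ProfileState.CONNECTED,    # request accepted
    ProfileState.CONNECTED: ProfileState.COMPLETED,  # follow-up message sent
}


def advance(state: ProfileState) -> ProfileState:
    """Return the next state, or raise if the profile is already finished."""
    if state not in NEXT_STATE:
        raise ValueError(f"{state.name} is a terminal state")
    return NEXT_STATE[state]
```

Because each profile's state would be stored in the local database, a restarted process only needs to read the stored state to know which step comes next.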
The daemon runs a continuous loop with priority-scheduled action lanes:
| Priority | Lane | What it does |
|---|---|---|
| 1 | Connect | Ranks qualified profiles by Bayesian model probability, sends connection requests (daily + weekly limits) |
| 2 | Check Pending | Checks if pending requests were accepted (exponential backoff) |
| 3 | Follow Up | Sends LLM-personalized messages to connected profiles (daily limit) |
| Gap-filler | Qualify | Bayesian active learning: embeds profiles, then uses explore/exploit to select and classify candidates |
| Lowest | Search | LLM-generated LinkedIn People search keywords discover new profiles when the pipeline runs low |
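One way to read the table above is as a loop that always runs the highest-priority lane that currently has work. The sketch below illustrates that idea only; the `Lane` dataclass, its `can_execute`/`execute` callables, and `pick_lane` are hypothetical names, and the numeric priorities for the gap-filler and lowest lanes are arbitrary.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Lane:
    name: str
    priority: int                    # lower number = considered first
    can_execute: Callable[[], bool]  # e.g. rate limit not hit and work available
    execute: Callable[[], None]


def pick_lane(lanes: list[Lane]) -> Optional[Lane]:
    """Return the highest-priority lane that can run right now, if any."""
    for lane in sorted(lanes, key=lambda lane: lane.priority):
        if lane.can_execute():
            return lane
    return None


# Stub lanes: pretend the Connect lane has hit its daily limit.
lanes = [
    Lane("search", 5, lambda: True, lambda: None),
    Lane("connect", 1, lambda: False, lambda: None),
    Lane("check_pending", 2, lambda: True, lambda: None),
]
chosen = pick_lane(lanes)
print(chosen.name if chosen else "idle")  # check_pending
```

A daemon built this way would call `pick_lane` on every iteration, run the chosen lane's `execute`, and sleep briefly when nothing is runnable.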
The qualification loop in detail:
Profiles discovered during navigation are automatically scraped and embedded (384-dim FastEmbed vectors). The Qualify lane then decides which profile to evaluate next using a balance-driven strategy:
- When negatives outnumber positives → exploit: pick the profile with the highest predicted qualification probability (seek likely positives to fill the pipeline)
- Otherwise → explore: pick the profile with the highest BALD (Bayesian Active Learning by Disagreement) score (seek the most informative label to improve the model)
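The two rules above can be sketched in plain Python. This is an illustration under stated assumptions, not the project's implementation: it assumes each candidate comes with a handful of sampled qualification probabilities drawn from the model's posterior, and the names `binary_entropy`, `bald_score`, and `select_next` are hypothetical. BALD is computed here in its usual form, the entropy of the mean prediction minus the mean entropy of the sampled predictions.

```python
import math


def binary_entropy(p: float) -> float:
    """Entropy (in nats) of a Bernoulli(p) outcome."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log(p) + (1 - p) * math.log(1 - p))


def bald_score(prob_samples: list[float]) -> float:
    """Entropy of the mean prediction minus the mean entropy of the samples."""
    mean_p = sum(prob_samples) / len(prob_samples)
    mean_h = sum(binary_entropy(p) for p in prob_samples) / len(prob_samples)
    return binary_entropy(mean_p) - mean_h


def select_next(candidates: dict[str, list[float]], n_pos: int, n_neg: int) -> str:
    """Pick a profile id; candidates maps id -> sampled P(qualified)."""
    if n_neg > n_pos:
        # Exploit: the profile most likely to qualify.
        return max(candidates, key=lambda k: sum(candidates[k]) / len(candidates[k]))
    # Explore: the profile whose label would be most informative.
    return max(candidates, key=lambda k: bald_score(candidates[k]))
```

With `{"a": [0.9, 0.9, 0.9], "b": [0.1, 0.9, 0.5]}`, the exploit branch picks `a` (highest mean), while the explore branch picks `b`, because the samples for `b` disagree with each other and so its BALD score is higher.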
For each selected profile, the Gaussian Process model checks if it's confident enough to auto-decide (low entropy + low posterior uncertainty). If confident, it qualifies or disqualifies automatically. If uncertain, it falls back to an LLM call. Every decision, whether made by the LLM or automatically, feeds back into the model, making it progressively smarter.
Cold start: With fewer than 2 labelled profiles, the model can't be fit, so all decisions go through the LLM. As labels accumulate, the GP auto-decides more profiles, reducing LLM calls over time.
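The confidence check and the cold-start rule can be combined into one small gate. The sketch below is hypothetical: the 2-label minimum comes from the text, but the entropy and posterior standard deviation thresholds are made-up illustrative values, and `auto_decide` is not a function from the codebase.

```python
import math
from typing import Optional

MIN_LABELS = 2           # from the text: below this the model cannot be fit
MAX_ENTROPY = 0.3        # hypothetical threshold, in nats
MAX_POSTERIOR_STD = 0.1  # hypothetical threshold


def auto_decide(p_mean: float, p_std: float, n_labels: int) -> Optional[bool]:
    """Return True/False for an automatic decision, or None to defer to the LLM."""
    if n_labels < MIN_LABELS:
        return None  # cold start: no fitted model yet
    p = min(max(p_mean, 1e-12), 1 - 1e-12)  # keep log() finite
    entropy = -(p * math.log(p) + (1 - p) * math.log(1 - p))
    if entropy <= MAX_ENTROPY and p_std <= MAX_POSTERIOR_STD:
        return p_mean >= 0.5
    return None  # not confident enough
```

A caller would treat `None` as "ask the LLM", then add the resulting label to the training set so that later calls are more likely to return a decision.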
Cost curve: The system gets cheaper to run the longer it operates. Early on, every profile requires an LLM call (~100% LLM usage). As the Gaussian Process learns your preferences, it auto-decides with high confidence on an increasing share of profiles, and the LLM is only queried for genuinely uncertain cases. A mature model can auto-decide the majority of profiles, cutting LLM costs dramatically.
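The arithmetic behind the cost curve is simple: the number of LLM calls scales with the share of profiles the model cannot auto-decide. A minimal sketch, with a hypothetical helper name and purely illustrative rates (not measured results):

```python
def llm_calls(n_profiles: int, auto_decide_rate: float) -> int:
    """Expected LLM calls when a given fraction of profiles is auto-decided."""
    if not 0.0 <= auto_decide_rate <= 1.0:
        raise ValueError("auto_decide_rate must be between 0 and 1")
    return round(n_profiles * (1.0 - auto_decide_rate))


print(llm_calls(1000, 0.0))  # 1000: cold start, every profile goes to the LLM
print(llm_calls(1000, 0.7))  # 300: illustrative rate for a more mature model
```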
Configure rate limits and behavior via Django Admin (LinkedInProfile + Campaign models).
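To make the daily/weekly limits concrete, here is a minimal sliding-window sketch. It is an assumption about how such a limiter could work, not a copy of `rate_limiter.py`: the class name, method names, and the choice of rolling 24-hour and 7-day windows (rather than calendar-day resets) are all hypothetical.

```python
from collections import deque
from datetime import datetime, timedelta
from typing import Optional


class ActionRateLimiter:
    """Rolling-window limiter with a daily cap and an optional weekly cap."""

    def __init__(self, daily_limit: int, weekly_limit: Optional[int] = None):
        self.daily_limit = daily_limit
        self.weekly_limit = weekly_limit
        self._timestamps: deque = deque()  # action times, oldest first

    def can_execute(self, now: datetime) -> bool:
        # Forget actions older than a week; they no longer count anywhere.
        while self._timestamps and now - self._timestamps[0] >= timedelta(days=7):
            self._timestamps.popleft()
        last_day = sum(1 for t in self._timestamps if now - t < timedelta(days=1))
        if last_day >= self.daily_limit:
            return False
        if self.weekly_limit is not None and len(self._timestamps) >= self.weekly_limit:
            return False
        return True

    def record(self, now: datetime) -> None:
        """Register that one action was performed at time `now`."""
        self._timestamps.append(now)
```

One limiter instance per action type (for example, one for connection requests and one for follow-up messages) would match the "per action type" wording in the feature table.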

    ├── analytics/                       # dbt project (DuckDB analytics, ML training sets)
    │   ├── models/staging/              # Staging views (stg_leads, stg_deals, stg_stages)
    │   └── models/marts/                # ML training set (ml_connection_accepted)
    ├── assets/
    │   ├── data/                        # crm.db (SQLite), analytics.duckdb (embeddings + analytics)
    │   └── models/                      # Persisted ML model (model.joblib)
    ├── docs/
    │   ├── architecture.md              # System architecture
    │   ├── configuration.md             # Configuration reference
    │   ├── docker.md                    # Docker setup guide
    │   ├── templating.md                # Message template guide
    │   └── testing.md                   # Testing strategy
    ├── linkedin/
    │   ├── actions/                     # Browser actions (connect, message, scrape)
    │   ├── api/                         # Voyager API client + parser
    │   ├── conf.py                      # Configuration loading (.env + defaults)
    │   ├── daemon.py                    # Main daemon loop (priority-scheduled lanes)
    │   ├── db/crm_profiles.py           # CRM-backed profile CRUD (Lead, Contact, Company, Deal)
    │   ├── django_settings.py           # Django/CRM settings (SQLite at assets/data/crm.db)
    │   ├── lanes/                       # Action lanes (qualify, connect, check_pending, follow_up, search)
    │   ├── management/setup_crm.py      # Idempotent CRM bootstrap (Dept, Stages, Users)
    │   ├── ml/                          # Bayesian qualifier, DuckDB embeddings, profile text, search keywords
    │   ├── navigation/                  # Login, throttling, browser utilities, enums
    │   ├── onboarding.py                # Interactive onboarding (campaign, credentials, LLM config)
    │   ├── gdpr.py                      # GDPR location detection for newsletter
    │   ├── rate_limiter.py              # Daily/weekly rate limiting
    │   ├── sessions/                    # Session management (AccountSession)
    │   └── templates/                   # Message rendering (Jinja2 / AI-prompt)
    ├── manage.py                        # Entry point (no args = daemon, or Django commands)
    ├── local.yml                        # Docker Compose
    └── Makefile                         # Shortcuts (setup, run, admin, analytics, test)

Join for support and discussions: Telegram Group
Got a specific use case, feature request, or questions about setup?
Book a free 15-minute call. I'd love to hear your needs and improve the tool based on real feedback.
This project is built in spare time to provide powerful, free open-source growth tools. Your sponsorship funds faster updates and keeps it free for everyone.
| Tier | Monthly | Benefits |
|---|---|---|
| Supporter | $5 | Huge thanks + name in README supporters list |
| Booster | $25 | All above + priority feature requests + early access to new campaigns |
| Hero | $100 | All above + personal 1-on-1 support + influence roadmap |
| Legend | $500+ | All above + custom feature development + shoutout in releases |
GNU GPLv3: see LICENCE.md
Not affiliated with LinkedIn.
By using this software you accept the Legal Notice. It covers LinkedIn ToS risks, built-in self-promotional actions, automatic newsletter subscription for non-GDPR accounts, and liability disclaimers.
Use at your own risk; no liability assumed.
Made with ❤️

