Automatically collect, analyze, and track 8,000+ Kickstarter creators with a modern web interface.
🚀 Live Demo
- One row per creator with all their projects
- 22 columns including: name, projects, location, categories, social media URLs
- Social Media: Instagram, Facebook, Twitter, YouTube, TikTok, LinkedIn, Patreon, Discord, Twitch, Bluesky
- ~8,000 creators with complete contact information
- Formatted with auto-sized columns, frozen headers, and wrapping enabled
Sync to PostgreSQL (Supabase) with 3 smart tables:
| Table | Purpose |
|---|---|
| `creators` | Creator profiles, avatars, websites, social media |
| `projects` | Project details, funding data, categories, deadlines |
| `creator_outreach` | Track outreach status, notes, tags, and follow-ups |
Features:
- ✅ Smart updates (only writes data that actually changed; see the sketch after this list)
- ✅ Auto-detects creators without contact info
- ✅ Tracks social media presence
- ✅ Manual outreach tracking (contacted, accepted, declined, etc.)
- ✅ Preserves your manual notes and tags
- ✅ Detects when creators add contact info in new scrapes
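A minimal sketch of the sync call from Python, assuming the supabase-py client and the `upsert_creators_bulk` RPC listed in the database section below; the `payload` argument name is an assumption:

```python
# Hedged sketch: push creators through the bulk-upsert RPC so unchanged
# rows can be skipped server-side. The "payload" key name is assumed.
import os

from supabase import create_client

supabase = create_client(os.environ["SUPABASE_URL"], os.environ["SUPABASE_KEY"])

creators = [
    {"slug": "example-creator", "name": "Example Creator", "websites": []},
]

supabase.rpc("upsert_creators_bulk", {"payload": creators}).execute()
```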
- 🔍 Advanced search & filtering by name, category, country, funding
- 📊 Creator & project dashboards
- 💼 Outreach management system
- 📱 Responsive design with dark/light mode
- 📤 Export filtered results to Excel
- ♾️ Infinite scroll with smart pagination
- Projects: Search by name, category, country, state, funding goals
- Creators: Filter by backed projects count, social media presence
- Multi-filter support: Combine multiple criteria for precise results (see the RPC sketch below)
- Real-time search: Instant results as you type
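These filters resolve to the `search_projects` RPC described in the database section. A hedged Python example; the parameter names (`search_term`, `category`, `country`) are assumptions, not the function's documented signature:

```python
# Hypothetical multi-filter search via the search_projects RPC.
import os

from supabase import create_client

supabase = create_client(os.environ["SUPABASE_URL"], os.environ["SUPABASE_KEY"])

projects = supabase.rpc("search_projects", {
    "search_term": "board game",  # free-text name search (assumed param)
    "category": "Games",          # category filter (assumed param)
    "country": "US",              # country filter (assumed param)
}).execute().data
print(f"{len(projects)} matching projects")
```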
- 8,000+ Projects: All upcoming Kickstarter campaigns with complete details
- Creator Profiles: Full creator info including bio, location, and statistics
- Social Media: 10+ platforms (Instagram, Twitter, LinkedIn, TikTok, YouTube, etc.)
- Project Metrics: Funding goals, backers count, deadlines, categories
- Automatic Updates: Data synced regularly from Kickstarter
- Track communication status with creators (example update below)
- Custom status labels: Not Contacted, Email Sent, Follow-ups, Partnership, etc.
- Notes and tags for each creator
- Timeline tracking for follow-ups
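Outreach rows live in the `creator_outreach` table (see the schema section). A hypothetical status update from Python; the column names (`status`, `notes`, `tags`) and the `creator_slug` key are assumptions based on this README:

```python
# Hypothetical outreach update; column names and the creator_slug key
# are assumed from the creator_outreach description below.
import os

from supabase import create_client

supabase = create_client(os.environ["SUPABASE_URL"], os.environ["SUPABASE_KEY"])

supabase.table("creator_outreach").update({
    "status": "Email Sent",
    "notes": "Pitched a partnership on the creator's latest campaign",
    "tags": ["tabletop", "priority"],
}).eq("creator_slug", "example-creator").execute()
```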
- Total projects tracked
- Creator statistics
- Funding goals aggregation
- Category breakdowns
- Social media presence insights
- Framework: Next.js 16 (App Router)
- Language: TypeScript
- Styling: Tailwind CSS + shadcn/ui
- State Management: React Hooks
- Data Fetching: Server Components + Client-side RPC
- Icons: Lucide React
- Excel Export: SheetJS (xlsx)
- Database: PostgreSQL (Supabase)
- API: Supabase RPC Functions
- Authentication: Supabase Auth (ready for implementation)
- Real-time: Supabase Realtime subscriptions
- Web Scraping: Cloudscraper (bypasses Cloudflare)
- HTTP: Requests with retry logic (fetch sketch below)
- Data Processing: Pandas for exports
- Database Sync: Supabase Python client
- Proxy Support: Rotating proxies for large-scale scraping
- Contact Extraction: Firecrawl Map + Scrape with Supabase-managed API keys, domain blocking, and rotation when credits are exhausted
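To illustrate how the fetch layer above plausibly fits together (cloudscraper session, retry loop, optional proxy), here is a minimal sketch; the helper name and defaults are ours, not the project's actual code:

```python
# Illustrative fetch helper: cloudscraper to get past Cloudflare, a plain
# retry loop, and an optional proxies dict. Defaults mirror the env table.
import time

import cloudscraper

def fetch_json(url, retries=10, wait=5.0, timeout=20, proxies=None):
    scraper = cloudscraper.create_scraper()
    for attempt in range(1, retries + 1):
        try:
            resp = scraper.get(url, timeout=timeout, proxies=proxies)
            resp.raise_for_status()
            return resp.json()
        except Exception:
            if attempt == retries:
                raise
            time.sleep(wait)
```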
The scraper + contact extraction runs automatically every hour (`0 * * * *`). No setup needed!
- Fork this repository
- Enable Actions in your fork (Actions tab → Enable workflows)
- Download results from the Actions tab after each run
- Files are kept for 24 hours
- (Optional) Configure settings in GitHub Secrets
Manual Trigger:
- Go to the Actions tab → Scheduled Kickstarter Scraper → Run workflow → wait ~10 minutes
Change Schedule:
- Edit `.github/workflows/scheduled-scraper.yml` (line 5: `- cron: '0 * * * *'`)
- Use crontab.guru to generate cron expressions
# Clone and install
git clone https://github.com/TechBeme/kickstarter-scraper.git
cd kickstarter-scraper
pip install -r requirements.txt
# Run scraper (Excel only, no database)
python run.py --skip-supabase

Time: ~6-10 minutes for 8,000+ projects
# 1. Install dependencies
pip install -r requirements.txt
# 2. Configure environment variables
cp .env.example .env
# Edit .env with your Supabase credentials
# 3. Create database (run in Supabase SQL Editor):
# Tables:
# - supabase/tables/creators.sql
# - supabase/tables/projects.sql
# - supabase/tables/creator_outreach.sql
# - supabase/tables/firecrawl_accounts.sql # Firecrawl API keys + status
# - supabase/tables/firecrawl_blocked_domains.sql # Domains that failed in Firecrawl
# - supabase/tables/pipeline_state.sql # Tracks last contact run
# Functions:
# - supabase/functions/bulk_upsert.sql
# - supabase/functions/search_projects.sql
# - supabase/functions/search_creators.sql
# - supabase/functions/get_home_stats.sql
# - supabase/functions/get_projects_metadata.sql
# - supabase/functions/firecrawl_block_domain.sql
# 4. Run scraper (syncs to Supabase by default)
python run.py

- Node.js 18+ and npm
- Python 3.9+
- PostgreSQL database (Supabase recommended)
# 1. Clone the repository
git clone https://github.com/TechBeme/kickstarter-scraper.git
cd kickstarter-scraper
# 2. Set up Python environment
python -m venv .venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
pip install -r requirements.txt
# 3. Set up the database
# - Create a Supabase project at https://supabase.com
# - Run SQL migrations in supabase/ folder (in order):
# - supabase/tables/creators.sql
# - supabase/tables/projects.sql
# - supabase/tables/creator_outreach.sql
# - supabase/tables/firecrawl_accounts.sql
# - supabase/tables/firecrawl_blocked_domains.sql
# - supabase/tables/pipeline_state.sql
# - supabase/functions/bulk_upsert.sql
# - supabase/functions/search_projects.sql
# - supabase/functions/search_creators.sql
# - supabase/functions/get_home_stats.sql
# - supabase/functions/get_projects_metadata.sql
# - supabase/functions/firecrawl_block_domain.sql
# 4. Configure environment variables
# Create .env in root:
cat > .env << EOF
SUPABASE_URL=https://your-project-id.supabase.co
SUPABASE_KEY=your-service-role-key
EOF
# Create website/.env.local:
cat > website/.env.local << EOF
SUPABASE_URL=https://your-project-id.supabase.co
SUPABASE_SERVICE_ROLE_KEY=your-service-role-key
EOF
# 5. Install frontend dependencies
cd website
npm install
# 6. Run the scraper (optional - syncs to Supabase by default)
# From project root:
python run.py
# 7. Start the development server
cd website
npm run dev

Visit http://localhost:3000 or kickstarter.techbe.me for the live demo.
# Basic run (collects, enriches, exports, and syncs to Supabase)
python run.py
# Excel only (no database sync)
python run.py --skip-supabase
# 📅 Filter projects from last 90 days
python run.py --days-filter 90
# 📅 Filter projects from last 7 days (quick scan)
python run.py --days-filter 7 --skip-fetch
# Filter and sync only (skip collection & enrichment)
python run.py --skip-fetch --skip-enrich --days-filter 90
python run.py --skip-contacts # Skip Firecrawl step if you just want Supabase sync
# With proxy (env or CLI)
PROXY_URL=http://user:pass@host:port python run.py
# or:
python run.py --proxy-url http://user:pass@host:port
# Test with limited data
python run.py --enrich-limit 200
# Re-create Excel from existing data
python run.py --skip-fetch
# Show detailed progress
python run.py --debug
# Skip enrichment phase
python run.py --skip-enrich
# Skip both export and database sync
python run.py --skip-export --skip-supabase
# Contacts only (parallel Firecrawl map/scrape using Supabase queue)
PYTHONPATH=src python -m firecrawl_tools.contact_runner --limit-contacts 200 --contacts-workers 50 --contacts-batch-size 10
# Dry-run contacts (no writes): add --dry-run-contacts
# View all options
python run.py --help

Filter projects by creation date (in Kickstarter):
# Extract only projects from the last 90 days
python run.py --days-filter 90
# Extract only from the last 30 days
python run.py --days-filter 30
# Quick test: last 7 days, limited enrichment
python run.py --days-filter 7 --enrich-limit 100
# Use with existing data (faster) - skip fetch/enrich, only sync
python run.py --skip-fetch --skip-enrich --days-filter 90

How it works:
- Filters based on the project's `created_at` timestamp from the Kickstarter API (sketched below)
- Applied AFTER collection (use `--max-pages` to limit collection)
- Shows the before/after count: `📅 Applied 90-day filter: 8500 → 2340 projects`
- Useful for focused outreach campaigns or testing
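A minimal sketch of that step, assuming `created_at` is a Unix timestamp (the function name and log line here are illustrative):

```python
# Hedged sketch of --days-filter: keep projects whose created_at
# (Unix timestamp) falls inside the window, then report the counts.
import time

def apply_days_filter(projects, days):
    cutoff = time.time() - days * 86400
    kept = [p for p in projects if p.get("created_at", 0) >= cutoff]
    print(f"📅 Applied {days}-day filter: {len(projects)} → {len(kept)} projects")
    return kept
```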
- Runs automatically after the Supabase sync unless you pass `--skip-contacts`.
- Selects creators missing an email/form or with a new primary site (tracked via `site_hash` and `pipeline_state.last_contact_check_at`).
- Uses Supabase tables for state: `firecrawl_accounts` (API key rotation), `firecrawl_blocked_domains` (shared blocklist), and `creator_outreach` (via the `sync_creator_outreach_bulk` RPC).
- Maps each site (`search="contact"`, limit 5) and scrapes candidate pages for visible emails/forms, stopping early once both are found.
- Parallel by default (100 workers / batch size 20); tune via `PYTHONPATH=src python -m firecrawl_tools.contact_runner --contacts-workers ... --contacts-batch-size ...` (see the sketch below).
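A hedged skeleton of the runner: an active key pulled from `firecrawl_accounts`, then creators processed in parallel. The table name comes from this README; the `api_key`/`status` columns, helper names, and per-creator logic are placeholders:

```python
# Illustrative contact-runner shape; the api_key/status columns and the
# mapping/scraping details are assumptions, not the actual module.
import os
from concurrent.futures import ThreadPoolExecutor

from supabase import create_client

supabase = create_client(os.environ["SUPABASE_URL"], os.environ["SUPABASE_KEY"])

def get_active_firecrawl_key():
    rows = (supabase.table("firecrawl_accounts")
            .select("api_key")
            .eq("status", "active")
            .limit(1)
            .execute().data)
    return rows[0]["api_key"] if rows else None

def process_creator(creator, api_key):
    # Placeholder: map the site with search="contact" (limit 5), then
    # scrape candidates until both an email and a form are found.
    ...

def run_batch(creators, workers=100):
    key = get_active_firecrawl_key()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(lambda c: process_creator(c, key), creators))
```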
| Variable | Default | Description |
|---|---|---|
| `SUPABASE_URL` | (required) | Supabase project URL |
| `SUPABASE_KEY` | (required) | Supabase service role key (keep it secret) |
| `FETCH_TIMEOUT` | `20` | Timeout for fetch requests (seconds) |
| `ENRICH_TIMEOUT` | `30` | Timeout for enrichment requests (seconds) |
| `FETCH_MAX_RETRIES` | `10` | Retry attempts for fetch requests |
| `ENRICH_MAX_RETRIES` | `5` | Retry attempts for enrichment requests |
| `FETCH_RETRY_WAIT` | `5.0` | Wait between fetch retries (seconds) |
| `ENRICH_RETRY_WAIT` | `60.0` | Wait between enrichment retries (seconds) |
| `ENRICH_DELAY` | `0.0` | Delay between enrichment requests (seconds) |
| `USE_PROXY` | `false` | Enable proxy usage |
| `PROXY_URL` | (empty) | Proxy URL (e.g., `http://user:pass@host:port`) |
For GitHub Actions: set these as repository secrets in Settings → Secrets and variables → Actions
Default values work well for most cases! Only change if needed.
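For reference, settings like these are typically read with plain `os.getenv` fallbacks; an illustrative loader (not the project's actual config module):

```python
# Illustrative settings loader mirroring the defaults in the table above.
import os

FETCH_TIMEOUT = float(os.getenv("FETCH_TIMEOUT", "20"))
ENRICH_TIMEOUT = float(os.getenv("ENRICH_TIMEOUT", "30"))
FETCH_MAX_RETRIES = int(os.getenv("FETCH_MAX_RETRIES", "10"))
USE_PROXY = os.getenv("USE_PROXY", "false").lower() == "true"
PROXY_URL = os.getenv("PROXY_URL", "")

# requests/cloudscraper-style proxies dict, only when a proxy is configured
proxies = {"http": PROXY_URL, "https": PROXY_URL} if USE_PROXY and PROXY_URL else None
```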
creators
- Creator profiles with avatar, bio, location
- Social media websites array (JSONB)
- Backing statistics
projects
- Project details, funding goals, deadlines
- Category and location information
- Creator relationships
- Photo and media URLs
creator_outreach
- Outreach status tracking
- Social media presence flags
- Contact history and notes
- Tags for organization
firecrawl_accounts
- Firecrawl API keys and status (`active`/`exhausted`), shared across workers
firecrawl_blocked_domains
- Shared blocklist for unsupported or failed domains during contact extraction
pipeline_state
- Tracks last contact extraction run to avoid reprocessing unchanged creators
- `search_projects()` - Advanced project search with filters
- `search_creators()` - Creator search with social media filters
- `get_home_stats()` - Dashboard statistics
- `get_projects_metadata()` - Filter dropdown options
- `upsert_creators_bulk()` - Bulk insert/update creators
- `upsert_projects_bulk()` - Bulk insert/update projects
- `sync_creator_outreach_bulk()` - Bulk sync outreach data
- `firecrawl_block_domain()` - Upsert blocked domains from Firecrawl errors
See `docs/supabase-functions.md` for detailed setup instructions.
The spreadsheet includes 22 columns:
Creator Information:
- Name, Profile URL, Slug
- Project count
Projects:
- Project names (one per line)
- Project URLs (one per line)
Location:
- Country
- City/State
Categories:
- Category
- Parent category
Social Media & Websites:
- Instagram, Facebook, Twitter, YouTube, TikTok
- LinkedIn, Patreon, Discord, Twitch, Bluesky
- Other websites
Features (sketched below):
- Text wrapping enabled
- Auto-adjusted column widths
- Frozen header row
- Alphabetically sorted
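A hedged sketch of producing that formatting with pandas + openpyxl (the project's actual export code may differ):

```python
# Illustrative Excel export: alphabetical sort, frozen header row,
# auto-sized columns, and text wrapping, as in the feature list above.
import pandas as pd
from openpyxl.styles import Alignment

df = pd.DataFrame([{"Name": "Example Creator", "Country": "US"}])
df = df.sort_values("Name")

with pd.ExcelWriter("creators.xlsx", engine="openpyxl") as writer:
    df.to_excel(writer, index=False, sheet_name="Creators")
    ws = writer.sheets["Creators"]
    ws.freeze_panes = "A2"  # keep the header visible while scrolling
    for col in ws.columns:
        width = max(len(str(cell.value or "")) for cell in col) + 2
        ws.column_dimensions[col[0].column_letter].width = min(width, 60)
        for cell in col:
            cell.alignment = Alignment(wrap_text=True)
```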
- All sensitive credentials stored in environment variables
- `.env` files ignored by Git
- No hardcoded API keys or passwords
- Supabase Row Level Security ready
- HTTPS enforced in production
- CORS properly configured
- Server-side rendering with Next.js
- PostgreSQL indexes on all search fields
- Efficient RPC functions with computed columns
- Infinite scroll pagination (50 items per page)
- Image optimization with Next.js Image
- SQL query optimization with proper joins
Collection Phase:
- 3-5 minutes for 8,000+ projects
Enrichment Phase:
- 2-3 minutes for metadata extraction
Excel Export:
- 10-20 seconds
Database Sync:
- 30-60 seconds (if enabled)
Total: 6-10 minutes
This project is licensed under the MIT License - see the LICENSE file for details.
This project is independent and is not officially affiliated with or endorsed by Kickstarter. It's a third-party tool designed to help businesses and marketers discover partnership opportunities. All Kickstarter trademarks and data belong to their respective owners.
Important Notes:
- Purpose: Educational and research use only
- Respect: Kickstarter's Terms of Service
- Rate limiting: Don't run too frequently
- Privacy: Never commit the `.env` file or expose API keys
Looking for a tailored automation solution for your business?
I specialize in building high-quality, production-ready automation systems.
- 🌐 Web Scraping - Extract data from any website
- ⚡ Automation - Automate repetitive tasks and workflows
- 💻 Website Development - Build modern web applications
- 🤖 AI Integrations - Connect AI models to your applications
- Fiverr: Tech_Be
- Upwork: Profile
- GitHub: TechBeme
- Email: contact@techbe.me
Rafael Vieira - Full-Stack Developer & AI Automation Specialist
- 🌐 Web Scraping & Data Collection
- ⚡ Automation & Process Optimization
- 💻 Full-Stack Web Development
- 🤖 AI & Machine Learning Integration
- 🗄️ Database Design & Optimization
🇺🇸 English • 🇪🇸 Español • 🇧🇷 Português
- Website: techbe.me
- Email: contact@techbe.me
- GitHub: @TechBeme
- Fiverr: Tech_Be
- Upwork: Profile
Looking for a custom automation solution? I build production-ready systems tailored to your needs:
- 🔧 Web scraping & data extraction
- 🔄 Automated workflows & integrations
- 🌐 Modern web applications
- 🧠 AI-powered solutions
Get a free quote today! contact@techbe.me