RFC: Bot Detection & CAPTCHA Integration #504

@lakhansamani

Phase: 1 — Security Hardening & Enterprise Foundation
Priority: P1 — High
Estimated Effort: Low
Depends on: Rate Limiting (#501)


Problem Statement

Authorizer has no bot protection on its authentication endpoints, so automated scripts can hit the signup and login endpoints without restriction. Comparable products such as WorkOS Radar and Clerk ship bot protection, and CAPTCHA is a baseline requirement for any production auth system exposed to the internet.


Current Architecture Context

  • Auth endpoints: GraphQL mutations (signup, login, verify_otp, magic_link_login, forgot_password) via /graphql
  • REST endpoints: /oauth/token, /authorize
  • Frontend apps: web/app/ (login UI built with React + Vite), web/dashboard/
  • Config via Cobra CLI flags in cmd/root.go
  • GraphQL input types defined in internal/graph/schema.graphqls
  • No CAPTCHA library in current go.mod

Proposed Solution

1. Pluggable CAPTCHA Provider

New package: internal/captcha/

// Provider interface for pluggable CAPTCHA backends
type Provider interface {
    // VerifyToken validates a CAPTCHA token server-side
    VerifyToken(ctx context.Context, token string, remoteIP string) (bool, error)
    // GetProviderName returns the provider name for logging
    GetProviderName() string
}

type Dependencies struct {
    Log    *zerolog.Logger
    Config *config.Config
}

Supported providers:

  1. Cloudflare Turnstile (recommended default — free, privacy-friendly)

    • Verification endpoint: POST https://challenges.cloudflare.com/turnstile/v0/siteverify
    • Body: secret={secret_key}&response={token}&remoteip={ip}
    • Response: {"success": true/false, "error-codes": [...]}
  2. Google reCAPTCHA v3 (score-based, no user friction)

    • Verification endpoint: POST https://www.google.com/recaptcha/api/siteverify
    • Body: secret={secret_key}&response={token}&remoteip={ip}
    • Response: {"success": true, "score": 0.9, "action": "login"}
    • Configurable score threshold (default: 0.5)
  3. hCaptcha (privacy-focused alternative)

    • Verification endpoint: POST https://api.hcaptcha.com/siteverify
    • Same request/response pattern
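
Since all three providers share the same request shape, the verification call could live in one helper. The following is a minimal sketch under stated assumptions, not the planned implementation: siteVerifier, its fields, and the fake endpoint are hypothetical names, and treating a minScore of 0 as "no score check" is an assumption of this example.

```go
package main

import (
	"context"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"net/url"
	"strings"
	"time"
)

// siteVerifyResponse covers the response fields listed above.
// Score and Action are only populated by reCAPTCHA v3.
type siteVerifyResponse struct {
	Success    bool     `json:"success"`
	Score      float64  `json:"score"`
	Action     string   `json:"action"`
	ErrorCodes []string `json:"error-codes"`
}

// siteVerifier is a hypothetical shared helper, not an existing Authorizer type.
type siteVerifier struct {
	endpoint string       // provider siteverify URL
	secret   string       // value of --captcha-secret-key
	minScore float64      // 0 skips the score check (assumption for Turnstile/hCaptcha)
	client   *http.Client // should carry a timeout
}

// VerifyToken posts the form-encoded body and interprets the JSON reply.
func (v *siteVerifier) VerifyToken(ctx context.Context, token, remoteIP string) (bool, error) {
	form := url.Values{"secret": {v.secret}, "response": {token}}
	if remoteIP != "" {
		form.Set("remoteip", remoteIP)
	}
	req, err := http.NewRequestWithContext(ctx, http.MethodPost, v.endpoint, strings.NewReader(form.Encode()))
	if err != nil {
		return false, err
	}
	req.Header.Set("Content-Type", "application/x-www-form-urlencoded")
	resp, err := v.client.Do(req)
	if err != nil {
		return false, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return false, fmt.Errorf("siteverify returned HTTP %d", resp.StatusCode)
	}
	var out siteVerifyResponse
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		return false, err
	}
	if !out.Success {
		return false, nil
	}
	return v.minScore == 0 || out.Score >= v.minScore, nil
}

// newFakeSiteVerify stands in for the real endpoint so the sketch runs offline.
// It accepts one token and always reports a score of 0.9.
func newFakeSiteVerify(validToken string) *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		ok := r.PostFormValue("response") == validToken
		json.NewEncoder(w).Encode(map[string]interface{}{"success": ok, "score": 0.9})
	}))
}

// demoVerify runs one verification against the fake endpoint.
func demoVerify(token string, minScore float64) (bool, error) {
	srv := newFakeSiteVerify("good-token")
	defer srv.Close()
	v := &siteVerifier{
		endpoint: srv.URL,
		secret:   "test-secret",
		minScore: minScore,
		client:   &http.Client{Timeout: 5 * time.Second},
	}
	return v.VerifyToken(context.Background(), token, "203.0.113.7")
}
```

Keeping the endpoint as a field is what lets the unit tests in the testing plan point the provider at a mock server. A reCAPTCHA v3 provider would probably also want to compare the returned action with the expected one; that check is left out here.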

Factory pattern (consistent with codebase DI pattern):

func New(cfg *config.Config, deps *Dependencies) (Provider, error) {
    switch cfg.CaptchaProvider {
    case "turnstile":
        return newTurnstileProvider(cfg, deps)
    case "recaptcha":
        return newRecaptchaProvider(cfg, deps)
    case "hcaptcha":
        return newHcaptchaProvider(cfg, deps)
    case "":
        return nil, nil // CAPTCHA disabled
    default:
        // Reject typos instead of silently disabling CAPTCHA
        return nil, fmt.Errorf("unknown captcha provider %q", cfg.CaptchaProvider)
    }
}
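
Because the factory returns a nil Provider when CAPTCHA is disabled, every caller has to guard against nil before calling VerifyToken. One possible alternative, sketched here purely as an illustration, is a pass-through provider; noopProvider and providerOrNoop are hypothetical names and not part of the proposal.

```go
package main

import "context"

// Provider mirrors the interface proposed above.
type Provider interface {
	VerifyToken(ctx context.Context, token string, remoteIP string) (bool, error)
	GetProviderName() string
}

// noopProvider is a hypothetical "disabled" backend: it accepts every token.
type noopProvider struct{}

func (noopProvider) VerifyToken(ctx context.Context, token string, remoteIP string) (bool, error) {
	return true, nil
}

func (noopProvider) GetProviderName() string { return "disabled" }

// providerOrNoop lets callers skip nil checks when CAPTCHA is disabled.
func providerOrNoop(p Provider) Provider {
	if p == nil {
		return noopProvider{}
	}
	return p
}

// demoDisabled shows a nil provider from the factory becoming a pass-through.
func demoDisabled() (string, bool) {
	p := providerOrNoop(nil)
	ok, _ := p.VerifyToken(context.Background(), "", "")
	return p.GetProviderName(), ok
}
```

The trade-off is that a pass-through hides the disabled state from handlers, so a check on cfg.CaptchaProvider is still needed to decide whether a token is required at all.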

2. Integration with Auth Endpoints

GraphQL schema changes — Add captcha_token to auth input types:

input SignUpInput {
    # ... existing fields
    captcha_token: String
}

input LoginInput {
    # ... existing fields
    captcha_token: String
}

input MagicLinkLoginInput {
    # ... existing fields
    captcha_token: String
}

input ForgotPasswordInput {
    # ... existing fields
    captcha_token: String
}

Verification logic — Add to auth mutation handlers (internal/graphql/):

// At the top of signup/login handlers, before any business logic:
if cfg.CaptchaProvider != "" {
    if params.CaptchaToken == nil || *params.CaptchaToken == "" {
        return nil, fmt.Errorf("captcha_token is required")
    }
    valid, err := captchaProvider.VerifyToken(ctx, *params.CaptchaToken, clientIP)
    if err != nil || !valid {
        return nil, fmt.Errorf("captcha verification failed")
    }
}
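
The same check is needed in several mutations, so it may be worth extracting. The sketch below assumes a hypothetical helper checkCaptcha with two sentinel errors, so that a handler can tell "token missing" apart from "token rejected"; stubProvider exists only to make the example runnable.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// Provider mirrors the interface proposed in section 1.
type Provider interface {
	VerifyToken(ctx context.Context, token string, remoteIP string) (bool, error)
	GetProviderName() string
}

// Sentinel errors (hypothetical names) that handlers can match with errors.Is.
var (
	ErrCaptchaRequired = errors.New("captcha_token is required")
	ErrCaptchaFailed   = errors.New("captcha verification failed")
)

// checkCaptcha is a hypothetical helper shared by the mutation handlers.
// A nil provider means CAPTCHA is disabled and the check is skipped.
func checkCaptcha(ctx context.Context, p Provider, token *string, clientIP string) error {
	if p == nil {
		return nil
	}
	if token == nil || *token == "" {
		return ErrCaptchaRequired
	}
	valid, err := p.VerifyToken(ctx, *token, clientIP)
	if err != nil {
		// Keep the cause for server logs; callers can still match ErrCaptchaFailed.
		return fmt.Errorf("%w: %s provider error: %v", ErrCaptchaFailed, p.GetProviderName(), err)
	}
	if !valid {
		return ErrCaptchaFailed
	}
	return nil
}

// stubProvider accepts exactly one token; it stands in for a real backend.
type stubProvider struct{ accept string }

func (s stubProvider) VerifyToken(_ context.Context, token, _ string) (bool, error) {
	return token == s.accept, nil
}

func (s stubProvider) GetProviderName() string { return "stub" }

// demoCheck returns "ok", "required", or "failed" for a given token.
func demoCheck(token *string) string {
	err := checkCaptcha(context.Background(), stubProvider{accept: "good"}, token, "203.0.113.7")
	switch {
	case err == nil:
		return "ok"
	case errors.Is(err, ErrCaptchaRequired):
		return "required"
	default:
		return "failed"
	}
}
```

A handler would likely log the wrapped error and return only the generic message to the client, so that provider details are not leaked.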

REST endpoint (/oauth/token): Accept captcha_token as form parameter.
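
For the REST path, the check could be applied as middleware around the existing handler. This is a rough sketch using only net/http: requireCaptcha and verifyFunc are hypothetical names, the status codes are assumptions, and the RFC does not say which grant types should require a token, so that decision is left out.

```go
package main

import (
	"context"
	"net"
	"net/http"
	"net/http/httptest"
	"net/url"
	"strings"
)

// verifyFunc abstracts Provider.VerifyToken for this sketch.
type verifyFunc func(ctx context.Context, token, remoteIP string) bool

// requireCaptcha rejects requests that lack a valid captcha_token form field.
func requireCaptcha(verify verifyFunc, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		token := r.PostFormValue("captcha_token")
		if token == "" {
			http.Error(w, "captcha_token is required", http.StatusBadRequest)
			return
		}
		// RemoteAddr is host:port. Behind a proxy the real client IP has to
		// come from trusted forwarding headers instead.
		ip, _, err := net.SplitHostPort(r.RemoteAddr)
		if err != nil {
			ip = r.RemoteAddr
		}
		if !verify(r.Context(), token, ip) {
			http.Error(w, "captcha verification failed", http.StatusForbidden)
			return
		}
		next.ServeHTTP(w, r)
	})
}

// demoTokenEndpoint posts a form to a wrapped handler and returns the HTTP status.
func demoTokenEndpoint(captchaToken string) int {
	verify := func(_ context.Context, token, _ string) bool { return token == "good-token" }
	h := requireCaptcha(verify, http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		w.WriteHeader(http.StatusOK)
	}))
	form := url.Values{"grant_type": {"authorization_code"}}
	if captchaToken != "" {
		form.Set("captcha_token", captchaToken)
	}
	req := httptest.NewRequest(http.MethodPost, "/oauth/token", strings.NewReader(form.Encode()))
	req.Header.Set("Content-Type", "application/x-www-form-urlencoded")
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, req)
	return rec.Code
}
```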

3. Honeypot Fields

In the default login/signup UI (web/app/):

  • Add hidden form fields (<input type="text" name="website" style="display:none" tabindex="-1" autocomplete="off">)
  • If honeypot fields are populated → silent rejection (bot filled invisible fields)
  • Server-side: GraphQL mutations accept optional honeypot field, reject if non-empty
  • Zero impact on legitimate users, catches naive bots
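
The server-side half of the honeypot could look roughly like the sketch below. It reads "silent rejection" as returning the same response a successful request would get while skipping the actual work; that interpretation, the signupInput type, and the response message are assumptions of this example. Only the field name website comes from the markup above.

```go
package main

import "strings"

// signupInput is a hypothetical slice of the signup input; only the honeypot matters here.
type signupInput struct {
	Email   string
	Website *string // honeypot: hidden in the UI, so humans leave it empty
}

// honeypotTripped reports whether the hidden field was filled in.
func honeypotTripped(field *string) bool {
	return field != nil && strings.TrimSpace(*field) != ""
}

// handleSignup returns whether an account was created and the message sent
// to the client. The message is identical in both cases so that a bot
// cannot tell it was filtered.
func handleSignup(in signupInput) (bool, string) {
	const msg = "signup request received"
	if honeypotTripped(in.Website) {
		return false, msg // silent rejection: no account is created
	}
	// ... the real signup logic would run here ...
	return true, msg
}
```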

4. Configurable Challenge Triggers

Three modes via --captcha-challenge-mode:

Mode         Behavior
always       Every login/signup requires CAPTCHA
never        CAPTCHA disabled (default when no provider configured)
risk_based   Require CAPTCHA only after N failed attempts from same IP (integrates with LoginAttempt table from #501)

Risk-based trigger logic:

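if challengeMode == "risk_based" {
    ipFailures, err := store.CountFailedAttemptsByIP(ctx, clientIP, windowStart)
    if err != nil {
        // Do not silently skip the check when the attempt count is unavailable
        return nil, fmt.Errorf("failed to count login attempts: %w", err)
    }
    if ipFailures >= captchaTriggerThreshold { // default: 3
        // Require CAPTCHA for this request
        if params.CaptchaToken == nil || *params.CaptchaToken == "" {
            return nil, fmt.Errorf("captcha_required") // special error code for frontend
        }
        // Token present: verify it with the provider, as in the handler check in section 2
    }
}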

The frontend (web/app/) detects the captcha_required error and dynamically renders the CAPTCHA widget.
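
The three modes reduce to a small pure function, which also makes the risk-based trigger easy to unit test. A minimal sketch, assuming the failure count has already been fetched from the store; captchaRequired and the constant names are hypothetical.

```go
package main

// Challenge modes accepted by --captcha-challenge-mode.
const (
	modeAlways    = "always"
	modeNever     = "never"
	modeRiskBased = "risk_based"
)

// captchaRequired decides whether this request must carry a CAPTCHA token.
// ipFailures is the number of recent failed attempts from the client IP.
func captchaRequired(mode string, ipFailures, threshold int) bool {
	switch mode {
	case modeAlways:
		return true
	case modeRiskBased:
		return ipFailures >= threshold
	case modeNever:
		return false
	default:
		// Unknown modes should be rejected at startup; treated as disabled here.
		return false
	}
}
```

With the default threshold of 3, captchaRequired("risk_based", 2, 3) is false and captchaRequired("risk_based", 3, 3) is true.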


CLI Configuration Flags

--captcha-provider=                         # turnstile | recaptcha | hcaptcha (empty = disabled)
--captcha-site-key=                         # Public site key for frontend widget
--captcha-secret-key=                       # Secret key for server-side verification
--captcha-challenge-mode=risk_based         # always | never | risk_based
--captcha-trigger-threshold=3               # Failed attempts from IP before requiring CAPTCHA (risk_based mode)
--recaptcha-score-threshold=0.5             # Minimum score for reCAPTCHA v3 (0.0-1.0)
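
Several of these flags only make sense in combination, so a startup validation step seems worthwhile. The sketch below is one way it might look; captchaConfig and its field names are hypothetical and do not reflect the actual config.Config struct.

```go
package main

import (
	"errors"
	"fmt"
)

// captchaConfig is a hypothetical struct holding the flag values above.
type captchaConfig struct {
	Provider         string
	SiteKey          string
	SecretKey        string
	ChallengeMode    string
	TriggerThreshold int
	ScoreThreshold   float64
}

// defaultCaptchaConfig mirrors the flag defaults listed above.
func defaultCaptchaConfig() captchaConfig {
	return captchaConfig{ChallengeMode: "risk_based", TriggerThreshold: 3, ScoreThreshold: 0.5}
}

// validate rejects flag combinations that cannot work at runtime.
func (c captchaConfig) validate() error {
	switch c.Provider {
	case "":
		return nil // CAPTCHA disabled; the remaining flags are ignored
	case "turnstile", "recaptcha", "hcaptcha":
	default:
		return fmt.Errorf("unknown --captcha-provider %q", c.Provider)
	}
	if c.SiteKey == "" || c.SecretKey == "" {
		return errors.New("--captcha-site-key and --captcha-secret-key are required when a provider is set")
	}
	switch c.ChallengeMode {
	case "always", "never":
	case "risk_based":
		if c.TriggerThreshold < 1 {
			return errors.New("--captcha-trigger-threshold must be at least 1")
		}
	default:
		return fmt.Errorf("unknown --captcha-challenge-mode %q", c.ChallengeMode)
	}
	if c.Provider == "recaptcha" && (c.ScoreThreshold < 0 || c.ScoreThreshold > 1) {
		return errors.New("--recaptcha-score-threshold must be between 0.0 and 1.0")
	}
	return nil
}
```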

Dependency Injection

Wire into cmd/root.go initialization order (after config, before HTTP handlers):

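// After memory store, before HTTP handlers
captchaProvider, err := captcha.New(cfg, &captcha.Dependencies{Log: log})
if err != nil {
    // Abort startup on a misconfigured provider (exact error handling is an assumption)
    log.Fatal().Err(err).Msg("failed to initialize CAPTCHA provider")
}

// Pass to HTTP handler dependencies
httpProvider := http_handlers.New(&http_handlers.Dependencies{
    // ... existing deps
    CaptchaProvider: captchaProvider,
})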

Frontend Integration

web/app/ (React login/signup UI):

  • Conditionally load CAPTCHA widget script based on meta query response (new fields: captcha_provider, captcha_site_key, captcha_challenge_mode)
  • Render widget before submit button
  • Pass token in GraphQL mutation variables
  • Handle captcha_required error in risk-based mode by dynamically showing widget

meta query enhancement:

type Meta {
    # ... existing fields
    captcha_provider: String     # "turnstile" | "recaptcha" | "hcaptcha" | ""
    captcha_site_key: String     # Public key for frontend widget
    captcha_challenge_mode: String
}
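
On the server side, the meta resolver should expose the site key but never the secret key. The sketch below maps configuration to the three new fields through a dedicated struct, so the secret cannot leak by accident. captchaSettings, metaCaptcha, and buildMetaCaptcha are hypothetical names, and reporting "never" when no provider is configured is an assumption based on the mode table.

```go
package main

import "encoding/json"

// captchaSettings is a hypothetical view of the server configuration.
type captchaSettings struct {
	Provider      string
	SiteKey       string
	SecretKey     string // server-side only
	ChallengeMode string
}

// metaCaptcha holds only the fields the proposed Meta type exposes.
type metaCaptcha struct {
	CaptchaProvider      string `json:"captcha_provider"`
	CaptchaSiteKey       string `json:"captcha_site_key"`
	CaptchaChallengeMode string `json:"captcha_challenge_mode"`
}

// buildMetaCaptcha copies the public values and drops the secret key.
func buildMetaCaptcha(s captchaSettings) metaCaptcha {
	if s.Provider == "" {
		// No provider configured: report CAPTCHA as disabled.
		return metaCaptcha{CaptchaChallengeMode: "never"}
	}
	return metaCaptcha{
		CaptchaProvider:      s.Provider,
		CaptchaSiteKey:       s.SiteKey,
		CaptchaChallengeMode: s.ChallengeMode,
	}
}

// metaJSON renders the public fields as the frontend would receive them.
func metaJSON(s captchaSettings) string {
	b, _ := json.Marshal(buildMetaCaptcha(s))
	return string(b)
}
```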

Testing Plan

  • Unit tests for each CAPTCHA provider (mock HTTP verification endpoints)
  • Integration test: signup with valid/invalid CAPTCHA token
  • Integration test: risk-based mode triggers CAPTCHA after N failures
  • Test honeypot rejection
  • Test CAPTCHA bypass when provider is not configured (graceful no-op)
  • Frontend E2E tests for widget rendering
