RFC: Advanced Bot Protection (Radar-style Risk Engine) #521

@lakhansamani

Phase: 5 — Advanced Security & Enterprise
Priority: P3 — Medium
Estimated Effort: High
Depends on: Audit Logs (#505), Rate Limiting (#501)


Problem Statement

Basic rate limiting and CAPTCHA (#501, #504) stop simple attacks, but sophisticated attackers rotate IPs, drive headless browsers, and mimic legitimate traffic. WorkOS Radar addresses this with device fingerprinting, risk scoring, and behavioral analysis. This RFC proposes a comparable risk engine that combines multiple signals, so that an attacker who evades one check (for example, by rotating IPs) can still be flagged by the others.


Proposed Solution

1. Device Fingerprinting

Client-side: Lightweight JavaScript fingerprinting in web/app/ that collects stable browser/device signals:

// Signals collected (privacy-conscious — no invasive tracking):
const fingerprint = {
    userAgent: navigator.userAgent,
    language: navigator.language,
    languages: navigator.languages,
    platform: navigator.platform,
    screenResolution: `${screen.width}x${screen.height}`,
    colorDepth: screen.colorDepth,
    timezone: Intl.DateTimeFormat().resolvedOptions().timeZone,
    touchSupport: navigator.maxTouchPoints > 0,
    hardwareConcurrency: navigator.hardwareConcurrency,
    // Canvas and WebGL fingerprints — optional, configurable
};

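// Hash client-side, then send the hex hash + raw signals to the server
const digest = await crypto.subtle.digest('SHA-256',
    new TextEncoder().encode(JSON.stringify(fingerprint)));
// digest is an ArrayBuffer; hex-encode it into the 64-character string the server stores
const fingerprintHash = Array.from(new Uint8Array(digest))
    .map((b) => b.toString(16).padStart(2, '0')).join('');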

Server-side schema: internal/storage/schemas/device_fingerprint.go

type DeviceFingerprint struct {
    ID              string `json:"id" gorm:"primaryKey;type:char(36)"`
    UserID          string `json:"user_id" gorm:"type:char(36);index:idx_device_user"`
    FingerprintHash string `json:"fingerprint_hash" gorm:"type:varchar(64);index:idx_device_hash"`
    RawSignals      string `json:"raw_signals" gorm:"type:text"`           // JSON of collected signals
    TrustLevel      string `json:"trust_level" gorm:"type:varchar(20)"`    // trusted | neutral | suspicious
    FirstSeenAt     int64  `json:"first_seen_at" gorm:"autoCreateTime"`
    LastSeenAt      int64  `json:"last_seen_at"`
    LoginCount      int64  `json:"login_count" gorm:"default:0"`
}
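
Because the client sends both the hash and the raw signals, the server does not have to trust the hash it receives. The following is a minimal sketch of recomputing it over the exact bytes of the raw payload; `hashSignals` and `verifyFingerprint` are hypothetical helper names, not part of the existing codebase, and the sketch assumes the client sends the hash as lowercase hex.

```go
package main

import (
	"crypto/sha256"
	"crypto/subtle"
	"encoding/hex"
)

// hashSignals returns the lowercase hex SHA-256 of the raw signals payload.
// It hashes the exact bytes received, so the client must hash the same
// serialized JSON string that it sends.
func hashSignals(rawSignals string) string {
	sum := sha256.Sum256([]byte(rawSignals))
	return hex.EncodeToString(sum[:])
}

// verifyFingerprint (hypothetical helper) reports whether the hash claimed by
// the client matches the hash the server computes over the raw signals.
func verifyFingerprint(rawSignals, claimedHash string) bool {
	expected := hashSignals(rawSignals)
	return subtle.ConstantTimeCompare([]byte(expected), []byte(claimedHash)) == 1
}
```

This only checks that the hash and the signals are consistent with each other. A client that forges both still passes, so a mismatch is better treated as one more risk signal than as proof of a legitimate device.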

2. Risk Scoring Engine

New package: internal/risk/

type RiskSignal struct {
    Name   string
    Score  float64 // 0.0 (safe) to 1.0 (dangerous)
    Weight float64 // importance multiplier
    Reason string
}

type RiskAssessment struct {
    Score    float64      // weighted average: 0.0-1.0
    Level    string       // low | medium | high | critical
    Signals  []RiskSignal
    Decision string       // allow | challenge | block
}

func (e *RiskEngine) Assess(ctx context.Context, req RiskRequest) *RiskAssessment {
    signals := []RiskSignal{}
    
    // 1. IP Reputation (weight: 0.2)
    // - Known VPN/proxy/Tor exit node → +0.3
    // - IP country matches user's usual country → 0.0
    // - IP country is new for this user → +0.4
    // - IP has high failed login rate across accounts → +0.8
    
    // 2. Device Fingerprint (weight: 0.25)
    // - Known/trusted device → 0.0
    // - New device for this user → +0.3
    // - Device seen with many failed logins → +0.7
    // - No fingerprint provided (possible bot) → +0.5
    
    // 3. Login Velocity (weight: 0.2)
    // - Normal frequency → 0.0
    // - Multiple logins in short period → +0.4
    // - Failed attempts within window (#501) → scaled by count
    
    // 4. Time-of-Day Pattern (weight: 0.1)
    // - Login during user's usual hours → 0.0
    // - Login at unusual hour → +0.3
    
    // 5. Geographic Anomaly (weight: 0.15)
    // - "Impossible travel": login from NYC, then Tokyo 30 min later → +0.9
    
    // 6. User Agent Anomaly (weight: 0.1)
    // - Common browser → 0.0
    // - Headless browser / curl / bot signature → +0.6
    // - UA doesn't match device fingerprint signals → +0.5
    
    // Compute weighted score
    totalScore := weightedAverage(signals)
    
    // Map to decision
    decision := "allow"
    if totalScore > challengeThreshold { decision = "challenge" }  // require MFA or CAPTCHA
    if totalScore > blockThreshold { decision = "block" }
    
    return &RiskAssessment{Score: totalScore, Signals: signals, Decision: decision}
}

Default thresholds (configurable):

  • ≤ 0.3 → Allow
  • > 0.3 and ≤ 0.7 → Challenge (require MFA or CAPTCHA)
  • > 0.7 → Block

3. Integration with Auth Flow

In login handlers, after credential verification but before issuing tokens:

risk := riskEngine.Assess(ctx, risk.RiskRequest{
    UserID:            user.ID,
    Email:             user.Email,
    IPAddress:         clientIP,
    UserAgent:         userAgent,
    DeviceFingerprint: fingerprintHash,
    LoginMethod:       "password",
})

switch risk.Decision {
case "block":
    // Reject with generic error (don't reveal risk analysis)
    auditLog("user.login_blocked_risk", risk)
    return errors.New("login_blocked")
    
case "challenge":
    // Step-up authentication
    if user.IsMultiFactorAuthEnabled {
        return requireMFA(user)  // existing MFA flow
    } else if captchaProvider != nil {
        return requireCAPTCHA()  // from #504
    }
    // If no challenge method available, allow with warning
    auditLog("user.login_risk_warning", risk)
    
case "allow":
    // Proceed normally
}

4. Credential Stuffing Detection

Building on the LoginAttempt table from #501:

// Detect credential stuffing: high failed rate from one IP across MANY accounts.
// Storage errors are returned to the caller instead of being silently dropped.
func detectCredentialStuffing(ctx context.Context, ip string, window time.Duration) (bool, error) {
    // Count unique emails with failed logins from this IP
    uniqueEmails, err := store.CountUniqueFailedEmailsByIP(ctx, ip, window)
    if err != nil {
        return false, err
    }
    totalFailed, err := store.CountFailedAttemptsByIP(ctx, ip, window)
    if err != nil {
        return false, err
    }
    
    // If one IP is hitting many different accounts → credential stuffing
    if uniqueEmails > 10 && totalFailed > 20 {
        // Auto-block IP temporarily
        store.AddIPRule(ctx, &schemas.IPRule{
            IP: ip, Type: "block", Reason: "credential_stuffing_detected",
            ExpiresAt: time.Now().Add(1 * time.Hour).Unix(),
        })
        return true, nil
    }
    return false, nil
}

5. New Device Alerts

When risk assessment detects a new device:

if risk.HasSignal("new_device") && cfg.EnableNewDeviceAlerts {
    emailProvider.SendNewDeviceAlert(user.Email, DeviceAlertData{
        DeviceName:  parseUserAgent(userAgent),
        IPAddress:   clientIP,
        Location:    geolocate(clientIP),
        Time:        time.Now(),
        RiskLevel:   risk.Level,
    })
}
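
`HasSignal` is used above but not defined in the RFC. One plausible shape, assuming a signal counts as fired only when it is present with a score above zero (that rule is an assumption of this sketch):

```go
package main

// Trimmed copies of the proposed types, kept minimal for the example.
type RiskSignal struct {
	Name  string
	Score float64
}

type RiskAssessment struct {
	Signals []RiskSignal
}

// HasSignal reports whether a signal with the given name is present with a
// score above zero. A nil assessment has no signals.
func (a *RiskAssessment) HasSignal(name string) bool {
	if a == nil {
		return false
	}
	for _, s := range a.Signals {
		if s.Name == name && s.Score > 0 {
			return true
		}
	}
	return false
}
```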

CLI Configuration Flags

--enable-risk-engine=false                 # Enable risk scoring
--risk-challenge-threshold=0.3             # Score above which to challenge
--risk-block-threshold=0.7                 # Score above which to block
--enable-device-fingerprinting=false       # Enable client-side fingerprinting
--enable-new-device-alerts=true            # Email on new device login
--enable-impossible-travel-detection=false # Geographic anomaly detection

Testing Plan

  • Unit tests for each risk signal calculator
  • Unit tests for weighted score computation
  • Integration test: new device triggers challenge
  • Integration test: impossible travel triggers block
  • Test credential stuffing detection auto-blocks IP
  • Test risk decisions (allow/challenge/block) at threshold boundaries
  • Test with various user agent strings (legitimate vs. bot)
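
For the user agent tests, a simple substring classifier is enough to exercise the legitimate vs. bot cases. This sketch is illustrative only: `looksAutomated` and the marker list are assumptions, and substring matching is easy to evade and can misfire on legitimate agents that happen to contain "bot".

```go
package main

import "strings"

// botMarkers are lowercase substrings that commonly appear in automation
// user agents. The list is illustrative, not exhaustive.
var botMarkers = []string{"headlesschrome", "curl/", "python-requests", "bot", "spider", "crawler"}

// looksAutomated reports whether the user agent is empty or contains one of
// the known automation markers (case-insensitive).
func looksAutomated(userAgent string) bool {
	ua := strings.ToLower(strings.TrimSpace(userAgent))
	if ua == "" {
		return true
	}
	for _, marker := range botMarkers {
		if strings.Contains(ua, marker) {
			return true
		}
	}
	return false
}
```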
