banditcallback: emit per-device first-seen to lantern-cloud#668

Merged
reflog merged 2 commits into main from reflog/phase3-arm-callback
May 22, 2026

Conversation

@reflog
Contributor

@reflog reflog commented May 21, 2026

Summary

Adds a per-device first-seen callback emitter so the bandit gets EXP3 reward signal for http-proxy arms. Legacy clients can't fire the callback themselves (sing-box clients do so natively at URL-test completion), so until now http-proxy arm weights stayed flat at their cold-start prior: explored via gamma but never reinforced by traffic.

What's new

  • banditcallback package: Emitter with TTL-bounded in-memory dedup keyed on device-id, async best-effort GET to the configured URL on first-seen, opportunistic map sweep (no reaper goroutine). Filter wraps the emitter as a proxy filter.
  • CLI flags / INI keys: banditcallbacktoken, banditcallbackurl, banditcallbackttl. Plumbed by the lantern-cloud provisioner. Empty token → emitter Enabled() is false → the filter is not installed, so there is no per-request work.
  • Filter chain placement: after tokenfilter (auth-gated, no firing on unauthenticated noise) but before devicefilter (which skips for pro proxies — we still want signal for pro arms).

API side

lantern-cloud PR https://github.com/getlantern/lantern-cloud/pull/{TBD} adds the matching /v1/bandit/callback arm-token handler and the provisioner plumbing.

Test plan

  • go test ./banditcallback/ — unit tests: disabled-when-unconfigured, first-seen fires, dedup suppresses repeats, concurrent first-seen is single-fire, re-emit after TTL
  • go vet ./banditcallback/ ./common/ ./devicefilter/
  • Goreleaser nightly build (auto on merge)
  • Staging soak via cmd/config-test + SigNoz: confirm arm callback received log lines, EXP3 weight movement for r13:t13 arm

🤖 Generated with Claude Code

The lantern-cloud bandit catalog now selects http-proxy arms for
legacy clients via the unified action space, but those arms had no
EXP3 reward signal because clients on the lantern-http-proxy backend
don't make the callback at URL-test completion the way sing-box
clients do natively. Without a signal, EXP3 weights for http-proxy
arms stayed flat at their cold-start prior — explored via gamma but
never reinforced.

This change has the http-proxy daemon emit the callback on behalf of
those clients. On the first request from a device-id within a TTL
window, the new banditcallback.Emitter fires an async best-effort GET
to the API's /v1/bandit/callback?token=<arm-token>&did=<device_id>
endpoint. The arm-token is plumbed at provisioning via two new INI
keys (banditcallbacktoken / banditcallbackurl); the API discriminates
arm-tokens from per-probe tokens by the `arm-` prefix and writes a
flat-reward update straight to EXP3 + per-route signals.
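
As a rough illustration of the request described above, the sketch below builds the callback URL and fires it as an async, best-effort GET. The function names (callbackURL, emitAsync, isArmToken), the timeout parameter, and the example host are hypothetical and not taken from the banditcallback package; only the token and did query parameters and the `arm-` prefix come from the description.

```go
package main

import (
	"context"
	"net/http"
	"net/url"
	"strings"
	"time"
)

// callbackURL adds the token and did query parameters to the configured
// base URL. url.Values.Encode sorts keys, so "did" precedes "token" in the
// output; parameter order carries no meaning for a query string.
func callbackURL(base, token, deviceID string) (string, error) {
	u, err := url.Parse(base)
	if err != nil {
		return "", err
	}
	q := u.Query()
	q.Set("token", token)
	q.Set("did", deviceID)
	u.RawQuery = q.Encode()
	return u.String(), nil
}

// isArmToken mirrors the API-side rule from the description: arm-tokens are
// told apart from per-probe tokens by the "arm-" prefix.
func isArmToken(token string) bool {
	return strings.HasPrefix(token, "arm-")
}

// emitAsync fires the GET in its own goroutine so the proxied request is
// never blocked. Errors are dropped and nothing is retried (best-effort).
// The timeout is an assumption of this sketch.
func emitAsync(client *http.Client, target string, timeout time.Duration) {
	go func() {
		ctx, cancel := context.WithTimeout(context.Background(), timeout)
		defer cancel()
		req, err := http.NewRequestWithContext(ctx, http.MethodGet, target, nil)
		if err != nil {
			return
		}
		resp, err := client.Do(req)
		if err != nil {
			return
		}
		resp.Body.Close()
	}()
}
```

Building the query with url.Values rather than string concatenation keeps a device-id containing spaces or reserved characters from corrupting the request.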

A device's first connection within the TTL window triggers one emit;
subsequent connections from the same device are suppressed in-memory
until the window rolls over. Map sweep is opportunistic on the same
TTL cadence, so worst-case memory is ~2× unique devices per window
without a dedicated reaper goroutine. The API does its own
SET-NX-based dedup as a defense against a restarted daemon losing its
in-memory dedup map, or multiple replicas not sharing one, and
re-firing within the window.

The filter installs after tokenfilter (auth) but before devicefilter
(throttling) — devicefilter skips when ReportingRedisClient is nil
for pro proxies, but the bandit still wants signal for pro arms.
OnFirstOnly is used because we only need the device-id header once
per connection. The feature is a no-op when Token/URL are empty, so
non-bandit-eligible tracks carrying the same binary stay silent.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Contributor

Copilot AI left a comment

Pull request overview

Adds a new bandit callback mechanism to emit a best-effort, per-device “first seen” signal to lantern-cloud so EXP3 arm weights can be reinforced by real proxy traffic (especially for legacy clients that can’t emit the callback themselves).

Changes:

  • Introduces banditcallback package with an Emitter (TTL-based in-memory dedup) and a proxy Filter that triggers async callbacks.
  • Adds CLI/INI configuration (banditcallbacktoken, banditcallbackurl, banditcallbackttl) and plumbs these into the proxy.
  • Inserts the callback filter into the proxy filter chain (auth-adjacent placement) and adds unit tests for dedup + concurrency behavior.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| http-proxy/main.go | Adds CLI/INI flags and passes bandit-callback settings into the proxy configuration. |
| http_proxy.go | Stores bandit callback config on Proxy, constructs an emitter, and conditionally appends the filter to the chain. |
| banditcallback/banditcallback.go | Implements the TTL-deduped emitter and filter that emits the callback asynchronously. |
| banditcallback/banditcallback_test.go | Adds unit tests validating disabled behavior, dedup, concurrency, and TTL re-emission. |

Comment thread http_proxy.go Outdated
Comment thread http_proxy.go Outdated
Comment thread banditcallback/banditcallback.go
Comment thread banditcallback/banditcallback.go
Address Copilot review feedback on #668: the two comments around the
banditcallback filter implied the filter "installs but stays silent"
when token/URL are empty, but createFilterChain actually skips the
append entirely when the emitter is disabled. Update both comments
to match the real behavior:

  - banditcallback.New is still cheap to call with empty inputs (the
    emitter's Enabled() reports false), but the filter is never
    installed in that case — zero per-request work on
    non-bandit-eligible builds.
  - Note the benchmark-mode caveat (no tokenfilter), since the
    "after auth" placement only holds in the production path.

No behavior change; comments only.
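
A minimal sketch of the placement and skip-when-disabled behavior described above, with the chain reduced to a list of filter names. The emitter type, buildChain, and the benchmarkMode flag are hypothetical; requiring both token and URL for Enabled(), and keeping devicefilter in benchmark mode, are assumptions of this sketch rather than facts about createFilterChain.

```go
package main

// emitter is a hypothetical stand-in for the banditcallback emitter.
type emitter struct {
	token string
	url   string
}

// Enabled reports whether the emitter is configured. Assumption: both the
// token and the URL must be set.
func (e *emitter) Enabled() bool {
	return e != nil && e.token != "" && e.url != ""
}

// buildChain returns filter names in execution order.
func buildChain(e *emitter, benchmarkMode bool) []string {
	var chain []string
	if !benchmarkMode {
		// Production path: auth runs first, so unauthenticated noise
		// never reaches the callback filter.
		chain = append(chain, "tokenfilter")
	}
	if e.Enabled() {
		// Appended only when configured; a disabled emitter adds no
		// per-request work at all.
		chain = append(chain, "banditcallback")
	}
	// Assumption: devicefilter follows in both modes.
	chain = append(chain, "devicefilter")
	return chain
}
```

With an unconfigured emitter the chain is just tokenfilter followed by devicefilter; in benchmark mode the callback filter, if enabled, runs without an auth filter ahead of it.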

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@reflog reflog merged commit ba4275a into main May 22, 2026
1 check passed

2 participants