
qa: route all radiance egress through an upstream SOCKS5 + bandit probe driver#445

Open
myleshorton wants to merge 108 commits into main from qa/outbound-socks-egress


@myleshorton myleshorton commented Apr 27, 2026

Adds a development/QA-only path that routes every outbound network call radiance makes through a configured upstream SOCKS5. Pair it with pinger bridge --country ru (lantern-cloud PR) to exercise the bandit + tunnel as if from a Russia residential client. The bandit-correctness test needs no Android emulator, and the same path is also available inside the Android client (lantern PR).

Summary

  • common/env: new key RADIANCE_OUTBOUND_SOCKS_ADDRESS, distinct from the existing RADIANCE_SOCKS_ADDRESS (which sets up an inbound listener — different concept).
  • kindling/client: when the env var is set, swap kindling's stacked transport for a plain http.Transport whose DialContext dials through the upstream SOCKS5. Kindling's fronted/AMP/dnstt/proxyless circumvention paths are bypassed (kindling has no generic dialer override) — tracked as a follow-up if we want full fidelity.
  • vpn/boxoptions: append a synthetic SOCKS5 outbound (_dev_outbound_socks) and walk every leaf outbound, setting DialerOptions.Detour to it. Selector / urltest / block / dns / direct are skipped (direct rejects Detour at runtime: `detour is not supported in direct context`).
  • bypass/bypass: route through the same SOCKS5 instead of the local-bypass-proxy / direct-fallback chain.
  • telemetry/otel: skip OTLP gRPC init in this mode — the gRPC exporter can't be routed via http.Transport.DialContext, so leaving it on would leak the test process's real IP.
  • backend/radiance: skip publicip.Detect() when the env var is set. Without this, radiance would talk directly to AWS/ifconfig.me, get the host's real IP, and stuff it into X-Lantern-Config-Client-IP — which the API trusts over the actual TCP source for the bandit lookup.
  • config/fetcher: when the env var is set, drop UDP-only protocols (hysteria/hysteria2/wireguard/tuic/amnezia) from the request's Protocols list. The bridge SOCKS5 listener on the lantern-cloud side is TCP-only; without this filter, the bandit assigns UDP outbounds that fail with SOCKS5 `code=7` (Command not supported) until URLTest converges. Hysteria-class protocols don't work in Russia today anyway.
  • common/platform: change Platform from const to var; honor RADIANCE_PLATFORM env override in common.Init(). Lets a Go process running on macOS impersonate Android for the bandit's view of the client.
  • cmd/qa-bandit: focused QA driver that boots a radiance backend with all of the above wired, captures the first /v1/config-new response, dumps the bandit assignment, then `ConnectVPN(AutoSelect)` and probes a target URL through the local SOCKS5 inbound to verify both the API view and the actual tunnel egress IP.
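
The vpn/boxoptions step above can be pictured as a walk over the outbound list. This is a minimal sketch, assuming a simplified `outbound` struct in place of sing-box's real option types; only the `_dev_outbound_socks` tag and the skip-list come from this description, and the function and field names are hypothetical.

```go
package main

import "fmt"

// outbound is a hypothetical, simplified stand-in for a sing-box outbound
// option. The real option types are not reproduced here.
type outbound struct {
	Type   string
	Tag    string
	Detour string
}

const devSocksTag = "_dev_outbound_socks"

// skipDetour lists outbound types that are left alone: selector, urltest,
// block and dns per the summary above, and direct because it rejects a
// detour at runtime.
var skipDetour = map[string]bool{
	"selector": true,
	"urltest":  true,
	"block":    true,
	"dns":      true,
	"direct":   true,
}

// applyDevDetour points every leaf outbound at the synthetic SOCKS5
// outbound and appends that outbound to the end of the list.
func applyDevDetour(outs []outbound) []outbound {
	result := make([]outbound, 0, len(outs)+1)
	for _, o := range outs {
		if !skipDetour[o.Type] {
			o.Detour = devSocksTag
		}
		result = append(result, o)
	}
	return append(result, outbound{Type: "socks", Tag: devSocksTag})
}

func main() {
	outs := applyDevDetour([]outbound{
		{Type: "urltest", Tag: "auto"},
		{Type: "direct", Tag: "direct"},
		{Type: "vless", Tag: "leaf-1"},
	})
	for _, o := range outs {
		fmt.Printf("%s (%s) detour=%q\n", o.Tag, o.Type, o.Detour)
	}
}
```

The synthetic outbound is appended after the walk so it never receives a detour pointing at itself.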

Non-obvious: TZ + locale also need to match the country

This caught us during testing and is worth surfacing — without these, the API still sees `country=US` even when the request egresses through a Russia residential IP, and the bandit serves US-tier outbounds.

`cmd/api/maxmind.go:LookupCountryASNState` overrides the GeoIP-derived country with the timezone-derived country (or `X-Lantern-Locale`-derived) when they disagree, on the assumption that a mismatch means the client is behind a VPN. Production behavior is correct — a user in Moscow whose VPN egresses through a US server should still be treated as Russian for content/track decisions. But for QA, our request egresses through Russia residential while `time.Now().Location()` on the dev box is `America/Denver` (or wherever), so the API sees:

  • IP geo: Russia (from the residential pool)
  • Timezone hint: `America/Denver` → US
  • → "client is behind a VPN, prefer header-derived country" → returns `country=US` ❌
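
The override described above can be illustrated with a small sketch. It is not the code in `cmd/api/maxmind.go`; the `resolveCountry` function and the three-entry zone table are hypothetical and exist only to show why a US timezone hint wins over a Russian GeoIP result.

```go
package main

import "fmt"

// tzCountry is a tiny hypothetical lookup table. A real service would use
// a full IANA zone-to-country mapping.
var tzCountry = map[string]string{
	"America/Denver": "US",
	"Europe/Moscow":  "RU",
	"Asia/Tehran":    "IR",
}

// resolveCountry prefers the timezone-derived country when it is known and
// disagrees with GeoIP, treating the mismatch as a sign of a VPN.
func resolveCountry(geoIPCountry, tzName string) string {
	hint, ok := tzCountry[tzName]
	if ok && hint != geoIPCountry {
		return hint
	}
	return geoIPCountry
}

func main() {
	// Dev box in Denver egressing through a Russia residential IP.
	fmt.Println(resolveCountry("RU", "America/Denver"))
	// Same egress with the timezone spoofed to match.
	fmt.Println(resolveCountry("RU", "Europe/Moscow"))
}
```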

The fix in `cmd/qa-bandit` is two flags, both default to Russia:

```
--tz Europe/Moscow # exported as `TZ` env var before any time call;
# radiance picks it up via timezone.IANANameForTime
--locale ru_RU # passed to backend.Options.Locale → X-Lantern-Locale header
```

Pass matching values for any other country (`--tz Asia/Tehran --locale fa_IR` for Iran, etc.). The Android-side equivalent is `adb shell setprop debug.lantern.tz Europe/Moscow` plus the locale set in the Lantern app's settings.
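
A rough sketch of how the two flags could be parsed and the timezone exported, assuming the standard `flag` package. The `qaFlags` struct and `parseQAFlags` helper are hypothetical and not taken from `cmd/qa-bandit`; only the flag names and Russia defaults come from the description above.

```go
package main

import (
	"flag"
	"fmt"
	"os"
)

// qaFlags is a hypothetical holder for the two spoofing flags.
type qaFlags struct {
	TZ     string
	Locale string
}

// parseQAFlags parses --tz and --locale, both defaulting to Russia.
func parseQAFlags(args []string) (qaFlags, error) {
	fs := flag.NewFlagSet("qa-bandit", flag.ContinueOnError)
	var f qaFlags
	fs.StringVar(&f.TZ, "tz", "Europe/Moscow", "IANA timezone exported as TZ")
	fs.StringVar(&f.Locale, "locale", "ru_RU", "locale sent as X-Lantern-Locale")
	if err := fs.Parse(args); err != nil {
		return qaFlags{}, err
	}
	return f, nil
}

func main() {
	// Fixed arguments keep the demo deterministic; a real driver would
	// pass os.Args[1:].
	f, err := parseQAFlags([]string{"--tz", "Asia/Tehran", "--locale", "fa_IR"})
	if err != nil {
		os.Exit(2)
	}
	// Export TZ before anything touches the time package's local zone.
	if err := os.Setenv("TZ", f.TZ); err != nil {
		fmt.Fprintln(os.Stderr, "setenv TZ:", err)
		os.Exit(1)
	}
	fmt.Printf("TZ=%s locale=%s\n", os.Getenv("TZ"), f.Locale)
}
```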

Quick start

Pair with the lantern-cloud bridge:

```
# Terminal 1: bridge (lantern-cloud)
mise r pinger:bridge

# Terminal 2: radiance
RADIANCE_OUTBOUND_SOCKS_ADDRESS=127.0.0.1:1080 \
go run -tags 'with_quic,with_gvisor,with_wireguard,with_utls' ./cmd/qa-bandit
```

Verified behavior

```
API saw client as : country=RU ip=85.172.81.50 # Yaroslavl, Rostelecom AS12389
Outbounds (6): samizdat (5x) + reflex (1x), all in Russia-tier locations
probe OK in 1.30s — egress IP: 141.148.227.249 # NL Oracle (samizdat-NL)
```

Test plan

  • `go run -tags '...' ./cmd/qa-bandit` against a running pinger bridge: confirm `country=RU` in the bandit assignment dump and a non-home egress IP from the probe.
  • `go run ... ./cmd/qa-bandit --tz America/Denver --locale en_US` (TZ and locale switched to US values, same Russia bridge): confirm the API now reports `country=US` even with the Russia residential IP — verifies the MaxMind/timezone gotcha above.
  • `go test ./vpn/... ./kindling/... ./bypass/... ./common/...` (existing tests still pass with the var/const change).
  • Without the env var: existing behavior (kindling stacked transports, direct sing-box outbound dials, full publicip.Detect, telemetry on) — no regression.

🤖 Generated with Claude Code

garmr-ulfr and others added 30 commits February 25, 2026 17:30
…ith VPNStatus type

Introduce a Server-Sent Events endpoint (/status/events) that streams
VPN status changes to clients in real time, replacing the previous
poll-based approach. Refactor status representation from string constants
(StatusRunning, StatusClosed, etc.) to a typed VPNStatus enum (Connected,
Disconnected, Connecting, Disconnecting, Restarting, ErrorStatus) and
move status emission from the IPC server into the tunnel layer. The
tracer middleware is scoped to standard routes so it no longer buffers
long-lived SSE connections, and the HTTP transport is upgraded to
unencrypted HTTP/2 for multiplexed streaming support.
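
For readers unfamiliar with Server-Sent Events, here is a minimal sketch of a status stream handler using only `net/http`. It is not the handler from this commit: the `statusEvents` and `sseFrame` names, the `status` event name, and the channel-based feed are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// sseFrame formats one Server-Sent Events message.
func sseFrame(event, data string) string {
	return fmt.Sprintf("event: %s\ndata: %s\n\n", event, data)
}

// statusEvents streams each status from updates until the channel is
// closed or the client disconnects.
func statusEvents(updates <-chan string) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		flusher, ok := w.(http.Flusher)
		if !ok {
			http.Error(w, "streaming unsupported", http.StatusInternalServerError)
			return
		}
		w.Header().Set("Content-Type", "text/event-stream")
		w.Header().Set("Cache-Control", "no-cache")
		for {
			select {
			case <-r.Context().Done():
				return
			case s, open := <-updates:
				if !open {
					return
				}
				fmt.Fprint(w, sseFrame("status", s))
				flusher.Flush() // push the frame instead of buffering it
			}
		}
	}
}

func main() {
	updates := make(chan string, 2)
	updates <- "Connecting"
	updates <- "Connected"
	close(updates)

	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, "/status/events", nil)
	statusEvents(updates)(rec, req)
	fmt.Print(rec.Body.String())
}
```

The explicit flush after each frame is the reason a buffering middleware, such as the tracer mentioned above, has to stay off this route.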
…terns

Reorganize the project architecture to establish clear data ownership
and dependency flow, inspired by Tailscale's LocalBackend/localapi
pattern.
Co-authored-by: Wendel Hime <6754291+WendelHime@users.noreply.github.com>
garmr-ulfr and others added 15 commits April 22, 2026 10:15
 Remove unreachable post-normalization length checks in buildOptions — if
 normalization returns a non-empty slice, ToOptions always yields entries
 — and move the "no valid rules" warning to the else branch that catches
 when input was present but normalization dropped everything.

 Expand TestBuildOptions_Rulesets with a "direct" smart-routing category
 in the test fixture, isolate each subtest with its own config to prevent
 mutation leaks, and drive the ad-block expectation off
 AdBlockRules.ToOptions to avoid RejectActionOptions default-field
 mismatches.
 Subscribe to VPN status events and start AutoSelectedChangeListener only
 once the tunnel is Connected, instead of launching it eagerly from
 Start(). Also unexport startAutoSelectedListener since it's only called
 internally.
Go v1.26.2 includes a patch to CGo that addresses some of
the bulkBarrierPreWrite panics.
 - backend: skip config-response country write when RADIANCE_COUNTRY is
   set so the env override is respected for issue reports.
 - vpn: own URLTestHistoryStorage on the tunnel, registering one if the
   context doesn't already carry it, instead of going through
   clashServer.HistoryStorage().
 - deps: bump keepcurrent and lantern-box.
…nly support

 Move SSE event stream methods (VPNStatusEvents, AutoSelectedEvents, ConfigEvents,
 DataCapStream) into platform-specific files. Add sseRetryLoop for automatic
 reconnection with backoff, and gate DataCapStream on VPN connected status.
 Mobile builds subscribe to in-process events and short-circuit SSE when the
 client is in local-only mode.
When the Flutter UI picks a new server or toggles Smart Routing while the
tunnel is already up, it calls ConnectVPN with the new tag expecting a
seamless outbound swap. The backend was unconditionally invoking
VPNClient.Connect, which returns ErrTunnelAlreadyConnected, surfacing to
the user as "start service failed: ipc: status 500: failed to connect
VPN: tunnel already connected".

Short-circuit to selectServer when Status() == Connected so the active
outbound switches in place without tearing down and rebuilding the
tunnel, matching the UI's expectation and the comment in
lib/features/vpn/server_selection.dart.

Fixes the IPC errors documented in getlantern/engineering#3291 for
Smart Routing while connected (free) and server selection while connected
(pro).

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
)"

This reverts commit 42d491c.
SelectServer should be used to manually switch servers while connected.
Restart set Restarting on the current tunnel, then close() dropped it
and start() built a new one, so the marker was orphaned on a
torn-down object and observers saw Disconnected -> Connecting ->
Connected instead of Restarting. The platform path has the same
shape because RestartService drives Disconnect + Connect.

Move status to VPNClient (atomic.Value plus a setStatus guard that
lets only Connected or ErrorStatus succeed Restarting). start/close
bracket the tunnel call with the appropriate transitions; tunnel no
longer carries VPNStatus at all, and selectMode gates on lbService.

Also fold Close into Disconnect, move PostServiceClose into close(),
rename ClearNetErrorState to AttemptFixNetState, and collapse the
duplicated tests into subtests.
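
The guard described above can be sketched as follows. This is an illustration under assumptions, not the code in this commit: the `statusBox` type is hypothetical, the string values of the statuses are invented, and the compare-and-swap loop is one possible way to pair `atomic.Value` with a transition check.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

type VPNStatus string

// The status names follow the commit messages; the string values are
// placeholders.
const (
	Disconnected VPNStatus = "disconnected"
	Connecting   VPNStatus = "connecting"
	Connected    VPNStatus = "connected"
	Restarting   VPNStatus = "restarting"
	ErrorStatus  VPNStatus = "error"
)

// statusBox is a hypothetical holder for the client-level status.
type statusBox struct{ v atomic.Value }

func newStatusBox() *statusBox {
	b := &statusBox{}
	b.v.Store(Disconnected)
	return b
}

func (b *statusBox) status() VPNStatus { return b.v.Load().(VPNStatus) }

// setStatus applies next and reports whether it was accepted. While the
// current status is Restarting, only Connected or ErrorStatus may follow.
func (b *statusBox) setStatus(next VPNStatus) bool {
	for {
		cur := b.status()
		if cur == Restarting && next != Connected && next != ErrorStatus {
			return false
		}
		if b.v.CompareAndSwap(cur, next) {
			return true
		}
	}
}

func main() {
	b := newStatusBox()
	fmt.Println(b.setStatus(Restarting), b.status())
	fmt.Println(b.setStatus(Disconnected), b.status()) // rejected mid-restart
	fmt.Println(b.setStatus(Connected), b.status())
}
```

With the marker held outside the tunnel object, tearing the tunnel down and rebuilding it no longer loses the Restarting state.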
* vpn: instrument tunnel.start phases + VPNClient.Restart (#3299)

Port of #442 to the refactor branch. Adds child spans around the phases
inside tunnel.start so we can attribute the 10s+ tail observed on
/service/start (max 11.25s across 170 calls in 24h, matching Freshdesk
#173696).

This branch's in-process VPNClient.Restart also had no span — the whole
settings-toggle restart path was invisible in SigNoz. Wrapped it
end-to-end so restart latency shows up alongside connect latency, with
a path=direct|platform_ifce attribute to distinguish the two flows.

- VPNClient.Restart span wrapping close+start (new)
- tunnel.start span (options_size, platform, is_restart attributes)
- tunnel.init + tunnel.connect spans
- child spans: libbox.Setup, libbox.NewServiceWithContext,
  libbox.BoxService.Start, newMutableGroupManager

Note: this branch has no loadURLTestHistory call (urltest history is
pulled from context via service.FromContext, not from disk), so that
phase is absent compared to main. The set of span names otherwise
matches #442 so dashboards built against main work here too.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* vpn: address review feedback on #443

Three fixes from Copilot:

1. tunnel.init: record errors on the init span via deferred closure
   on the named return — previously failures only showed up on child
   spans (libbox.Setup, libbox.NewServiceWithContext) but the phase
   span itself stayed green.

2. tunnel.connect: same issue — the panic-recovery path sets the
   named err, but the span wasn't marked errored. Added a deferred
   error-recording closure before the recover closure so the recover
   runs first (LIFO) and the span-recording sees the post-recover err.

3. tunnel.start is_restart attribute: VPNClient.Restart creates a
   fresh tunnel{} via c.start, so t.status is always the zero value
   (never Restarting) when t.start is called — is_restart was always
   false. Replaced the status sniff with an explicit isRestart
   parameter threaded through VPNClient.start → tunnel.start.
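
The defer ordering in item 2 is easy to get backwards, so here is a small self-contained sketch. The `connect` function and the `recorded` pointer (standing in for a span's error field) are hypothetical; the point shown is only that deferred calls run last-in, first-out.

```go
package main

import "fmt"

// connect mimics a phase with a named error return, a panic-recovery
// defer, and an error-recording defer.
func connect(shouldPanic bool, recorded *error) (err error) {
	// Registered first, so it runs last and sees the post-recover err.
	defer func() {
		if err != nil {
			*recorded = err
		}
	}()
	// Registered second, so it runs first (LIFO) and turns a panic into err.
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered: %v", r)
		}
	}()
	if shouldPanic {
		panic("boom")
	}
	return nil
}

func main() {
	var recorded error
	err := connect(true, &recorded)
	fmt.Println("returned:", err)
	fmt.Println("recorded:", recorded)
}
```

If the two defers were registered in the opposite order, the recording closure would run while `err` was still nil and the failure would go unrecorded.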

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ver (#444)

fac9089 normalized the empty-string → AutoSelectTag convention in
VPNClient.SelectServer, but LocalBackend.SelectServer — which wraps
vpnClient.SelectServer and then performs its own settings update —
still tag-compared against vpn.AutoSelectTag directly. With tag == ""
the wrapper's vpnClient.SelectServer call succeeds (fac9089 handles it),
then the outer

    if tag == vpn.AutoSelectTag { ...auto path... return nil }

check is false (tag is still "") and execution falls through to
srvManager.GetServerByTag(""), which isn't found, returning
"no server found with tag " (with trailing space). The IPC layer
propagates the error as HTTP 500.

Reproduced on Lantern 9.0.30 (getlantern/lantern@6de3c9aa9 refactor)
when the user clicks Smart from a live tunnel:

    ffi.go:startVPN → c.ConnectVPN("")
    → LanternCore.ConnectVPN → vpn_tunnel.ConnectToServer
    → VPNStatus == Connected → client.SelectServer(ctx, "")
    → POST /server/selected {Tag: ""}
    → LocalBackend.SelectServer("") ← bug site

Surfaces as "start service failed: ipc: status 500: no server found
with tag " in the Dart UI.

Fix: normalize tag == "" to vpn.AutoSelectTag at the top of
LocalBackend.SelectServer, mirroring the same normalization in
LocalBackend.ConnectVPN. Finishes fac9089's intent by aligning the
outer wrapper with VPNClient.SelectServer's behavior.
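
The shape of the fix can be sketched in isolation. Everything here is hypothetical apart from the idea itself: `autoSelectTag` is a placeholder value for `vpn.AutoSelectTag`, and `selectServer` is a toy stand-in for `LocalBackend.SelectServer` that only reports which path was taken.

```go
package main

import "fmt"

// autoSelectTag is a placeholder; the real value of vpn.AutoSelectTag is
// not shown in this PR.
const autoSelectTag = "auto"

// normalizeTag maps the empty tag onto the auto-select tag.
func normalizeTag(tag string) string {
	if tag == "" {
		return autoSelectTag
	}
	return tag
}

// selectServer normalizes first, so the later comparison against
// autoSelectTag also catches the empty-string case.
func selectServer(tag string, known map[string]bool) (string, error) {
	tag = normalizeTag(tag)
	if tag == autoSelectTag {
		return "auto path", nil
	}
	if !known[tag] {
		return "", fmt.Errorf("no server found with tag %s", tag)
	}
	return "manual path", nil
}

func main() {
	known := map[string]bool{"se-1": true}
	fmt.Println(selectServer("", known))
	fmt.Println(selectServer("se-1", known))
	fmt.Println(selectServer("missing", known))
}
```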

Internal tester report: getlantern/lantern's Freshdesk #173773.

Co-authored-by: Adam Fisk <afisk@mini.local>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Introduce a build-tagged fileperm.File constant so application-owned files
use 0600 on Linux, Windows, and standalone macOS, and 0644 on mobile and
non-standalone macOS where other sandbox processes need read access.
Copilot AI review requested due to automatic review settings April 27, 2026 21:21
Copilot AI left a comment


Copilot wasn't able to review this pull request because it exceeds the maximum number of lines (20,000). Try reducing the number of changed lines and requesting a review from Copilot again.

garmr-ulfr and others added 7 commits April 27, 2026 16:19
Drop comments that merely restate identifier names, and tighten the few
that remain to document contracts rather than mechanism. AGENTS.md now
records the comment and Go doc-comment guidelines the cleanup applies.
Seed the tunnel's URL test history storage from servers.Server on
init so prior latency results survive reconnects, and coalesce hook
notifications into a periodic flush so per-result writes don't
re-marshal the servers file for each parallel test completion.
UpdateURLTestResults now persists to disk.
InitialServer fixes the outbound selected at tunnel start, replacing the
post-Connect SelectServer round-trip; a stub clashServer plus a CacheFile
wrapper own the selection so libbox's on-disk last-selected value can't
override it. Also re-attaches the URL-test listener on Connected to fix
a race where unordered Restarting/Connected events could leave it bound
to a closed storage.
…fig (#446)

When a fresh /v1/config-new response arrives while the VPN is up:

  1. setServers(list, true) runs first, which calls
     vpnClient.UpdateOutbounds(list) → tunnel.updateOutbounds → addOutbounds.
     addOutbounds loads the new outbounds into the running sing-box,
     installs the bandit URL overrides on the AutoSelect group via
     mutGrpMgr.SetURLOverrides, and (if any overrides were present)
     synchronously triggers an immediate URL test cycle via
     mutGrpMgr.CheckOutbounds — see vpn/tunnel.go:436-450.
  2. Then RunOfflineURLTests() runs and is gated by
     `if c.tunnel != nil` in vpn.go:462, returning
     ErrTunnelAlreadyConnected.

So the offline pre-warm is intentionally skipped while the tunnel is
up — the in-tunnel path already covered it. But we were logging the
expected sentinel as level=ERROR, which made it look like URL tests
weren't running after a config update. They are: just via the running
sing-box's URLTest selector instead of the offline pre-warm code path.

Skip the log when the error is ErrTunnelAlreadyConnected; keep it for
genuine failures (e.g. "offline tests already running"). Behavior is
unchanged — just stops a misleading ERROR line that's been showing up
on every config refresh while connected.
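
A minimal sketch of the log filter, assuming the sentinel is matched with `errors.Is`. The `shouldLogOfflineTestErr` helper and the `errOfflineBusy` variable are hypothetical; `ErrTunnelAlreadyConnected` is redeclared locally so the example is self-contained.

```go
package main

import (
	"errors"
	"fmt"
)

// Local stand-ins for the errors named in the commit message.
var (
	ErrTunnelAlreadyConnected = errors.New("tunnel already connected")
	errOfflineBusy            = errors.New("offline tests already running")
)

// shouldLogOfflineTestErr reports whether err deserves an ERROR line. The
// already-connected sentinel is expected while the tunnel is up, so it is
// skipped even when wrapped.
func shouldLogOfflineTestErr(err error) bool {
	return err != nil && !errors.Is(err, ErrTunnelAlreadyConnected)
}

func main() {
	wrapped := fmt.Errorf("run offline url tests: %w", ErrTunnelAlreadyConnected)
	fmt.Println(shouldLogOfflineTestErr(wrapped))        // expected sentinel
	fmt.Println(shouldLogOfflineTestErr(errOfflineBusy)) // genuine failure
}
```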

Co-authored-by: Adam Fisk <afisk@mini.local>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… egress

When this env var is set, every outbound connection radiance opens is
routed through that SOCKS5 server. Intended for censorship-circumvention
QA: pair with `pinger bridge --country ru` so the API request, sing-box
tunnel dials, and the bypass dialer all egress through a residential
proxy in the chosen country, simulating a real client there.

  * common/env: new key OutboundSocksAddress.
  * kindling/client: when set, swap kindling's stacked transport for a
    plain http.Transport whose DialContext goes via SOCKS5. Kindling's
    fronted/AMP/dnstt circumvention paths are skipped (kindling lacks a
    generic dialer override) — see comment.
  * vpn/boxoptions: append a SOCKS5 outbound and walk every leaf outbound
    setting DialerOptions.Detour to it; selector/urltest/block/dns are
    left alone (they don't dial directly).
  * bypass/bypass: route DialContext through SOCKS5 instead of the
    local-bypass-proxy / direct-fallback chain.
  * telemetry/otel: skip OTLP gRPC init in this mode — the gRPC exporter
    can't be routed via http.Transport.DialContext, so leaving it would
    leak the test process's real IP.
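
As a rough standard-library illustration of sending HTTP traffic through an upstream SOCKS5: the change itself sets `http.Transport.DialContext`, whereas this sketch swaps in `Transport.Proxy` with a `socks5://` URL, which `net/http` supports without extra dependencies. The `newEgressTransport` and `proxyFor` helpers are hypothetical.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
	"os"
)

const outboundSocksEnv = "RADIANCE_OUTBOUND_SOCKS_ADDRESS"

// newEgressTransport returns a transport that proxies every request
// through the SOCKS5 server at addr (host:port). With an empty addr it
// returns an unmodified clone of the default transport.
func newEgressTransport(addr string) *http.Transport {
	tr := http.DefaultTransport.(*http.Transport).Clone()
	if addr == "" {
		return tr
	}
	tr.Proxy = http.ProxyURL(&url.URL{Scheme: "socks5", Host: addr})
	return tr
}

// proxyFor reports which proxy URL tr would use for target, without
// opening any connection.
func proxyFor(tr *http.Transport, target string) (string, error) {
	req, err := http.NewRequest(http.MethodGet, target, nil)
	if err != nil {
		return "", err
	}
	u, err := tr.Proxy(req)
	if err != nil || u == nil {
		return "", err
	}
	return u.String(), nil
}

func main() {
	tr := newEgressTransport(os.Getenv(outboundSocksEnv))
	fmt.Println(proxyFor(tr, "https://example.com/"))
}
```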

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
cmd/qa-bandit is a focused QA driver that boots a radiance backend
impersonating an Android client (Platform=android, version, locale=ru_RU,
TZ=Europe/Moscow), captures the first /v1/config-new response, dumps the
bandit assignment (country/IP the API saw, assigned outbounds and
locations), then optionally ConnectVPN(AutoSelect) and probes a target
URL through the local SOCKS5 inbound to verify the full egress path.

Pair with `pinger bridge --country ru` running on 127.0.0.1:1080:

  RADIANCE_OUTBOUND_SOCKS_ADDRESS=127.0.0.1:1080 \
    go run -tags 'with_quic,with_gvisor,with_wireguard,with_utls' \
    ./cmd/qa-bandit

Plumbing required to make the API actually see us as a Russia client:

  * common.Platform: const → var, plus RADIANCE_PLATFORM env override in
    common.Init(). Lets the QA driver impersonate Android while running
    on macOS.
  * backend.Start: skip publicip.Detect() when OutboundSocksAddress is
    set. Otherwise it talks directly to AWS/ifconfig.me, gets the host's
    real IP, and stuffs it into X-Lantern-Config-Client-IP — which the
    API trusts over the actual TCP source for bandit lookups.
  * vpn/boxoptions: add C.TypeDirect to the Detour skip-list. Sing-box
    rejects "detour is not supported in direct context" at runtime
    otherwise.
  * Spoof TZ + locale (--tz Europe/Moscow, --locale ru_RU) so the
    request's X-Lantern-Time-Zone / locale don't trigger the API's
    "GeoIP says X but timezone says Y, must be VPN" override path
    (cmd/api/maxmind.go:LookupCountryASNState).
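
The const-to-var change for the platform can be sketched like this. The default of `runtime.GOOS` and the `applyPlatformOverride` helper are assumptions for illustration; only the `RADIANCE_PLATFORM` variable name comes from the commit message.

```go
package main

import (
	"fmt"
	"os"
	"runtime"
)

// Platform is a var rather than a const so an override can replace it at
// init time. The runtime.GOOS default is an assumption.
var Platform = runtime.GOOS

const platformEnv = "RADIANCE_PLATFORM"

// applyPlatformOverride replaces Platform when the env lookup yields a
// non-empty value. The lookup is injected so it can be exercised without
// touching the process environment.
func applyPlatformOverride(lookup func(string) (string, bool)) {
	if v, ok := lookup(platformEnv); ok && v != "" {
		Platform = v
	}
}

func main() {
	applyPlatformOverride(os.LookupEnv)
	fmt.Println("platform:", Platform)
}
```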

Known limitation: the bridge SOCKS5 listener only implements TCP
CONNECT, not UDP ASSOCIATE. So UDP outbounds (hysteria/hysteria2/
wireguard/tuic) fail with code=7 when chained through the detour;
TCP-based outbounds (samizdat/reflex/vmess/vless/trojan/shadowsocks)
work. URLTest will fall back to a working outbound on retry.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Two related improvements that make the QA path complete end-to-end on
TCP outbounds without the bridge needing UDP ASSOCIATE.

config/fetcher: when RADIANCE_OUTBOUND_SOCKS_ADDRESS is set, filter
hysteria/hysteria2/wireguard/tuic/amnezia out of the request's
supportedProtocols list. The bandit then doesn't assign UDP-only tracks
the bridge can't relay, and URLTest converges immediately on a working
TCP outbound (samizdat / reflex / vmess / vless / trojan / shadowsocks
/ etc.). Hysteria-class protocols don't work in Russia today anyway, so
this is a fine match for the test scope.
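
The filter can be sketched as a pure function over the protocol list. The `filterTCPCapable` name and slice-of-strings representation are hypothetical; the set of dropped protocols is the one listed above.

```go
package main

import "fmt"

// udpOnly holds the protocols the TCP-only bridge cannot relay.
var udpOnly = map[string]bool{
	"hysteria":  true,
	"hysteria2": true,
	"wireguard": true,
	"tuic":      true,
	"amnezia":   true,
}

// filterTCPCapable returns protocols with the UDP-only entries removed,
// preserving the original order.
func filterTCPCapable(protocols []string) []string {
	kept := make([]string, 0, len(protocols))
	for _, p := range protocols {
		if !udpOnly[p] {
			kept = append(kept, p)
		}
	}
	return kept
}

func main() {
	supported := []string{"hysteria2", "samizdat", "wireguard", "reflex", "vless"}
	fmt.Println(filterTCPCapable(supported))
}
```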

cmd/qa-bandit: after ConnectVPN, retry the egress probe for up to 30s
(every 3s) while URLTest is settling. With UDP outbounds gone this is
mostly a safety net — most probes will succeed on attempt 1 — but it
also handles transient residential-proxy hiccups (PacketStream
occasionally returns "general SOCKS server failure" on the first dial
of a fresh session). Default --timeout bumped to 180s to match.
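
The retry behaviour can be sketched as a small loop with a time budget. This is an illustration, not the driver's code: `probeWithRetry`, `flakyProbe`, and `demoRetry` are hypothetical, and the demo uses millisecond durations where the commit uses a 30s budget and a 3s interval.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// probeWithRetry calls probe until it succeeds or budget elapses, waiting
// interval between attempts. It returns the number of attempts made.
func probeWithRetry(ctx context.Context, budget, interval time.Duration,
	probe func(context.Context) error) (int, error) {
	ctx, cancel := context.WithTimeout(ctx, budget)
	defer cancel()
	for attempt := 1; ; attempt++ {
		err := probe(ctx)
		if err == nil {
			return attempt, nil
		}
		select {
		case <-ctx.Done():
			return attempt, fmt.Errorf("probe failed after %d attempts: %w", attempt, err)
		case <-time.After(interval):
		}
	}
}

// flakyProbe fails the first n calls, then succeeds.
func flakyProbe(n int) func(context.Context) error {
	calls := 0
	return func(context.Context) error {
		calls++
		if calls <= n {
			return errors.New("general SOCKS server failure")
		}
		return nil
	}
}

// demoRetry runs the loop with short durations so it finishes quickly.
func demoRetry(failures int) (int, error) {
	return probeWithRetry(context.Background(),
		500*time.Millisecond, 10*time.Millisecond, flakyProbe(failures))
}

func main() {
	attempts, err := demoRetry(2)
	fmt.Println("attempts:", attempts, "err:", err)
}
```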

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@myleshorton myleshorton force-pushed the qa/outbound-socks-egress branch from 2cbdf55 to b884143 Compare April 28, 2026 13:07
myleshorton pushed a commit to getlantern/lantern that referenced this pull request Apr 28, 2026
Adds a debug-only path that lets a developer route every outbound
network call radiance makes from the Android client through an
upstream SOCKS5 (typically the local pinger bridge), so the bandit
treats the client as a real Russia-residential user end-to-end.

Pairs with:
  * radiance:    getlantern/radiance#445
  * lantern-cloud: https://github.com/getlantern/lantern-cloud/pull/2649

  * `lantern-core/mobile`: new gomobile-exported `SetQAEnvOverrides(socks, tz)`
    that does `os.Setenv` for `RADIANCE_OUTBOUND_SOCKS_ADDRESS` and `TZ`.
    Must be called before `SetupRadiance`/`StartIPCServer` to take effect.
  * `android/.../LanternApp.kt`: override `onCreate` and call the new setter
    with values from Android system properties:
      `debug.lantern.outbound_socks` -> `RADIANCE_OUTBOUND_SOCKS_ADDRESS`
      `debug.lantern.tz`             -> `TZ`
    Set with `adb shell setprop debug.lantern.outbound_socks 10.0.2.2:1080`.
    No-op when the props are unset, so production builds aren't affected
    unless someone deliberately sets them on the device.
  * `go.mod`: bump radiance to the qa/outbound-socks-egress branch tip
    (will swap back to a pinned tag once that PR lands).

Verified end-to-end in a `lantern_test` AVD with packetstream + Russia
upstream:
  - LanternApp logs `QA env overrides applied: outbound_socks=10.0.2.2:1080`
  - Radiance's `/v1/config-new` response: `country=RU ip=85.172.81.50`
  - Bandit serves Russia-tier outbounds (samizdat / reflex in DE/SE/SG/etc.)
  - All sing-box outbound dials wrapped in `_dev_outbound_socks` detour
  - Browsing in the emulator's Chrome egresses from a Lantern entry server
    (e.g. Stockholm/Singapore — bandit-assigned, not the Mac's home IP)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Base automatically changed from refactor to main April 28, 2026 18:10
atavism added a commit to getlantern/lantern that referenced this pull request May 4, 2026
)

* migrate to new ipc.Client api, first-pass

* pull in a couple fixes, update linux vpn status poller

* start ipc server ios/macos

* update radiance, fix linux daemon build

* start ipc server windows service

* fix datacap stream

* decode user response data to json

* gofmt

* update ipc request path check for linux smoke test

* Fixed issue with user apis

* redo linux packaging changes undone by merge

* move RunOffCgoStack from radiance to here, small cleanup

* fetch radiance-owned settings on demand instead of caching locally

* add missing smart-routing, ad-block, oauth calls

* clean up

* fix ref async issue for IPC calls

* gofmt

* fix test, linux package verification

* update radiance, remove server groups

* fix: return added server tags from AddServersByURL

Server tags are determined by URL content, not caller-supplied names.
addServerBasedOnURLs now returns the tags of added servers so callers
can connect using the actual tag. Also sends VPN status updates from
connectToServer on Linux so the UI reflects connection state changes.

* wrap ffi calls in runOnGoStack, update win service

* add explicit not linux build tag

* update radiance

* use RADIANCE_REPO in lanternd src

* flatten server model to match radiance, fix tests

* use loopback ipc client for mobile

* update radiance, log service install error in smoke test

* retrieve selected server from radiance instead of caching

* stop lantern before uninstall, revert accidental service name change

* remove allow override

* fix name reference and misplaced stop call

* fix several issues

* code review

* fix toggles not registering and fetching plans

* always refetch server list when view opens

* fix crash in server select screen

* fix split tunnel website view not loading websites

* sync vpn status from system on launch

* fix stale onboarding marker persisting reinstall

* Revert "fix stale onboarding marker persisting reinstall"

This reverts commit a21a218eac7df90d678ce5d35d27892bbe893da2.

* fix vpn prompt displaying when quitting

* Macos system extension updates #2 (#8637)

* if system extension is in uninstall state do not block new installation.

* update macos system extension test

* do not cache dart_tool

* Set the default status as unknown.

* code review updates

* Filter system apps from Windows split tunneling (#8641)

* Add split tunneling e2e test

* Fix split tunneling website smoke assertion

* Fix split tunneling smoke navigation

* code review updates

* code review updates

* code review updates

* Filter Windows system apps in split tunneling list

* code review updates

* code review updates

* Update system apps filter

* code review updates

* fix: upload and notify for nightlies even when some platforms fail (#8649)

The upload-s3 and upload-release-artifacts jobs required ALL platform
builds to succeed or be skipped. When a matrix entry failed (e.g.,
Linux arm64), the entire build-linux job reported as 'failure', which
caused both upload jobs to skip entirely — even though macOS, Android,
iOS, and Linux amd64 all succeeded.

Simplify the condition: run uploads if at least one platform build
succeeded. The upload steps already handle missing artifacts gracefully
(upload_if_exists checks for file existence).

This ensures the Slack notification goes out with download links for
whatever platforms did build successfully.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add arch to releases (#8652)

* feat: add arch to releases

* Update linux/packaging/usr/lib/systemd/system/lanternd.service

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* chore: remove committed lanternd.service file

Agent-Logs-Url: https://github.com/getlantern/lantern/sessions/15085485-3c6a-4e1e-93ea-6e9bf0623d09

Co-authored-by: reflog <109876+reflog@users.noreply.github.com>

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: reflog <109876+reflog@users.noreply.github.com>

* fix issues from 3173

* Refactor and fixed multiple bugs

* cache selected server location locally to avoid UI flash

* Fix tunnel issue in android

* App event issue and auto server location fixes

* added logs

* mobile: return string instead of []byte + update Swift callers (#8663)

* mobile: return string instead of []byte from gomobile-exported funcs

The gomobile wrapper copies Go pointer-containing return values to the C
thread stack using runtime.wbMove. When a GC cycle runs during the copy,
bulkBarrierPreWrite panics because the destination isn't GC-tracked.
Returning string avoids this — gomobile marshals strings via C heap
allocation rather than leaving them as Go slice headers.

See getlantern/engineering#3175 for the full crash analysis (from
Freshdesk #172640 — Derek reporting "Lantern Crash" on macOS 26.3.1).

Go changes:
  AvailableFeatures, UserData, FetchUserData, GetAvailableServers,
  GetSelectedServerJSON, OAuthLoginCallback, AcknowledgeGooglePurchase,
  AcknowledgeApplePurchase, Login, Logout, DeleteAccount

Swift changes (macos + ios): preserve Flutter contract by converting
the string back to Data for methods whose Dart side reads `bytes` via
utf8.decode (getUserData, fetchUserData, oauthLoginCallback, login,
logout, deleteAccount, acknowledgeInAppPurchase). For methods whose Dart
side expects String (featureFlags, getLanternAvailableServers,
getSelectedServerJSON), just pass the gomobile string directly.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* android: update MethodHandler for string-returning gomobile bindings

The gomobile-exported funcs in lantern-core/mobile/mobile.go now return
string instead of []byte. The generated Android binding will therefore
return String where it used to return ByteArray.

For each affected method, match what the iOS handler does so the Flutter
platform-channel contract stays stable:

  * Methods whose Dart callers expect bytes (Uint8List) — login,
    logout, deleteAccount, userData, fetchUserData, oauthLoginCallback,
    acknowledgeGooglePurchase — convert the String result via
    `.toByteArray(Charsets.UTF_8)` before calling success() (mirrors
    Swift's `.data(using: .utf8)`).

  * Methods whose Dart callers expect a String — availableFeatures,
    getAvailableServers, getSelectedServerJSON — drop the
    `String(byteArray)` constructor and use the return value directly,
    with the same "{}" / "[]" empty-default that iOS uses.

Addresses Copilot review on PR #8663.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* encapsulate ipc.Client behind LanternCore interface

Route all IPC operations through LanternCore methods instead of
exposing Client() to callers. Add GetSelectedServerTag,
GetAutoLocationJSON, CheckDaemonReachable, PatchSettings, and
VPNStatusEvents to the Core interface. Update FFI and mobile layers
to use them, and remove now-unused vpn_tunnel helper functions.

Also includes Flutter-side fixes: device-removal sign-in race
condition, plans fetch retry logic, and private server setup
improvements.

* ios/macos: drop invalid optional-chaining on non-optional String (#8671)

The gomobile-exported functions in lantern-core/mobile/mobile.go were
migrated from ([]byte, error) to (string, error). gomobile renders the
new signatures with a non-optional Swift String return (Data was
optional; String is not), so `json?.data(using: .utf8)` and
`payload?.data(using: .utf8)` now fail to compile:

    error: cannot use optional chaining on non-optional value of type
    'String'

Drop the `?` on all 14 call sites (7 each in ios/ and macos/). The
resulting `json.data(using: .utf8)` returns Data? anyway — an empty
Go string still produces a non-nil empty Data, which preserves the
Flutter contract the comment on these lines describes.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* add android-test + android-reproduce for emulator testing and ticket reproduction (#8672)

* add android-test script for quick emulator testing with env overrides

Usage:
  scripts/android/android-test <apk> [ENV_KEY=VALUE ...]

Example:
  scripts/android/android-test lantern.apk RADIANCE_COUNTRY=BG RADIANCE_FEATURE_OVERRIDES=dns_ruleset_host_bypass

Starts an emulator, installs the APK, pushes a .env file with overrides
to the app's data dir (via adb root on Google APIs images, run-as on
debug APKs, or su on rooted devices), restarts the app, and streams
filtered logcat.

Prefers the "lantern-test" AVD if it exists (create with Google APIs
image for root access):
  sdkmanager "system-images;android-35;google_apis;arm64-v8a"
  avdmanager create avd -n lantern-test -k "system-images;android-35;google_apis;arm64-v8a"

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* address review: serial targeting, su quoting, trap cleanup, fix comment

- Use -s <serial> throughout so multiple devices don't break adb
- Fix su -c quoting so $(stat ...) expands on-device
- Add trap to clean up temp .env on EXIT/INT/TERM
- Fix header comment (no /sdcard/ fallback)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* android-test: push .env to .lantern data dir (not app root)

The Go env package reads .env from the data directory (via
env.LoadFromDir called from common.Init), not from the app's root
data dir. Push to /data/data/$PKG/.lantern/.env so radiance finds it.

Companion: getlantern/radiance#421 (env.LoadFromDir)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* android-test: auto-install system image and create AVD if none exists

If no AVDs are found, the script now automatically:
1. Detects host arch (arm64 vs x86_64)
2. Installs the Google APIs system image via sdkmanager
3. Creates a "lantern-test" AVD via avdmanager

This means running android-test on a fresh machine with just the
Android SDK installed works out of the box — no manual AVD setup.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* address review: array for ADB_CMD, timeouts, remove unused PID

- Use bash array for ADB_CMD so paths with spaces work correctly
- Add configurable timeouts for emulator appear (120s) and boot (300s)
- Remove unused EMULATOR_PID — emulator intentionally left running
  between invocations so subsequent runs don't pay boot cost

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* add android-reproduce: reproduce Freshdesk tickets on emulator

Usage:
  android-reproduce /tmp/ticket-172722              # auto-downloads APK
  android-reproduce /tmp/ticket-172722 lantern.apk  # uses provided APK

After running /analyze-ticket, this script:
1. Extracts country + version from the ticket's config/logs
2. Downloads the matching APK from GitHub releases (gh CLI)
3. Pushes the user's exact config.json, servers.json, split-tunnel.json
   to the emulator so it gets the same proxies, DNS rules, rule sets
4. Sets RADIANCE_COUNTRY to match the user's region
5. Installs, restarts, and streams filtered logcat

This gives near-exact reproduction of Android-specific issues by
replicating the user's proxy assignments, country routing, and
sing-box config on a local emulator.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* android-reproduce: match user's Android API level from ticket logs

Extracts sdkInt, osVersion, and model from flutter.log's "Device info"
line. Creates an AVD with the matching API level (e.g. "lantern-api36"
for a user on Android 16/SDK 36). Falls back to API 35 if the target
image isn't available.

Example for ticket #172722 (Android 16, SM-A556B):
  Creates lantern-api35 (API 36 clamped to 35), installs matching APK,
  pushes user's exact config.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* android-reproduce: dynamically find closest available API image

Instead of hardcoding a fallback to API 35, step down from the user's
sdkInt until we find an installable Google APIs image. Each API level
gets its own AVD (lantern-api29, lantern-api34, etc.) that persists
across runs, building up a catalog over time.
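
The step-down search described above can be sketched in Go as follows. This is an illustration only: the script itself is a shell script, and `closestAPILevel`, the `installable` callback, and the floor value are hypothetical names chosen for this example.

```go
package main

import "fmt"

// closestAPILevel walks down from target until installable reports true,
// stopping at floor. The boolean is false when nothing in [floor, target]
// can be installed.
func closestAPILevel(target, floor int, installable func(level int) bool) (int, bool) {
	for level := target; level >= floor; level-- {
		if installable(level) {
			return level, true
		}
	}
	return 0, false
}

// avdName follows the per-level naming from the commit message
// (lantern-api29, lantern-api34, ...).
func avdName(level int) string {
	return fmt.Sprintf("lantern-api%d", level)
}

func main() {
	// Hypothetical set of images the host can install.
	available := map[int]bool{29: true, 34: true, 35: true}
	level, ok := closestAPILevel(36, 21, func(l int) bool { return available[l] })
	fmt.Println(avdName(level), ok)
}
```

Passing the availability check as a callback keeps the search independent of how images are actually probed.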

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* address review: install-before-push, fix eval injection, f-string, file search

- Install APK + launch once before pushing configs (so data dir exists)
- Replace eval with mapfile for device info extraction (no shell injection)
- Fix f-string syntax error in locations display
- Search both ticket-dir and config-dir for servers.json/split-tunnel.json
- Remove unused SCRIPT_DIR
- Update android-test header to document auto-AVD-creation

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Fix private server navigation issue

* deps: bump sing-box-minimal to v1.12.21-lantern on refactor (#8679)

Companion to #8678. The refactor branch still pins v1.12.19-lantern,
which is missing the non-fatal-rule-set-fetch fix (sing-box-minimal
9c79c311, shipped in v1.12.21-lantern). Without it, Android builds
from this branch hit the same bootstrap deadlock.

* Add IPC starter in android

* macos, ios and android cleanup

* lantern-core: wire config events through IPC (#8673)

* lantern-core: subscribe to config events over IPC (/config/events)

The refactor branch removed listenConfigEvents when it was discovered
that the in-process events.SubscribeContext no longer worked — the
extension's radiance process is where config.NewConfigEvent is emitted,
and the host's subscription never fires across processes.

Now that the companion radiance PR adds a /config/events SSE endpoint,
restore the listener using lc.client.ConfigEvents with the same
reconnect-with-backoff pattern listenAutoSelectedEvents uses. Each
frame fires notifyFlutter(EventTypeConfig, "") so Flutter's
app_event_notifier "config" case resumes driving
availableServersProvider.forceFetchAvailableServers() and
homeProvider.fetchUserDataIfNeeded() on every config change.
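
A minimal Go sketch of a reconnect-with-backoff loop of the kind referred to above. It is not the code from lantern-core: `listenWithBackoff`, `nextBackoff`, and the `stream` callback are hypothetical stand-ins for a blocking event subscription, and the durations are arbitrary.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// nextBackoff doubles the delay and caps it at ceiling.
func nextBackoff(cur, ceiling time.Duration) time.Duration {
	next := cur * 2
	if next > ceiling {
		return ceiling
	}
	return next
}

// listenWithBackoff calls stream until ctx is cancelled and returns the
// number of attempts made. stream blocks while the subscription is healthy
// and returns when it drops. A nil error resets the delay; a non-nil error
// grows it for the following attempt.
func listenWithBackoff(ctx context.Context, initial, ceiling time.Duration, stream func(context.Context) error) int {
	attempts := 0
	delay := initial
	for ctx.Err() == nil {
		attempts++
		err := stream(ctx)
		if err == nil {
			delay = initial
		}
		select {
		case <-ctx.Done():
			return attempts
		case <-time.After(delay):
		}
		if err != nil {
			delay = nextBackoff(delay, ceiling)
		}
	}
	return attempts
}

// runDemo fails twice, then cancels the context on the third attempt.
func runDemo() int {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	calls := 0
	return listenWithBackoff(ctx, time.Millisecond, 8*time.Millisecond, func(context.Context) error {
		calls++
		if calls == 3 {
			cancel() // simulate shutdown
			return nil
		}
		return errors.New("stream dropped")
	})
}

func main() {
	fmt.Println("attempts:", runDemo())
}
```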

Also bumps the radiance pin to the commit that adds the endpoint.

Addresses the config-events half of getlantern/engineering#3182.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* lantern-core: update StartBackgroundListeners comment to include config

Reflects that listenConfigEvents also starts automatically from
initialize, addressing Copilot review on PR #8673.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: Adam Fisk <afisk@mini.local>

* windows ffi cleanup

* Update bindings

* point to radiance refactor branch

* feat(dev-mode): hidden 5-tap unlock on support view + expanded dev screen

Show Build number alongside Lantern version on the support view. Tapping
the Build row 5× within 3s toggles developer mode (gated to nightly/debug
builds for enabling; disabling works anywhere). The developer entry in
settings now hides unless dev mode is enabled.
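
The unlock gesture and its gating can be sketched in Go as below. The real implementation lives in the Flutter UI; `tapDetector`, `burstTriggers`, and `nextDevMode` are hypothetical names, and only the 5-tap count, the 3s window, and the enable/disable rule come from the description above.

```go
package main

import (
	"fmt"
	"time"
)

// tapDetector reports when `needed` taps land inside a sliding `window`.
type tapDetector struct {
	needed int
	window time.Duration
	taps   []time.Time
}

// Tap records a tap at now and returns true once the retained taps,
// including this one, reach `needed`. It then resets so one burst fires once.
func (d *tapDetector) Tap(now time.Time) bool {
	kept := d.taps[:0]
	for _, t := range d.taps {
		if now.Sub(t) <= d.window { // drop taps older than the window
			kept = append(kept, t)
		}
	}
	d.taps = append(kept, now)
	if len(d.taps) >= d.needed {
		d.taps = d.taps[:0]
		return true
	}
	return false
}

// nextDevMode applies the gating rule: disabling always works, enabling
// only works on nightly or debug builds.
func nextDevMode(enabled, nightlyOrDebug bool) bool {
	if enabled {
		return false
	}
	return nightlyOrDebug
}

// burstTriggers feeds taps at the given millisecond offsets into a
// 5-tap / 3s detector and reports whether any tap completed the gesture.
func burstTriggers(offsetsMillis ...int) bool {
	d := &tapDetector{needed: 5, window: 3 * time.Second}
	start := time.Unix(0, 0)
	for _, ms := range offsetsMillis {
		if d.Tap(start.Add(time.Duration(ms) * time.Millisecond)) {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(burstTriggers(0, 500, 1000, 1500, 2000))
	fmt.Println(burstTriggers(0, 1000, 2000, 3000, 4000))
}
```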

Developer screen adds radiance env-var overrides (country, version,
feature overrides), a log-level dropdown, a config-fetch toggle, and
buttons to send a config request, run URL tests, show live settings/env,
and disable dev mode. Pins qpack to v0.5.1 via replace directive to match
radiance's own pin so sing-box-minimal's quic-go HTTP/3 continues to
build.

Wires radiance ipc.Client.PatchSettings / PatchEnvVars / RunOfflineURLTests
/ UpdateConfig through lantern-core and exports them via FFI.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* bump radiance - limit config fetch to 1 at a time

* feat(dev-mode): show spinner on in-flight action tiles

Tapping Send config request / Run URL tests / Show settings & env vars
now disables the tile and shows a spinner until the IPC call returns, so
users don't assume the button is broken during the latency before the
snackbar appears.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Developer mode refactor

* Sync garmr/radiance-daemon-refactor with origin/main (#8684)

* deps: update radiance to fix outbound removal breaking config refresh (#8639)

Picks up radiance PR #405 which fixes removeOutbounds failing when
extra outbounds (non-smart Pro locations) aren't in the URL test group.
This was causing every config refresh IPC to return 500, preventing
SetURLOverrides and CheckOutbounds from running — resulting in ~50%
of bandit probe callbacks never firing.

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* Smart location country fix (#8638)

* Do not reset a smart location.

* code review updates

* Fix website split-tunneling reliability and CI validation (#8640)

* Add split tunneling e2e test

* Fix split tunneling website smoke assertion

* Fix split tunneling smoke navigation

* code review updates

* code review updates

* code review updates

* code review updates

* code review updates

* Macos system extension updates #2 (#8637)

* if system extension is in uninstall state do not block new installation.

* update macos system extension test

* do not cache dart_tool

* Set the default status as unknown.

* code review updates

* Filter system apps from Windows split tunneling (#8641)

* Add split tunneling e2e test

* Fix split tunneling website smoke assertion

* Fix split tunneling smoke navigation

* code review updates

* code review updates

* code review updates

* Filter Windows system apps in split tunneling list

* code review updates

* code review updates

* Update system apps filter

* code review updates

* deps: update radiance + lantern-box to fix ~20% callback failure (#8642)

Picks up:
- radiance PR #406 → lantern-box PR #231: clear URL test history
  when SetURLOverrides is called so outbounds are re-tested with
  new callback URLs
- radiance PR #405: best-effort URL test group removal (already in
  previous update, carried forward)
- lantern-box v0.0.61: includes CA cert install + history fix

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* chore: update radiance + lantern-box for callback-all-outbounds (#8644)

- radiance: removes URL test filtering, all outbounds tested (PR #407)
- lantern-box v0.0.62: 6-worker URL test pool + client delay reporting (PR #232)

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Hide system apps without dropping user apps on Windows (#8643)

* code review updates

* code review updates

* code review updates

* chore: update radiance for async IPC outbound handlers (#8645)

Picks up getlantern/radiance#410: IPC outbound update/add/remove
handlers return 202 immediately and process asynchronously, fixing
the EOF errors on every config refresh.
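
A rough Go sketch of the "return 202, process later" shape described above, using only `net/http`. It is not radiance's handler: `asyncHandler`, the `/outbounds` path, and the request body are hypothetical.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
	"sync"
)

// asyncHandler acknowledges the request with 202 Accepted and runs work in
// the background. The body is read before returning because r.Body is not
// safe to use once the handler has returned.
func asyncHandler(wg *sync.WaitGroup, work func(body string)) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		body, err := io.ReadAll(r.Body)
		if err != nil {
			http.Error(w, "cannot read body", http.StatusBadRequest)
			return
		}
		wg.Add(1)
		go func() {
			defer wg.Done()
			work(string(body))
		}()
		w.WriteHeader(http.StatusAccepted)
	}
}

// demo sends one request through the handler and waits for the background
// work so the result can be inspected.
func demo() (status int, processed string) {
	var wg sync.WaitGroup
	h := asyncHandler(&wg, func(body string) { processed = body })
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodPost, "/outbounds", strings.NewReader(`{"tag":"example"}`))
	h.ServeHTTP(rec, req)
	wg.Wait()
	return rec.Code, processed
}

func main() {
	status, processed := demo()
	fmt.Println(status, processed)
}
```

The trade-off is that the caller only learns the request was accepted, so failures in the background work have to surface some other way, such as logs or status events.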

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* chore: update radiance for split tunnel persistence fix (#8646)

Picks up getlantern/radiance#411: fixes split tunnel filters silently
not persisting due to dangling slice pointers in initRuleMap.
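
For readers unfamiliar with this class of Go bug, here is a generic sketch of how pointers into a slice go stale when the slice grows. It is not the code from `initRuleMap`; the `rule` type and both builder functions are hypothetical.

```go
package main

import "fmt"

type rule struct {
	domains []string
}

// buildIndexBuggy stores pointers to slice elements while still appending.
// When append outgrows the capacity it allocates a new backing array, so
// earlier pointers keep referring to the old copy.
func buildIndexBuggy(names []string) ([]rule, map[string]*rule) {
	rules := make([]rule, 0, 1) // capacity 1 makes the reallocation deterministic here
	index := make(map[string]*rule)
	for _, n := range names {
		rules = append(rules, rule{})
		index[n] = &rules[len(rules)-1]
	}
	return rules, index
}

// buildIndexFixed sizes the slice up front so element addresses are stable.
func buildIndexFixed(names []string) ([]rule, map[string]*rule) {
	rules := make([]rule, len(names))
	index := make(map[string]*rule, len(names))
	for i, n := range names {
		index[n] = &rules[i]
	}
	return rules, index
}

func main() {
	names := []string{"a", "b"}

	rules, index := buildIndexBuggy(names)
	index["a"].domains = append(index["a"].domains, "example.com")
	fmt.Println("buggy:", len(rules[0].domains)) // the write went to the stale copy

	rules, index = buildIndexFixed(names)
	index["a"].domains = append(index["a"].domains, "example.com")
	fmt.Println("fixed:", len(rules[0].domains))
}
```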

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: upload and notify for nightlies even when some platforms fail (#8649)

The upload-s3 and upload-release-artifacts jobs required ALL platform
builds to succeed or be skipped. When a matrix entry failed (e.g.,
Linux arm64), the entire build-linux job reported as 'failure', which
caused both upload jobs to skip entirely — even though macOS, Android,
iOS, and Linux amd64 all succeeded.

Simplify the condition: run uploads if at least one platform build
succeeded. The upload steps already handle missing artifacts gracefully
(upload_if_exists checks for file existence).

This ensures the Slack notification goes out with download links for
whatever platforms did build successfully.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Stabilize nightly smoke checks and platform release publishing (#8651)

* Stabilize nightly smoke checks and platform release publishing

* code review updates

* code review updates

* chore: bump radiance to latest main (lantern-box v0.0.65) (#8654)

Picks up:
- Reflex active-probe resistance: silence-timeout + masquerade
  fallback (getlantern/lantern-box#237 via radiance#413)
- TLS 1.3 minimum enforcement for Reflex
  (getlantern/lantern-box#236)
- radiance split-tunnel filter persistence fix (#411)

No Flutter / client-side behavior changes required — the Reflex
hardening is server-side.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add arch to releases (#8652)

* feat: add arch to releases

* Update linux/packaging/usr/lib/systemd/system/lanternd.service

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* chore: remove committed lanternd.service file

Agent-Logs-Url: https://github.com/getlantern/lantern/sessions/15085485-3c6a-4e1e-93ea-6e9bf0623d09

Co-authored-by: reflog <109876+reflog@users.noreply.github.com>

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: reflog <109876+reflog@users.noreply.github.com>

* ran go mod tidy

* Improve Windows app discovery for shortcut wrappers (#8653)

* code review updates

* Improve Windows app discovery for shortcut wrappers

* code review updates

* code review updates

* code review updates

* Fix the radiance-to-device limit flow. (#8659)

* only use permalinks (#8658)

Co-authored-by: atavism <paul@getlantern.org>

* Add auth E2E tests and wire Linux/Windows CI (#8607)

* auth flow test updates

* auth flow test updates

* auth flow test updates

* code review updates

* code review updates

* code review updates

* code review updates

* deps: update sing-box-minimal to v1.12.21-lantern (#8660)

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* Show vpn conflict dialog on smart location (#8661)

* Show vpn conflict dialog on smart location

* code review updates

* chore: bump radiance and lantern-box to latest (#8664)

- radiance: f1c425231e41 → 4241e6c5a9c6 (main HEAD)
- lantern-box: v0.0.65 → v0.0.67

Ran go mod tidy.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

* Windows installer cleanup, improve app discovery and icon loading (#8666)

* code review updates

* Add comment

* code review updates

* remove sentry (#8665)

* Save last server location (#8655)

* save server location

* update radiance.

* Forbid AutoConnect if connect fails.

* update radiance

* code review updates

* update radiance

* code review updates (#8675)

* deps: restore sing-box-minimal v1.12.21-lantern (#8678)

PR #8655 ("Save last server location") accidentally downgraded
sing-box-minimal from v1.12.21-lantern back to v1.12.19-lantern in
go.mod during review churn. v1.12.21-lantern contains commit 9c79c311
("fix: make initial remote rule-set fetch non-fatal"), which turns the
Android bootstrap deadlock ("no available network interface" during
initial rule-set fetch) from a fatal libbox startup error into a
WARN + retry-after-start. Without it, nightly builds from main fail
to connect on any smart-routing country (Macao, Bulgaria, etc.).

Confirmed by comparing Freshdesk #172722 (broken, rule_set_remote.go:235,
v1.12.19-lantern) with #172795 (working, rule_set_remote.go:113,
v1.12.21-lantern). Same user, same device, same 9.0.25 version, same
smart-routing-bg-common-direct fetch failure — only the sing-box-minimal
version differs. The v9.0.25-beta-android tag was cut before #8655
merged, which is why Alexander's beta works while the nightly doesn't.

`go mod tidy` also dropped stale go.sum entries for superseded radiance
and lantern-box pseudo-versions and removed the unused getsentry/sentry-go
indirect (left behind after #8665).

* Makefile: fix empty common.Version on Windows CI (missing app version 400) (#8677)

* Makefile: use env-provided APP_VERSION so Windows CI populates version ldflag

common.Version in radiance was being linked as an empty string on Windows
CI builds. The `-X .../common.Version=$(APP_VERSION_PUBSPEC)` ldflag
depended on `$(shell grep ... | sed ...)` or a PowerShell fallback, and
the Windows path was producing an empty value. With common.Version empty,
backend.NewRequestWithHeaders sets X-Lantern-App-Version to "", and
lantern-cloud's /v1/config-new handler rejects the request with
400 "missing app version" — no config is returned, so the client falls
back to the embedded server list with no bandit tracks. Observed on
Freshdesk #172794 (Windows 9.0.26 nightly, radiance 400s on every retry).

Use the APP_VERSION already exported to GITHUB_ENV by build-windows.yml's
"Read app version from pubspec.yaml" step, and compute APP_VERSION_PUBSPEC
with Make built-ins ($(firstword $(subst +, ,...))) so no shell tools are
required. Drops the Windows_NT branch; local dev on Mac/Linux still uses
the grep/sed fallback (APP_VERSION ?=).

* Makefile: restore Windows local-dev fallback for APP_VERSION

The previous commit removed the Windows_NT branch under the assumption
that APP_VERSION would always come from the environment. That's true on
CI (build-windows.yml exports it to GITHUB_ENV), but local Windows
developers running `make windows-release` directly don't set the env
var, and the grep/sed fallback runs under cmd.exe where Unix-style
quoting fails silently.

Add back the Windows PowerShell branch, but only as the fallback when
APP_VERSION isn't in the environment (`?=` on both branches). CI keeps
working via the env override; local Mac/Linux uses grep/sed; local
Windows uses PowerShell Select-String. The `+`-splitting stays in
Make built-ins so it works no matter which branch produced APP_VERSION.

* Makefile: fail the build when APP_VERSION_PUBSPEC ends up empty

Adds a parse-time guard so an unresolvable version fails loudly rather
than producing a binary with empty common.Version — which is what caused
this whole bug in the first place. Addresses Copilot review feedback on
PR #8677.

$ APP_VERSION="" make
Makefile:36: *** APP_VERSION_PUBSPEC is empty; export APP_VERSION ...
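
The same two steps (drop everything after `+`, then refuse an empty result) look like this as a Go sketch. The Makefile does this with Make built-ins; `pubspecVersion` is a hypothetical helper and the sample version string is illustrative.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// pubspecVersion strips the "+build" suffix from a pubspec-style version and
// fails loudly when nothing is left, instead of yielding an empty version.
func pubspecVersion(appVersion string) (string, error) {
	version, _, _ := strings.Cut(strings.TrimSpace(appVersion), "+")
	if version == "" {
		return "", errors.New("APP_VERSION_PUBSPEC is empty; export APP_VERSION")
	}
	return version, nil
}

func main() {
	v, err := pubspecVersion("9.0.29+487")
	fmt.Println(v, err)

	_, err = pubspecVersion("")
	fmt.Println(err)
}
```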

* Roll in #8676: PowerShell quoting + Windows service startup log

Incorporates the non-overlapping pieces of @atavism's PR #8676 so we
can close it in favor of this PR:

- Swap the Windows APP_VERSION fallback's PowerShell invocation to
  outer-single / inner-double quoting. The previous outer-double /
  inner-single form gets mangled when Make expands $$ and cmd.exe
  passes the resulting string to powershell, even in the local-dev
  fallback path.
- Same fix for GO_VERSION's PowerShell shell-out further down in the
  Makefile (separate variable, same root cause).
- Log the Windows service startup (name, version, mode) so it's
  visible when triaging issues. Matches the log line from #8676.

* Fix data cap issue (#8668)

* Report an Issue screen fixes (#8670)

* updates to report issue screen

* updates to report issue screen

* rename report issue

* rename report issue

* code review updates

* ffi: add missing base64 import for app icon encoding

* code review updates

* code review updates

* code review updates

* code review updates

* code review updates

* code review updates

* code review updates

* code review updates

---------

Co-authored-by: Myles Horton <afisk@getlantern.org>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: jigar-f <132374182+jigar-f@users.noreply.github.com>
Co-authored-by: Ilya Yakelzon <reflog@gmail.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: reflog <109876+reflog@users.noreply.github.com>
Co-authored-by: Jay <110402935+jay-418@users.noreply.github.com>

* code review updates

* bump radiance

* bump go to v1.26.2

Go v1.26.2 includes a patch to CGo that addresses some of
the bulkBarrierPreWrite panics.

* bump radiance - fix event streams on mobile

* fix sign up issue by pointing to new radiance.

* split tunneling: treat FFI "ok" response as success, not error (#8691)

* split tunneling: treat FFI "ok" response as success, not error

_runSplitTunnelCall was checking `result != nullptr` and treating any
non-null return as an error message. But the Go FFI
(lantern-core/ffi/ffi.go) returns C.CString("ok") on success for both
addSplitTunnelItem and removeSplitTunnelItem — a non-null C string.

As a result, every successful add/remove was being reported to the UI as
a failure with message "ok". Symptoms:

- Adding a website in split tunneling showed an unstyled default
  snackbar reading "OK" (the default Material SnackBar rendering
  failure.localizedErrorMessage).
- The website appeared to not be saved — but it actually was; the
  provider's `reloaded` flag was never set, so the on-screen list never
  re-fetched from the backend.
- Re-clicking "Add" with the same domain created a duplicate entry on
  disk (visible as repeated items in split-tunnel.json) because the
  provider's local "already-added" check worked against a stale copy
  that had never been refreshed.

Fix: mirror the checkAPIError convention — treat literal "ok" as
success, parse JSON {"error": "..."} bodies for the error message, and
fall back to the raw string otherwise.
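
The convention above, sketched in Go rather than Dart. `parseFFIResult` is a hypothetical name, and treating an empty string as success is an assumption standing in for the null-pointer case on the Dart side.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// parseFFIResult maps a raw FFI return string to an error value:
// "ok" (or, by assumption, an empty string) means success, a JSON body of
// the form {"error": "..."} yields that message, and anything else is
// returned as the error text unchanged.
func parseFFIResult(raw string) error {
	if raw == "" || raw == "ok" {
		return nil
	}
	var body struct {
		Error string `json:"error"`
	}
	if err := json.Unmarshal([]byte(raw), &body); err == nil && body.Error != "" {
		return errors.New(body.Error)
	}
	return errors.New(raw)
}

func main() {
	fmt.Println(parseFFIResult("ok"))
	fmt.Println(parseFFIResult(`{"error":"invalid domain"}`))
	fmt.Println(parseFFIResult("unexpected failure"))
}
```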

Reported in getlantern/engineering#3291 against Windows 9.0.29 build 481
(Freshdesk #173656).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* split tunneling: reuse _ffiOkResults for success-string check

Rather than hardcoding 'ok', use the existing _ffiOkResults set
({'ok', 'true'}) defined at the top of this file so the split-tunnel
path stays in sync with the other FFI success checks (e.g.
_setupRadiance at line 201).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* split tunneling: use design-system error snackbar on add (#8692)

The local showSnackbar helper in website_domain_input was using
Material's default ScaffoldMessenger.showSnackBar(SnackBar(content:
Text(message))) — producing an unstyled grey/dark snackbar that the rest
of the app doesn't use. Every call site in this file is an error path
(empty input, invalid domain, already-added, backend failure), so route
them through context.showSnackBarError which applies the app's rounded,
floating, red-background error style.

Follow-up to #8691. Addresses the "unstyled snackbar" symptom in
getlantern/engineering#3291 issue 3 for any remaining error surface
after the FFI "ok" fix.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(vpn): use SelectServer when switching servers on a live tunnel

connectToServer previously always called ConnectVPN, which radiance
rejects with ErrTunnelAlreadyConnected when the tunnel is up. Check
VPNStatus first and route to SelectServer when Connected, falling
back to ConnectVPN otherwise.
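
A minimal Go sketch of that dispatch. The `tunnel` interface and `fakeTunnel` are hypothetical stand-ins for illustration, not radiance's or lantern-core's types.

```go
package main

import (
	"errors"
	"fmt"
)

var errTunnelAlreadyConnected = errors.New("tunnel already connected")

// tunnel is a hypothetical slice of the VPN client API used in this sketch.
type tunnel interface {
	Connected() bool
	Connect(tag string) error
	SelectServer(tag string) error
}

// connectToServer switches servers on a live tunnel and connects otherwise.
func connectToServer(t tunnel, tag string) error {
	if t.Connected() {
		return t.SelectServer(tag)
	}
	return t.Connect(tag)
}

// fakeTunnel rejects Connect while up, mirroring the behavior described above.
type fakeTunnel struct {
	up    bool
	calls []string
}

func (f *fakeTunnel) Connected() bool { return f.up }

func (f *fakeTunnel) Connect(tag string) error {
	if f.up {
		return errTunnelAlreadyConnected
	}
	f.up = true
	f.calls = append(f.calls, "connect:"+tag)
	return nil
}

func (f *fakeTunnel) SelectServer(tag string) error {
	f.calls = append(f.calls, "select:"+tag)
	return nil
}

func main() {
	ft := &fakeTunnel{}
	first := connectToServer(ft, "server-a")
	second := connectToServer(ft, "server-b")
	fmt.Println(first, second)
	fmt.Println(ft.calls)
}
```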

* android: detach connect() scope so withTimeout actually unblocks the UI (#8689)

* review: detach connect() scope so timeout actually unblocks the UI

Copilot flagged on #8689 that the existing coroutineScope { ... } still
hangs in exactly the scenario this change is meant to protect against.
Structured coroutineScope cancels its children on exception but then
waits for them to complete — so when withTimeout fires, we cancel the
deferred (which the JNI call ignores, since it has no suspension
points) and then block on it finishing anyway. Net effect: the UI is
still frozen, which is the symptom we're trying to prevent.

Switch to a DETACHED CoroutineScope(SupervisorJob() + Dispatchers.IO).
Its Job is not a child of the enclosing coroutine, so cancelling it
doesn't join — the orphan coroutine keeps running the JNI call in the
background until Go returns or the process exits, but the caller is
unblocked and the runCatching.onFailure path fires the timeout error
state for the UI.
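
The same idea, expressed as a Go analogy rather than Kotlin: run the uninterruptible call on its own goroutine and stop waiting at the deadline. `callWithTimeout` and `timedOut` are hypothetical helpers, and the durations are arbitrary.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

var errTimeout = errors.New("call timed out")

// callWithTimeout runs fn on its own goroutine and waits at most d for it.
// When the deadline passes the caller is released, while fn keeps running in
// the background until it returns on its own.
func callWithTimeout(d time.Duration, fn func() error) error {
	done := make(chan error, 1) // buffered so a late result never blocks the goroutine
	go func() { done <- fn() }()
	select {
	case err := <-done:
		return err
	case <-time.After(d):
		return errTimeout
	}
}

// timedOut reports whether work lasting workMs exceeds a timeoutMs deadline.
func timedOut(timeoutMs, workMs int) bool {
	err := callWithTimeout(time.Duration(timeoutMs)*time.Millisecond, func() error {
		time.Sleep(time.Duration(workMs) * time.Millisecond)
		return nil
	})
	return errors.Is(err, errTimeout)
}

func main() {
	fmt.Println(timedOut(10, 200))
	fmt.Println(timedOut(200, 1))
}
```

As in the Kotlin change, the abandoned call is not stopped. It only stops blocking the caller.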

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* review: add single-flight gate to prevent orphan accumulation

Copilot correctly pointed out on #8689 that the detached-scope approach
can accumulate orphan coroutines if the user retries while a previous
connect() is still stuck in JNI. Each orphan pins a Dispatchers.IO
thread; enough retries against a truly deadlocked Go side could
pressure the IO pool.

Their suggested fix (Dispatchers.IO.limitedParallelism(1)) would
serialize retries behind the orphan, turning the 2nd retry into
another 60s hang. A simple single-flight AtomicBoolean gate with fast
rejection is the cleaner mitigation:

- compareAndSet rejects concurrent attempts with IllegalStateException
  (surfaces via the existing runCatching.onFailure → error state).
- The flag clears in a try/finally inside the async block, which runs
  when the JNI call eventually returns — cancellation alone can't
  break it out, but once Go completes the finally runs and a future
  retry is admitted.
- Process death (reboot, force-stop) resets the flag naturally.
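
A single-flight gate of this shape can be sketched in Go with `sync/atomic` (Go 1.19+ for `atomic.Bool`). The `gate` type and `errInFlight` are hypothetical names. The Kotlin code uses an AtomicBoolean in the same way.

```go
package main

import (
	"errors"
	"fmt"
	"sync/atomic"
)

var errInFlight = errors.New("connect already in flight")

// gate admits one call at a time and rejects overlapping calls immediately
// instead of queueing them.
type gate struct {
	busy atomic.Bool
}

// run executes fn if no other call holds the gate. The flag is cleared when
// fn returns, however long that takes.
func (g *gate) run(fn func() error) error {
	if !g.busy.CompareAndSwap(false, true) {
		return errInFlight
	}
	defer g.busy.Store(false)
	return fn()
}

func main() {
	var g gate
	var inner error
	outer := g.run(func() error {
		// A second attempt while the first is still running is rejected.
		inner = g.run(func() error { return nil })
		return nil
	})
	fmt.Println(outer, inner)
	fmt.Println(g.run(func() error { return nil })) // admitted again after release
}
```

Fast rejection avoids the failure mode noted above for `limitedParallelism(1)`, where a retry would wait behind the stuck call.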

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Show the fastest location on smart location.

* android: make restartService block until restart completes (#8697)

* android: make restartService block until restart completes

Two bugs in the platformIfce restart path that together let the tunnel
wedge in Restarting forever on Android, triggering the "Error in VPN
operation" on every subsequent Connect attempt
(getlantern/engineering#3297, Freshdesk #173681).

1. restartService() used serviceScope.launch { ... } and returned
   immediately. Radiance's Restart() treats the sync return as "restart
   succeeded" and leaves the tunnel at status=Restarting, expecting the
   platform coroutine to drive it through stopVPN → startVPN and
   transition status via Mobile.* side-effects. If the service is torn
   down before the coroutine completes (onDestroy, process pressure),
   nothing ever transitions the tunnel out of Restarting.

   Switch to runBlocking(Dispatchers.IO) so the return actually
   reflects completion. c.mu is released on the Go side before
   RestartService is invoked, so synchronous Mobile.* callbacks on
   this thread don't deadlock.

2. stopVPNTunnel() skipped Mobile.stopVPN() when Mobile.isVPNConnected()
   returned false. isVPNConnected is status == Connected — but at the
   point stopVPNTunnel is called from restartService, radiance has
   already set status=Restarting, so the guard always skips and the
   tunnel is never actually closed.

   Swap the guard for Mobile.isRadianceConnected() — i.e. only skip
   when the IPC server itself isn't up. Mobile.stopVPN() is a no-op
   when c.tunnel is nil on the Go side, so the original guard was
   redundant even for the Connected == true case.
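
The difference between the two guards can be made concrete with a small Go sketch. The status values and both predicate names are hypothetical; the actual checks are `Mobile.isVPNConnected()` and `Mobile.isRadianceConnected()` on the Kotlin side.

```go
package main

import "fmt"

type vpnStatus int

const (
	disconnected vpnStatus = iota
	connecting
	connected
	restarting
)

// skipStopOld models the original guard: stop is skipped unless the status
// is exactly connected.
func skipStopOld(s vpnStatus) bool { return s != connected }

// skipStopNew models the replacement guard: stop is skipped only when the
// IPC server is not reachable.
func skipStopNew(ipcUp bool) bool { return !ipcUp }

func main() {
	// During a restart the status is restarting and the IPC server is up.
	fmt.Println("old guard skips stop:", skipStopOld(restarting))
	fmt.Println("new guard skips stop:", skipStopNew(true))
}
```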

Evidence from Freshdesk #173681 logs for the broken path:
- 15:17:34.826 Restart → 15:17:34.828 "Tunnel restarted successfully"
  (2ms total — consistent with fire-and-forget, not real teardown)
- No subsequent tunnel.init / Tunnel connection established
- 15:19:10 onDestroy logs "Skipping stopVPN — VPN tunnel was never
  started" (same isVPNConnected() check)
- 15:21:48 next Connect fails within 2ms of the IPC request with
  "tunnel is currently Restarting"

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* android: drop isVPNConnected guard in onDestroy too

Same shape as the restart-path fix: if c.tunnel is non-nil on the Go
side but the tunnel status is anything other than Connected (Restarting
after a failed restart, Connecting mid-startup, Error from a prior
failure), isVPNConnected() returns false and the old guard skipped
Mobile.stopVPN(). That left the radiance tunnel state dangling across
service destroy.

Observed in Freshdesk #173681: "onDestroy — radianceConnected=true
vpnConnected=false, Skipping stopVPN — VPN tunnel was never started"
while the tunnel was actually alive at status=Restarting.

Swap the second guard for an unconditional call. Mobile.stopVPN() is a
no-op when c.tunnel is nil, so the guard was always redundant — it just
happened to also hide the non-Connected-but-non-nil case that's
load-bearing during restart.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* android: verify restart postcondition before returning to Go

launchVPN wraps its body in runCatching { ... }.onFailure { ... } and
returns normally regardless of whether Mobile.startVPN() threw — so a
nil return from startVPN() does not mean the restart succeeded. Without
a postcondition check, restartService would log "completed" and return
to radiance as if everything worked, even though the tunnel is still
stuck in Restarting, which defeats the whole point of making this
function block.

Check Mobile.isVPNConnected() at the end of the runBlocking block and
throw IllegalStateException if false. The exception propagates through
runBlocking → restartService → radiance's platformIfce.RestartService()
as a non-nil error, so Restart() hits the ErrorStatus branch and the
caller sees the failure.

Addresses Copilot review feedback on PR #8697.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: Adam Fisk <afisk@mini.local>

* fix(vpn): don't cancel tunnel when restart's start phase fails

The PacketTunnelExtension hosts the IPC server, so cancelTunnelWithError
tears down the daemon along with the tunnel. Inline MobileStartVPN in
restartService so a failed restart leaves the extension (and IPC socket)
alive; radiance's status events surface the failure for retry.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Fix copilot issue (#8696) (#8698)

Co-authored-by: atavism <atavism@users.noreply.github.com>

* main: don't block first paint on Updater.init() (#8699)

* main: don't block first paint on Updater.init()

Moving Updater.init() off the critical path to runApp. Investigating a
one-shot black-screen-on-startup report on a local macOS dev build
(9.0.29 build 487): flutter.log stopped at the last pre-runApp log line
with no Dart exception and no crash, while the Go side kept running
normally. The only awaited call between that last log and runApp is
Updater.init().

Inside init(), the actual update check is already deferred 45 s via
Future.delayed + unawaited. But setFeedURL and setScheduledCheckInterval
are awaited — both bridge into Sparkle via the auto_updater Flutter
plugin, and both can stall on first launch: feed URL resolution,
keychain access, or a previous launch's background worker still holding
a lock. Any of those becomes a main-isolate hang that prevents runApp,
which exactly matches the observed symptom.

Fix: drop the await so Updater.init() runs concurrently with the rest
of startup. All errors are already handled inside init() itself, so
unawaited is safe.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* review: guard sl<Updater>() lookup against failed service injection

Copilot flagged that if injectServices() throws above (caught at
main.dart:45), Updater is never registered (it's registered at
injection_container.dart:40, after storage init), and sl<Updater>()
throws synchronously. unawaited() doesn't help — the throw happens
before the Future is constructed, so it propagates out of main and
prevents runApp.

Wrap the call in try/catch + sl.isRegistered<Updater>() so any failure
to look up or start Updater.init logs and continues to runApp.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Adam Fisk <afisk@mini.local>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(logs): stream diagnostic logs via ipc TailLogs on desktop

Wires the FFI path to radiance's ipc.Client.TailLogs and merges in-app
flutter.log records so the diagnostic logs view shows both sources.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* deps: bump radiance to refactor tip (9703bcf) (#8700)

Picks up:
- refactor(vpn): own VPN status on the client so restarts span tunnels
- vpn: instrument tunnel.start phases + VPNClient.Restart (#443)

The VPN-status-ownership refactor moves setStatus calls out of
tunnel and onto VPNClient so a restart transitions Restarting →
Disconnecting → Disconnected → Connecting → Connected cleanly.

The instrumentation PR adds child spans around libbox.Setup,
libbox.NewServiceWithContext, libbox.BoxService.Start, and
newMutableGroupManager so SigNoz can attribute the 10s+ tail
on /service/start observed in Freshdesk #173696.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix server auto issue

* More fixes to server selection.

* server selection changes for iOS/macOS

* Use select server if VPN is active.

* bump radiance - pull in empty tag fix

* lantern-core: dispatch ConnectVPN/StartVPN to SelectServer on live tunnel (#8702)

* lantern-core: dispatch ConnectVPN to SelectServer on live tunnel

When the Flutter UI triggers an auto-select on a live tunnel — most
visibly Jigar's rewrite of onSmartLocation (server_selection.dart), which
routes "switch back to Smart" through startVPN(force: true) → Dart
lantern.startVPN() → ffi.go:startVPN → c.ConnectVPN("") — radiance's
/vpn/connect endpoint rejects the request with ErrTunnelAlreadyConnected
(radiance/vpn/vpn.go:126 in VPNClient.Connect). The error is returned to
the Dart UI as a snackbar, the tunnel stays pinned to the previously
selected manual server, and lantern.log is silent because neither
LocalBackend.ConnectVPN nor VPNClient.Connect slog the ErrTunnelAlready
Connected path.

Observed on 9.0.30 beta (internal tester, Freshdesk #173763, build from
commit 405468954 which includes Jigar's 289507280). After manually
picking Bogotá, clicking "Smart" at the top of the server-selection
screen surfaces the snackbar and the tunnel keeps routing traffic
through the Bogotá samizdat outbound.

Fix: when Status() == Connected, LanternCore.ConnectVPN dispatches the
request to /server/selected (the live-tunnel outbound swap) instead of
/vpn/connect. Empty tag normalizes to vpn.AutoSelectTag — Dart sends ""
for Smart, radiance recognizes only the literal "auto" and otherwise
falls into the manual-outbound branch of SelectServer, stranding Clash
in manual mode with an empty selector. The mapping is centralized in a
small normalizeAutoTag helper used by both ConnectVPN and SelectServer.

This puts the same dispatch logic that lives in ffi.go:connectToServer
onto every caller of LanternCore.ConnectVPN — including ffi.go:startVPN
(which Jigar's rewrite now funnels through) and any future FFI/mobile
entry point.
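
A minimal sketch of the dispatch described above, assuming a simplified client interface. `vpnClient`, `fakeClient`, and `connectVPN` are hypothetical stand-ins for illustration, not radiance's or lantern-core's real API; only the "empty tag means auto" mapping and the connected-versus-not branch come from this commit message.

```go
package main

import (
	"errors"
	"fmt"
)

// autoSelectTag stands in for vpn.AutoSelectTag; the commit message says
// radiance recognizes the literal "auto".
const autoSelectTag = "auto"

type status int

const (
	disconnected status = iota
	connected
)

// vpnClient is a hypothetical interface used only for this sketch.
type vpnClient interface {
	Status() status
	Connect(tag string) error      // stands in for /vpn/connect
	SelectServer(tag string) error // stands in for /server/selected
}

var errTunnelAlreadyConnected = errors.New("tunnel already connected")

// normalizeAutoTag maps Dart's "" convention onto the auto-select tag.
func normalizeAutoTag(tag string) string {
	if tag == "" {
		return autoSelectTag
	}
	return tag
}

// connectVPN swaps the outbound on a live tunnel and starts a fresh tunnel
// otherwise, so callers never hit errTunnelAlreadyConnected.
func connectVPN(c vpnClient, tag string) error {
	tag = normalizeAutoTag(tag)
	if c.Status() == connected {
		return c.SelectServer(tag)
	}
	return c.Connect(tag)
}

// fakeClient records which endpoint each call would have reached.
type fakeClient struct {
	st    status
	calls []string
}

func (f *fakeClient) Status() status { return f.st }

func (f *fakeClient) Connect(tag string) error {
	if f.st == connected {
		return errTunnelAlreadyConnected
	}
	f.calls = append(f.calls, "connect:"+tag)
	f.st = connected
	return nil
}

func (f *fakeClient) SelectServer(tag string) error {
	f.calls = append(f.calls, "select:"+tag)
	return nil
}

func main() {
	c := &fakeClient{}
	_ = connectVPN(c, "bogota") // fresh tunnel
	_ = connectVPN(c, "")       // "Smart" on a live tunnel
	fmt.Println(c.calls)        // [connect:bogota select:auto]
}
```

The second call is the Smart-from-connected case: without the status check it would reach `Connect` and return the already-connected error.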

getlantern/engineering#3291 issue 3. Supersedes earlier work on
fisk/connect-dispatch-select-when-connected (485bf5a00), which was
scoped to this same dispatch but predated the current refactor branch.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* vpn_tunnel: dispatch StartVPN to SelectServer on live tunnel (mobile path)

Mobile.StartVPN (the gomobile entry point for Android MainActivity and
iOS VPNManager) routes through vpn_tunnel.StartVPN(client), which calls
client.ConnectVPN(ctx, vpn.AutoSelectTag) directly — bypassing
lanterncore.Core. Jigar's onSmartLocation rewrite dispatches "switch
back to Smart" through startVPN(force: true), which on Android/iOS
lands here. Same ErrTunnelAlreadyConnected bug as the FFI path fixed in
the previous commit.

Mirror the VPNStatus dispatch pattern garmr already added to
vpn_tunnel.ConnectToServer in 405468954: when Status() == Connected,
swap outbound via /server/selected; otherwise fall through to the
existing /vpn/connect start.

Together with the LanternCore.ConnectVPN dispatch, this closes the
Smart-from-connected bug on every platform (Windows FFI, Android/iOS
gomobile).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* ffi: drop now-redundant VPNStatus dispatch in connectToServer

LanternCore.ConnectVPN already routes to /server/selected when the
tunnel is live (added earlier in this PR), so ffi.go:connectToServer's
own VPNStatus check is duplicate work. Collapse to a single c.ConnectVPN
call — both the live-tunnel-swap and fresh-connect paths flow through
the dispatch one layer down.

Behavior unchanged. The "start service failed" error wrapper is kept
for Dart-side snackbar stability.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* lantern-core: collapse dispatch to a single implementation in vpn_tunnel

Three functions had independent VPNStatus → SelectServer-vs-ConnectVPN
dispatches after the earlier commits: LanternCore.ConnectVPN,
vpn_tunnel.StartVPN (both added in this PR), and vpn_tunnel.ConnectToServer
(pre-existing from 405468954). Consolidate so vpn_tunnel.ConnectToServer
is the authoritative dispatch and the other two delegate.

- LanternCore.ConnectVPN → vpn_tunnel.ConnectToServer(lc.client, tag)
- vpn_tunnel.StartVPN → ConnectToServer(client, vpn.AutoSelectTag)

LanternCore.SelectServer keeps its own empty-tag normalization since its
scope is the one-shot SelectServer IPC, not the dispatch.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Adam Fisk <afisk@mini.local>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* lantern-core: drop client-side empty-tag normalization (radiance fac9089) (#8703)

Patrick's radiance fac9089 ("fix(vpn): treat the empty string as
AutoSelect in SelectServer") is now pinned on this branch via
72a6c6282. Radiance normalizes tag == "" → AutoSelectTag on both
ConnectVPN and SelectServer, so the client-side normalizations we
added earlier (normalizeAutoTag helper in core.go, `if tag == ""` in
vpn_tunnel.ConnectToServer) are redundant — radiance handles the Dart
"" convention uniformly.

Remove:
- LanternCore.normalizeAutoTag helper + its use in SelectServer
- `if tag == "" { tag = vpn.AutoSelectTag }` branch in
  vpn_tunnel.ConnectToServer
- lantern-core/core_test.go (only tested the removed helper)

Behavior unchanged end-to-end: empty tag still means auto-select on
every path (FFI, gomobile, connectToServer, startVPN).

Co-authored-by: Adam Fisk <afisk@mini.local>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* bump radiance to refactor tip (d5a1872) — pull in LocalBackend.SelectServer empty-tag fix (#8705)

radiance@d5a1872 completes fac9089's empty-string → AutoSelectTag
normalization by extending it to LocalBackend.SelectServer, which
previously only matched the literal "auto" and fell through to the
srvManager lookup for tag == "" — producing "no server found with tag"
(HTTP 500, snackbar) on Smart-from-connected flows after the client-
side normalization was removed in this branch's 6de3c9aa9.

Reported on Lantern 9.0.30 beta via Freshdesk #173773.

go.mod + go.sum bump only; no lantern code changes. Pinned commit:
getlantern/radiance@d5a18726afbc (#444).

Co-authored-by: Adam Fisk <afisk@mini.local>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Garmr/refactor mobile logstream (#8701)

* feat(logs): stream diagnostic logs via ipc TailLogs on mobile

Adds a mobile gomobile binding for ipc.Client.TailLogs (TailLogs +
LogSubscription) and switches Android and iOS to consume it, replacing
the per-platform log-file tailers.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(logs): stream diagnostic logs via ipc TailLogs on macos

Switches the macOS log stream to MobileTailLogs, matching iOS. Removes
the file-watching LogTailer (no remaining callers).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(logs): harden TailLogs against nil, panics, and listener leaks

- Reject nil listener in mobile.TailLogs; recover from panics crossing
  the gomobile bridge so the stream survives unexpected bridge errors.
- Retain the Kotlin LogListener in a field so the Go side's reference
  stays strongly rooted on the JVM.
- On iOS/macOS, cancel any pre-existing subscription before starting a
  new one and clear the stored listener when MobileTailLogs errors.
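
The Go-side part of this hardening (nil check plus panic recovery) might look roughly like the sketch below. `LogListener`, `tailLogs`, and `deliver` are hypothetical names for illustration; the real binding wraps ipc.Client.TailLogs and streams records rather than taking a slice.

```go
package main

import (
	"errors"
	"fmt"
)

// LogListener is a hypothetical stand-in for the gomobile-exported listener.
type LogListener interface {
	OnLog(line string)
}

var errNilListener = errors.New("nil log listener")

// deliver invokes the listener and turns a panic into an error so that one
// bad callback does not tear down the whole stream.
func deliver(l LogListener, line string) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("log listener panicked: %v", r)
		}
	}()
	l.OnLog(line)
	return nil
}

// tailLogs rejects a nil listener up front, then forwards each line and
// reports how many were delivered without a panic.
func tailLogs(lines []string, l LogListener) (int, error) {
	if l == nil {
		return 0, errNilListener
	}
	delivered := 0
	for _, line := range lines {
		if err := deliver(l, line); err != nil {
			continue // drop this line, keep the stream alive
		}
		delivered++
	}
	return delivered, nil
}

// flakyListener panics on one specific line to simulate a bridge error.
type flakyListener struct{ got []string }

func (f *flakyListener) OnLog(line string) {
	if line == "boom" {
		panic("bridge error")
	}
	f.got = append(f.got, line)
}

func main() {
	l := &flakyListener{}
	n, err := tailLogs([]string{"a", "boom", "b"}, l)
	fmt.Println(n, err, l.got) // 2 <nil> [a b]
}
```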

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* refactor(logs): share TailLogs plumbing across mobile and ffi

Adds lantern-core/logs.Subscribe wrapping ipc.Client.TailLogs so the
mobile and desktop integrations go through one helper. Drops the iOS
LogTailer dead code and the unused lantern-core/logging package.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Update log formatting

* Fix issue with ios

* Fix macos logs issue

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: Jigar-f <jigar@getlantern.org>

* ffi + lantern-core: drop non-Linux preflight; bound IPC calls with per-operation timeouts (#8707)

* ffi: skip the daemon-reachability preflight on Windows / macOS / mobile

The 300 ms preflight in lantern-core/core.go's CheckDaemonReachable
was originally tuned for the Linux flow (PR #8494 by atavism, commit
bf054f4ea), where the failure path falls back to `systemctl is-active
lanternd.service` for a rich diagnostic error. The 300 ms cap made
sense as "fast probe → systemd-rich-error", with the systemd query
adding the actual user-facing context.

Subsequent refactors (commit bd89bea7e Apr 7, then PR #8578 commit
4d4e06d9d Apr 16) generalized that preflight to all platforms but
the systemd fallback only survived in ffi_linux.go. On Windows /
macOS / mobile, ffi_nonlinux.go ended up running the same 300 ms
probe with no fallback — just an artificial guillotine in front of
ConnectVPN, which has its own "lanternd not reachable" error path
with equivalent precision.

Cold-start IPC on Windows regularly exceeds 300 ms (named-pipe dial
+ winio impersonation token dance + H2c connection preface +
goroutine scheduling on a 96-second-idle daemon), so the first VPN
toggle after launch reliably trips the timeout and shows the user a
"lanternd not reachable" error. Clicking again 10 seconds later
silently succeeds. Reproduced on the same Windows machine across
9.0.29 (Freshdesk #173696) and 9.0.30 (#173932).

Make the preflight a no-op on non-Linux. Linux keeps the original
fast-probe-then-systemdDiag flow unchanged. If we add Windows
(`sc query LanternSvc`) or macOS (`launchctl list`) diagnostics
later, restore the preflight and call them from here.
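
As a rough illustration of the resulting behavior, the sketch below keeps the 300 ms probe on Linux and returns immediately elsewhere. It takes the OS name as a parameter so both branches can be exercised on any machine; the real code splits this across ffi_linux.go and ffi_nonlinux.go, and `probeFunc` and `slowProbe` are hypothetical.

```go
package main

import (
	"context"
	"fmt"
	"runtime"
	"time"
)

// preflightTimeout is the 300 ms cap quoted in the commit message.
const preflightTimeout = 300 * time.Millisecond

// probeFunc is a hypothetical stand-in for the real IPC reachability call.
type probeFunc func(ctx context.Context) error

// checkDaemonReachable runs the fast probe only on Linux, where a failure
// can be followed by a systemd diagnostic. On other platforms it is a no-op
// and the connect path reports unreachable daemons itself.
func checkDaemonReachable(goos string, probe probeFunc) error {
	if goos != "linux" {
		return nil
	}
	ctx, cancel := context.WithTimeout(context.Background(), preflightTimeout)
	defer cancel()
	if err := probe(ctx); err != nil {
		return fmt.Errorf("lanternd not reachable: %w", err)
	}
	return nil
}

// slowProbe simulates a cold-start IPC round trip that exceeds the cap.
func slowProbe(ctx context.Context) error {
	select {
	case <-time.After(2 * time.Second):
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

func main() {
	fmt.Println(runtime.GOOS, checkDaemonReachable(runtime.GOOS, slowProbe))
}
```

With a probe that takes longer than the cap, the `"linux"` branch returns an error while every other value of `goos` returns nil, which is the first-toggle failure this commit removes on Windows.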

See getlantern/engineering#3382 for the full archaeology + design
discussion.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* ffi + lantern-core: bound IPC calls with per-operation timeouts

Companion to dropping the non-Linux daemon-reachability preflight in this
same PR. The preflight (ffi_nonlinux.go's `checkDaemonReachable`) was
introduced in commit bd89bea7e along with the *removal* of per-call
timeouts that used to live on the FFI layer:

    -    ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
    -    if err := c.Client().DisconnectVPN(ctx); err != nil { ... }
    +    if err := c.DisconnectVPN(); err != nil { ... }

After that change, the only IPC call with any deadline at all was the
300 ms preflight. Every other operation flowed lc.ctx (
context.WithCancel(context.Background())) straight through, meaning a
hung lanternd would freeze the UI indefinitely. Dropping the preflight
without restoring per-call timeouts removes the only line of defense.

Restore them at the LanternCore layer where they belong, with values
sized for the inherent work each operation does (state changes can run
into multi-second territory; status queries should be near-instant):

    ipcConnectTimeout     = 60 * time.Second   // ConnectVPN
    ipcStateChangeTimeout = 30 * time.Second   // SelectServer, DisconnectVPN
    ipcStatusTimeout      = 10 * time.Second   // VPNStatus, IsVPNRunning

These bound the worst case (hung daemon → user sees a clear error within
a minute, no indefinite spinner) without firing during normal slow paths.
The dialer's 10 s connect timeout (radiance/ipc/conn_windows.go) already
covers the lanternd-crashed case; these guard the lanternd-hung case.
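
One way to picture the per-operation bounds is a small wrapper that derives a deadline-bound child context for each call. This is a sketch under assumptions: `callWithTimeout`, `hungDaemon`, and `demoHungCall` are hypothetical helpers, and only the three timeout values are taken from this commit message.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// Timeout values quoted from the commit message.
const (
	ipcConnectTimeout     = 60 * time.Second // ConnectVPN
	ipcStateChangeTimeout = 30 * time.Second // SelectServer, DisconnectVPN
	ipcStatusTimeout      = 10 * time.Second // VPNStatus, IsVPNRunning
)

// callWithTimeout runs op with a child context that expires after d, so a
// hung daemon yields an error instead of an indefinite wait.
func callWithTimeout(parent context.Context, d time.Duration, op func(context.Context) error) error {
	ctx, cancel := context.WithTimeout(parent, d)
	defer cancel()
	return op(ctx)
}

// hungDaemon simulates an IPC call that only returns once its context ends.
func hungDaemon(ctx context.Context) error {
	<-ctx.Done()
	return ctx.Err()
}

// demoHungCall uses a short timeout so the example finishes quickly.
func demoHungCall() error {
	return callWithTimeout(context.Background(), 50*time.Millisecond, hungDaemon)
}

// isDeadline reports whether err is (or wraps) a deadline expiry.
func isDeadline(err error) bool {
	return errors.Is(err, context.DeadlineExceeded)
}

func main() {
	fmt.Println(isDeadline(demoHungCall())) // true
}
```

Passing the parent context in, rather than building context.Background() inside the helper, keeps the caller in charge of cancellation.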

vpn_tunnel.{StartVPN, StopVPN, ConnectToServer} take the ctx through
their signatures instead of building their own context.Background()
internally, so callers stay in charge of their own deadlines. mobile/
mobile.go updated to set 60 s / 30 s / 60 s contexts on its three
gomobile entry points.

CheckDaemonReachable's 300 ms timeout is kept untouched — Linux still
calls it from ffi_linux.go for the systemctl is-active fallback that's
the whole point of the fast probe.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Adam Fisk <afisk@mini.local>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* bump radiance

* lantern-core: fix empty Windows split-tunnel apps list + UI-process logging (#8709)

Two narrow fixes that together resolve Freshdesk #173774 / #173778 /
#173826 (Derek's "Failed to fetch installed apps" empty list on Windows
split tunneling). Split out from #8706 so they can land independently
of the broader app-discovery rework that PR also contained.

1. **GetEnabledApps returns []string{} instead of nil.**
   When no apps are split-tunneled, the previous code returned nil,
   which json.Marshal serialized as "null". Dart's jsonDecode("null")
   returns null; the receiving code does `as List`, which throws and
   the UI shows "Failed to fetch installed apps". Initializing as an
   empty slice serializes to "[]" — Dart parses that as an empty list,
   no exception, no error UI. THIS is the actual root cause of the
   empty-list reports we've been chasing; the apps-discovery scanner
   work was investigating a different (also-real but secondary) issue.

2. **UI-process slog wired up via common.Init.**
   On the refactor branch, the UI process never called common.Init.
   slog wrote to stderr (= nowhere on a GUI host), settings were
   uninitialized, no lantern.log was produced outside the daemon.
   Patrick caught this — it was a one-line miss in the refactor.

   Platform-aware so we don't double-init on platforms where the
   backend embeds in-process:
     - windows/linux: full common.Init (separate UI + daemon procs)
     - darwin/ios:    setupAppLogging into a distinct lantern-app.log
                      so the main-app slog doesn't race the tunnel
                      extension's lantern.log on lumberjack rotation
     - android:       Mobile.SetupRadiance already ran common.Init
                      upstream — fall through

3. **Auto-attach UI-process *.log to ReportIssue (windows/linux only).**
   Without it the daemon's archive glob only sees the daemon's logDir;
   UI-side lantern.log + flutter.log never reach the issue bundle. The
   daemon runs as SYSTEM on Windows; we keep UI logDir at
   %PUBLIC%\Lantern\logs so SYSTEM can read it.
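
The nil-versus-empty-slice behavior behind item 1 can be reproduced with the standard library alone. The function names below are hypothetical and the selection logic is simplified; only the `encoding/json` behavior is the point.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// enabledAppsNil mirrors the old behavior: the slice stays nil when nothing
// is selected.
func enabledAppsNil(selected map[string]bool) []string {
	var apps []string
	for name, on := range selected {
		if on {
			apps = append(apps, name)
		}
	}
	return apps
}

// enabledAppsEmpty mirrors the fix: start from an empty, non-nil slice.
func enabledAppsEmpty(selected map[string]bool) []string {
	apps := []string{}
	for name, on := range selected {
		if on {
			apps = append(apps, name)
		}
	}
	return apps
}

// mustJSON marshals v and panics on error, which cannot occur for []string.
func mustJSON(v interface{}) string {
	b, err := json.Marshal(v)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	none := map[string]bool{}
	fmt.Println(mustJSON(enabledAppsNil(none)))   // null
	fmt.Println(mustJSON(enabledAppsEmpty(none))) // []
}
```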

The broader Windows app-discovery work from #8706 (App Paths scan, Run
keys, Squirrel pattern, isAppPathsNoise heuristic filters) is being
held in a separate PR for independent review.

Co-authored-by: Adam Fisk <afisk@mini.local>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* qa: plumb RADIANCE_OUTBOUND_SOCKS_ADDRESS into the Android client

Adds a debug-only path that lets a developer route every outbound
network call radiance makes from the Android client through an
upstream SOCKS5 (typically the local pinger bridge), so the bandit
treats the client as a real Russia-residential user end-to-end.

Pairs with:
  * radiance:    https://github.com/getlantern/radiance/pull/445
  * lantern-cloud: https://github.com/getlantern/lantern-cloud/pull/2649

  * `lantern-core/mobile`: new gomobile-exported `SetQAEnvOverrides(socks, tz)`
    that does `os.Setenv` for `RADIANCE_OUTBOUND_SOCKS_ADDRESS` and `TZ`.
    Must be called before `SetupRadiance`/`StartIPCServer` to take effect.
  * `android/.../LanternApp.kt`: override `onCreate` and call the new setter
    with values from Android system properties:
      `debug.lantern.outbound_socks` -> `RADIANCE_OUTBOUND_SOCKS_ADDRESS`
      `debug.lantern.tz`             -> `TZ`
    Set with `adb shell setprop debug.lantern.outbound_socks 10.0.2.2:1080`.
    No-op when the props are unset, so production builds aren't affected
    unless someone deliberately sets them on the device.
  * `go.mod`: bump radiance to the qa/outbound-socks-egress branch tip
    (will swap back to a pinned tag once that PR lands).
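
A minimal sketch of what such a setter could look like, assuming empty arguments mean "leave the variable alone". The body and the `envOrUnset` helper are hypothetical; only the function's purpose and the two environment keys come from the description above.

```go
package main

import (
	"fmt"
	"os"
)

const outboundSocksEnv = "RADIANCE_OUTBOUND_SOCKS_ADDRESS"

// setQAEnvOverrides exports the QA overrides into the process environment.
// Skipping empty values is an assumption of this sketch. It must run before
// anything reads these variables.
func setQAEnvOverrides(socks, tz string) error {
	if socks != "" {
		if err := os.Setenv(outboundSocksEnv, socks); err != nil {
			return fmt.Errorf("set %s: %w", outboundSocksEnv, err)
		}
	}
	if tz != "" {
		if err := os.Setenv("TZ", tz); err != nil {
			return fmt.Errorf("set TZ: %w", err)
		}
	}
	return nil
}

// envOrUnset returns the variable's value, or "<unset>" if it is not set.
func envOrUnset(key string) string {
	v, ok := os.LookupEnv(key)
	if !ok {
		return "<unset>"
	}
	return v
}

func main() {
	if err := setQAEnvOverrides("10.0.2.2:1080", ""); err != nil {
		fmt.Println("override failed:", err)
		return
	}
	fmt.Println(envOrUnset(outboundSocksEnv))
}
```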

Verified end-to-end in a `lantern_test` AVD with packetstream + Russia
upstream:
  - LanternApp logs `QA env overrides applied: outbound_socks=10.0.2.2:1080`
  - Radiance's `/v1/config-new` response: `country=RU ip=85.172.81.50`
  - Bandit serves Russia-tier outbounds (samizdat / reflex in DE/SE/SG/etc.)
  - All sing-box outbound dials wrapped in `_dev_outbound_socks` detour
  - Browsing in the emulator's Chrome egresses from a Lantern entry server
    (e.g. Stockholm/Singapore — bandit-assigned, not the Mac's home IP)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* code review updates

---------

Co-authored-by: garmr <pdixon117@gmail.com>
Co-authored-by: Jigar-f <jigar@getlantern.org>
Co-authored-by: jigar-f <132374182+jigar-f@users.noreply.github.com>
Co-authored-by: atavism <atavism@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Ilya Yakelzon <reflog@gmail.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: reflog <109876+reflog@users.noreply.github.com>
Co-authored-by: Adam Fisk <afisk@mini.local>
Co-authored-by: Jay <110402935+jay-418@users.noreply.github.com>
Co-authored-by: atavism <paul@getlantern.org>
Co-authored-by: garmr-ulfr <104022054+garmr-ulfr@users.noreply.github.com>