feat: optional reverse TCP tunnel for WDA in NAT-restricted environments #1128
dankefox wants to merge 4 commits into appium:master
#pragma mark - Reverse TCP Tunnel

- (void)startReverseTunnel
this functionality must be extracted to a separate module and covered by integration tests
Makes sense. I'll extract the reverse tunnel into its own module (e.g., FBReverseTunnel.h/m) so it's cleanly separated from FBWebServer. FBWebServer would just call [FBReverseTunnel startWithHost:port:] if the env vars are set.
Will also add integration test coverage for the module.
I still don't see any tests, either in the WDA Objective-C sources or in the Node.js sources.
I'm not sure why we should reinvent a custom protocol when the project already uses CocoaHTTPServer.
Thank you for the review @mykola-mokhnach! Regarding CocoaHTTPServer — it serves as an inbound HTTP server (listening on a port for incoming connections). The core problem this PR addresses is precisely that inbound connections to the iOS device are blocked in NAT-restricted environments. The reverse tunnel requires WDA to initiate an outbound TCP connection to an external relay. During development, I tested three approaches for the outbound connection:
The 4-byte length-prefixed framing is intentionally minimal: just enough to multiplex HTTP request/response pairs over a single persistent TCP connection. I considered using HTTP itself as the transport (e.g., long-polling or WebSocket), but a raw TCP connection with simple framing has significantly lower overhead and latency for this use case. That said, I'm absolutely open to alternative approaches if you have a preferred pattern in mind. Would it make sense to, for example, use CocoaHTTPServer as an HTTP client (reverse proxy) that forwards to a relay via HTTP? I'd be happy to explore that if it fits better with the project's architecture.
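To make the framing concrete, here is a minimal Node.js sketch of 4-byte big-endian length-prefixed framing. The helper names (`encodeFrame`, `decodeFrames`) are hypothetical and not taken from the PR; the sketch also assumes the length field counts only the payload bytes.

```javascript
// Hypothetical helpers for illustration; names are not taken from the PR.
const FRAME_HEADER_SIZE = 4; // 4-byte big-endian payload length

// Prefix a payload with its length so the receiver knows where it ends.
function encodeFrame(payload) {
  const header = Buffer.alloc(FRAME_HEADER_SIZE);
  header.writeUInt32BE(payload.length, 0);
  return Buffer.concat([header, payload]);
}

// Split a receive buffer into complete payloads plus the unconsumed remainder.
function decodeFrames(buffer) {
  const frames = [];
  let offset = 0;
  while (buffer.length - offset >= FRAME_HEADER_SIZE) {
    const length = buffer.readUInt32BE(offset);
    const start = offset + FRAME_HEADER_SIZE;
    if (buffer.length - start < length) break; // frame not fully received yet
    frames.push(buffer.subarray(start, start + length));
    offset = start + length;
  }
  return { frames, rest: buffer.subarray(offset) };
}
```

Because TCP delivers a byte stream rather than messages, the caller would keep `rest` and prepend it to the next chunk it receives before calling `decodeFrames` again.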
7426e66 to 7b77d6d
Updated the PR based on all review feedback:
Regarding the CLA: I'll get that signed. The commit email needs to be linked to my GitHub account. Ready for another round of review!
Add opt-in reverse TCP tunnel mode that allows WDA to actively connect outbound to an external relay server, enabling remote control in environments where inbound connections to the iOS device are not feasible (symmetric NAT, multi-layer firewalls, corporate VPNs, etc.).

Controlled via environment variables (disabled by default):
- WDA_RELAY_HOST: relay server address
- WDA_RELAY_PORT: relay server port (default 8201)

When not configured, WDA behavior is completely unchanged.

Changes based on review feedback:
- Extracted reverse tunnel into dedicated FBReverseTunnel module
- Split complex methods into focused, single-responsibility functions
- Extracted magic numbers into named constants
- FBWebServer only has a single-line call to FBReverseTunnel

Implementation notes:
- Uses Network.framework (nw_connection) for outbound TCP; tested and verified after NSStream and POSIX sockets both proved unreliable for outbound connections over VPN/tunnel interfaces on iOS
- 4-byte big-endian length-prefixed framing for minimal overhead
- Auto-reconnect with configurable delay on connection failure

Includes:
- FBConfiguration: relay host/port accessors from env vars
- FBReverseTunnel: standalone reverse tunnel module
- Scripts/wda-relay-server.js: reference relay server implementation (required counterpart; users need a relay to connect to)
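The opt-in behaviour described in this commit message (inactive unless WDA_RELAY_HOST is set, port defaulting to 8201) can be sketched as follows. This is an illustrative Node.js rendering, not the PR's Objective-C code; `readRelayConfig` is a hypothetical name, and rejecting an invalid port with an exception is an assumption about how such input might be handled.

```javascript
// Illustrative only: the PR implements this in FBConfiguration (Objective-C).
const DEFAULT_RELAY_PORT = 8201;

// Returns null when the tunnel is not configured, so the feature stays inactive.
function readRelayConfig(env) {
  const host = (env.WDA_RELAY_HOST || '').trim();
  if (host === '') return null;

  const rawPort = env.WDA_RELAY_PORT;
  if (rawPort === undefined || rawPort === '') {
    return { host, port: DEFAULT_RELAY_PORT };
  }

  const port = Number(rawPort);
  // Assumption: an out-of-range or non-numeric port is treated as an error.
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new RangeError(`Invalid WDA_RELAY_PORT: ${rawPort}`);
  }
  return { host, port };
}
```

A caller would typically pass `process.env` and start the tunnel only when the result is not null.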
7b77d6d to 24f73f1
So, this is for when a client wants to connect to WDA over ip:port (rather than through the local port forwarding that the Appium XCUITest driver generally uses), correct? For example, this PR +
Yes, exactly! The primary use case is:
So the flow is:
This is particularly useful for remote device farms, CI/CD pipelines with devices on cellular networks, or any environment where direct port access to the iOS device isn't possible.
let requestCounter = 0;

// --- Relay server: accepts reverse connection from WDA ---
const relayServer = net.createServer((socket) => {
Consider restructuring this example, so corresponding code pieces are organised into classes. I am not a big fan of global module variables (especially mutable ones) unless it could be proven they are required
Fair point. For the relay server (which lives in docs/ as a reference implementation), I've kept it straightforward since users will likely adapt it to their own infrastructure. The relay is intentionally minimal — the important complexity lives on the WDA side in FBReverseTunnel.
Happy to restructure if you feel strongly about it.
it's not about minimalism, but rather about designing it properly, so entities belonging to different domains are not mixed together. Also, properly designed architecture makes it easier to adapt/modify it for different purposes as it is easier to parse and understand.
I can observe you are anyway extensively using AI, so it must not be a complicated task to supply it with an appropriate prompt.
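As one possible reading of this suggestion, the mutable module-level state (such as `requestCounter`) could be moved into a small class. The sketch below is hypothetical: `PendingRequests` and its methods are illustrative names, and the actual relay script may be organised differently.

```javascript
// Hypothetical class; not the PR's actual relay implementation.
class PendingRequests {
  #nextId = 0;
  #waiting = new Map(); // request ID -> callback awaiting the response

  // Register a callback and return the ID to send along with the request.
  add(onResponse) {
    const id = this.#nextId;
    this.#nextId = (this.#nextId + 1) >>> 0; // wrap around within 32 bits
    this.#waiting.set(id, onResponse);
    return id;
  }

  // Deliver a response to its waiting callback; false if the ID is unknown.
  resolve(id, payload) {
    const onResponse = this.#waiting.get(id);
    if (!onResponse) return false;
    this.#waiting.delete(id);
    onResponse(payload);
    return true;
  }

  get size() {
    return this.#waiting.size;
  }
}
```

Each relay connection could then own its own `PendingRequests` instance, which keeps request bookkeeping separate from socket handling and avoids shared mutable globals.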
… port constant

- Move wda-relay-server.js from Scripts/ to docs/ as reference implementation
- Convert to ESM format (.mjs) with node: protocol imports
- Replace Buffer.slice() with Buffer.subarray() (non-deprecated)
- Extract DefaultRelayPort constant in FBConfiguration.m

Addresses review feedback from @KazuCocoa and @mykola-mokhnach
return result;
}

#pragma mark - HTTP Request Parsing
can we use existing HTTP vendor libs for this purpose?
The tunnel relays raw HTTP bytes — the payload from the relay is a complete HTTP request (method line + headers + body) that the local WDA server parses directly.
Using a POSIX socket lets us forward these raw bytes to 127.0.0.1:port without parsing/reconstructing the HTTP message. An HTTP client library (like NSURLSession) would require parsing the raw request into a structured NSURLRequest first, which adds complexity and potential for subtle differences in header handling.
The raw forwarding approach is simpler, produces byte-identical requests to the local server, and has been tested extensively in production.
I agree it does not make sense to replace everything, but for example the below stuff
const char *err = "HTTP/1.1 502 Bad Gateway\r\n\r\nLocal WDA unreachable";
[response appendBytes:err length:strlen(err)];
is confusing and suboptimal. I would rather supply a proper HTTP message there, with Content-Length, Content-Type, and similar headers.
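For illustration, a complete 502 message of the kind requested here might be assembled as in the Node.js sketch below. The PR's code is Objective-C, so this only shows the shape of the bytes on the wire; `buildBadGatewayResponse` is a hypothetical name, and the plain-text content type is an assumption.

```javascript
// Hypothetical helper showing a well-formed 502 response.
function buildBadGatewayResponse(message) {
  const body = Buffer.from(message, 'utf8');
  const head = [
    'HTTP/1.1 502 Bad Gateway',
    'Content-Type: text/plain; charset=utf-8', // assumption: plain-text body
    `Content-Length: ${body.length}`,          // byte length, not character count
    '',
    '', // the two empty entries produce the blank line that ends the headers
  ].join('\r\n');
  return Buffer.concat([Buffer.from(head, 'ascii'), body]);
}
```

With an explicit Content-Length, a client reading from a persistent connection can tell where the error body ends instead of waiting for the connection to close.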
- Use 8-byte header protocol (4-byte length + 4-byte request ID) for reliable request-response correlation, matching tested implementation
- Use POSIX socket forwarding to localhost for raw HTTP relay
- Accept host/port as explicit parameters instead of reading FBConfiguration internally (addresses @mykola-mokhnach feedback on module coupling)
- Increase max payload size from 10 MB to 1 GB to match Appium HTTP server limit, supporting large screenshots and video captures
- Add SIGTERM/SIGHUP signal handling in FBWebServer to survive IDE disconnection (enables wireless-only operation)
- Add network path monitor to automatically restart HTTP server on network interface changes (WiFi ↔ cellular transitions)
- Update relay server to match 8-byte header protocol with reqId-based response routing

Map is the correct data structure for pendingRequests because responses are now routed by request ID rather than FIFO order.
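The 8-byte header and ID-based routing described above could look roughly like this in Node.js. This is a sketch under stated assumptions: both header fields are taken to be big-endian, the length is taken to count only the payload, and the function names are hypothetical rather than the PR's.

```javascript
// Assumed layout: [4-byte payload length][4-byte request ID][payload].
const MESSAGE_HEADER_SIZE = 8;

function encodeMessage(requestId, payload) {
  const header = Buffer.alloc(MESSAGE_HEADER_SIZE);
  header.writeUInt32BE(payload.length, 0);
  header.writeUInt32BE(requestId, 4);
  return Buffer.concat([header, payload]);
}

// Returns the first complete message and the remaining bytes, or null if
// the buffer does not yet hold a whole message.
function decodeMessage(buffer) {
  if (buffer.length < MESSAGE_HEADER_SIZE) return null;
  const length = buffer.readUInt32BE(0);
  const end = MESSAGE_HEADER_SIZE + length;
  if (buffer.length < end) return null;
  return {
    requestId: buffer.readUInt32BE(4),
    payload: buffer.subarray(MESSAGE_HEADER_SIZE, end),
    rest: buffer.subarray(end),
  };
}

// Responses can arrive in any order, so they are looked up by request ID
// in a Map instead of being taken from the front of a FIFO queue.
function routeResponse(pending, message) {
  const handler = pending.get(message.requestId);
  if (!handler) return false;
  pending.delete(message.requestId);
  handler(message.payload);
  return true;
}
```

A FIFO queue would hand a response to whichever request was sent first; keying by request ID removes that ordering assumption.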
- Add exponential backoff for reconnection (5s → 10s → ... → 60s cap), resets on successful connection (circuit breaker)
- Extract all magic numbers to named constants: FBReverseTunnelHeaderSize, FBReverseTunnelRecvBufferSize, FBReverseTunnelInitialReconnectDelay, FBReverseTunnelMaxReconnectDelay
- Add docs/reverse-tunnel.md with feature overview, architecture diagram, configuration guide, protocol specification, and resilience details
- Use POSIX socket for HTTP forwarding: the tunnel relays raw HTTP bytes directly to the local WDA server, avoiding unnecessary parsing/reconstruction that an HTTP client library would require
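The reconnection schedule in the first bullet can be sketched as a small helper. This is an illustrative Node.js version, not the PR's Objective-C implementation; `ReconnectBackoff` is a hypothetical name, and doubling the delay on each failure is an assumption consistent with the 5s → 10s → ... → 60s sequence.

```javascript
const INITIAL_RECONNECT_DELAY_S = 5;
const MAX_RECONNECT_DELAY_S = 60;

// Hypothetical helper tracking the delay before the next reconnect attempt.
class ReconnectBackoff {
  #delay = INITIAL_RECONNECT_DELAY_S;

  // Returns the delay to wait now, then doubles it up to the cap.
  next() {
    const current = this.#delay;
    this.#delay = Math.min(this.#delay * 2, MAX_RECONNECT_DELAY_S);
    return current;
  }

  // Call after a successful connection so later failures start at 5s again.
  reset() {
    this.#delay = INITIAL_RECONNECT_DELAY_S;
  }
}
```

Under these assumptions, repeated failures wait 5, 10, 20, 40, and then 60 seconds, staying at 60 until a connection succeeds.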
@@ -0,0 +1,106 @@
# Reverse TCP Tunnel for NAT-Restricted Environments
Could you add a security notice for the user? Something like: "You are responsible for securing the network yourself."
Also, NW_PARAMETERS_DISABLE_PROTOCOL is used in FBReverseTunnel.m, so the tunnel will only work with a plain TCP relay server (no TLS handshake). The Architecture section could probably address this behavior as well.
These might have issues in CI:
@param port The relay server port
@param localPort The local WDA HTTP server port to forward requests to
*/
+ (void)startWithHost:(NSString *)host
WDA also provides a shutdown endpoint. Consider stopping this server as well when the main web server is stopped.
/** Maximum reconnect delay (exponential backoff cap) */
static const uint64_t FBReverseTunnelMaxReconnectDelay = 60; // seconds

static NSString *_relayHost;
Consider making these class properties. Avoid using static, as it makes the entity less flexible.
- (void)startServing
{
// Ignore SIGTERM/SIGHUP to survive IDE disconnection (enables wireless operation)
this is a breaking change, please revert
[self initScreenshotsBroadcaster];

// Start reverse tunnel if configured
NSString *relayHost = FBConfiguration.relayHost;
Please extract this part into a separate private method
localPort:FBConfiguration.bindingPortRange.location];
}

// Network change monitor - restart HTTP server on interface changes
this is also a breaking change, please revert
Summary
Add an optional reverse TCP tunnel mode that allows WDA to actively connect outbound to an external relay server. This enables remote control of iOS devices in network environments where inbound connections to port 8100 are not feasible.
Problem
WDA defaults to listening on port 8100 for inbound HTTP connections. However, in many real-world environments, inbound connections to the iOS device are blocked or unreachable:
In these scenarios, the standard http://<device-ip>:8100 approach simply does not work.
Solution
Instead of requiring the client to connect in to WDA, this PR lets WDA connect out to a relay server. The relay bridges HTTP clients on one side and the WDA reverse connection on the other.
How it works
… WDA_RELAY_HOST and optionally WDA_RELAY_PORT (default: 8201) as environment variables when launching WDA
When not configured
Zero impact. If WDA_RELAY_HOST is not set, the feature is completely inactive. No code paths are touched, no connections are made, existing behavior is identical.
Changes
FBConfiguration.h/m: relayHost and relayPort accessors reading from WDA_RELAY_HOST / WDA_RELAY_PORT env vars
FBWebServer.m: … nw_connection), with auto-reconnect on failure
Scripts/wda-relay-server.js
Usage
Design decisions
… USE_PORT, USE_IP, and MJPEG_SERVER_PORT patterns in FBConfiguration
… nw_connection) for reliable connection management
… Scripts/ for easy adoption, not a required component
Testing
Tested on:
Verified:
/status,/session, tap, swipe, screenshot, and other WDA endpoints all work correctly through the reverse tunnel.