Support session-scoped runtime hints and remote artifact retrieval in native remote daemon mode #197

@thymikee

Problem

agent-device remote daemon mode is close to a transparent remote UX, but there is still a gap for React Native dev/simulator/emulator flows when the CLI runs in one environment and the daemon runs in another.

Current setup:

  • CLI runs in a Linux sandbox
  • daemon runs on a remote macOS host
  • local mobile artifacts can be uploaded remotely
  • normal remote commands, such as agent-device devices, work
  • install/open plumbing can work remotely

However, runtime connectivity information (for example, the Metro host and port, or other platform-specific debug runtime settings) is not carried through native remote daemon mode, so later UI commands do not just work.

This means the following transparent workflow is still incomplete:

agent-device install MyApp ./artifact --platform ios --session my-session
agent-device open MyApp --platform ios --session my-session
agent-device snapshot --platform ios --session my-session
agent-device click @e7 --platform ios --session my-session

and similarly for Android:

agent-device install com.example ./app-debug.apk --platform android --session my-session
agent-device open com.example --platform android --session my-session
agent-device snapshot --platform android --session my-session
agent-device click @e7 --platform android --session my-session

In remote daemon mode, the CLI needs a way to associate runtime/network information with the session so later UI commands work without custom proxy-specific helpers.

Requested capability

Add native support for session-scoped runtime hints in remote daemon mode for both iOS and Android.

Examples of the kind of information needed:

  • Metro host
  • Metro port
  • optionally resolved bundle URL / launch URL if applicable
  • any Android-specific debug-runtime connectivity settings required for dev builds

Possible API shapes

Option A: session runtime command

agent-device runtime set \
  --session my-session \
  --platform ios \
  --metro-host 10.0.0.10 \
  --metro-port 8081

Then subsequent commands automatically use those runtime hints:

agent-device open MyApp --platform ios --session my-session
agent-device snapshot --platform ios --session my-session
agent-device click @e7 --platform ios --session my-session

A similar flow should exist for Android if platform-specific runtime data is needed.
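To make Option A more concrete, here is a minimal sketch of how a daemon might keep runtime hints per session and platform. It is not the agent-device implementation: `RuntimeHints`, `SessionRuntimeStore`, and their field names are hypothetical, and the in-memory dictionary only stands in for whatever session state the daemon already holds.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class RuntimeHints:
    """Connectivity hints for one session/platform pair (hypothetical shape)."""
    metro_host: Optional[str] = None
    metro_port: Optional[int] = None
    bundle_url: Optional[str] = None  # optional resolved bundle/launch URL


class SessionRuntimeStore:
    """In-memory store a daemon could keep, keyed by (session, platform)."""

    def __init__(self) -> None:
        self._hints: Dict[Tuple[str, str], RuntimeHints] = {}

    def set(self, session: str, platform: str, hints: RuntimeHints) -> None:
        self._hints[(session, platform)] = hints

    def get(self, session: str, platform: str) -> Optional[RuntimeHints]:
        return self._hints.get((session, platform))

    def clear(self, session: str) -> None:
        # Drop hints for every platform when the session ends.
        for key in [k for k in self._hints if k[0] == session]:
            del self._hints[key]


store = SessionRuntimeStore()
# Roughly what `runtime set --session my-session --platform ios ...` would record.
store.set("my-session", "ios", RuntimeHints(metro_host="10.0.0.10", metro_port=8081))
# A later open/snapshot/click for the same session would look the hints up.
hints = store.get("my-session", "ios")
```

Keying by both session and platform lets one session carry different hints for iOS and Android, which matches the request that the flow work on both.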

Option B: environment-based runtime hints

AGENT_DEVICE_IOS_METRO_HOST=10.0.0.10
AGENT_DEVICE_IOS_METRO_PORT=8081
agent-device open MyApp --platform ios --session my-session
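A rough sketch of how the variables above could be parsed and validated is shown here. Only the two `AGENT_DEVICE_IOS_METRO_*` names come from this issue; the idea that Android would use an analogous `AGENT_DEVICE_ANDROID_METRO_*` prefix, the function name `hints_from_env`, and the returned keys are all assumptions.

```python
from typing import Dict, Mapping, Union

HintValue = Union[str, int]


def hints_from_env(platform: str, env: Mapping[str, str]) -> Dict[str, HintValue]:
    """Read Metro hints for one platform from an environment mapping."""
    # Assumed naming scheme: AGENT_DEVICE_<PLATFORM>_METRO_HOST / _PORT.
    prefix = f"AGENT_DEVICE_{platform.upper()}_METRO_"
    hints: Dict[str, HintValue] = {}

    host = env.get(prefix + "HOST", "").strip()
    if host:
        hints["metro_host"] = host

    raw_port = env.get(prefix + "PORT", "").strip()
    if raw_port:
        port = int(raw_port)  # raises ValueError on non-numeric input
        if not 1 <= port <= 65535:
            raise ValueError(f"{prefix}PORT out of range: {port}")
        hints["metro_port"] = port

    return hints


example_env = {
    "AGENT_DEVICE_IOS_METRO_HOST": "10.0.0.10",
    "AGENT_DEVICE_IOS_METRO_PORT": "8081",
}
ios_hints = hints_from_env("ios", example_env)
android_hints = hints_from_env("android", example_env)  # nothing set for Android
```

In real use the mapping would be `os.environ`; taking it as a parameter keeps the parsing easy to test.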

Option C: flags on each command

agent-device open MyApp \
  --platform ios \
  --session my-session \
  --metro-host 10.0.0.10 \
  --metro-port 8081

This is less ergonomic than session-scoped state, but still better than requiring a separate proxy-specific execution API.
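If more than one of these options were supported at once, the CLI or daemon would need a rule for which source wins. The sketch below assumes one plausible order (per-command flags over session state over environment variables); that order, the function name `resolve_hints`, and the hint keys are illustrative and not something this issue specifies.

```python
from typing import Dict, Mapping, Optional, Union

HintValue = Union[str, int]
Hints = Mapping[str, Optional[HintValue]]


def resolve_hints(flags: Hints, session: Hints, env: Hints) -> Dict[str, HintValue]:
    """Merge hint layers; later layers win, and None never overrides a value."""
    resolved: Dict[str, HintValue] = {}
    for layer in (env, session, flags):  # lowest to highest precedence
        for key, value in layer.items():
            if value is not None:
                resolved[key] = value
    return resolved


resolved = resolve_hints(
    flags={"metro_port": 8082, "metro_host": None},  # only the port was passed
    session={"metro_host": "10.0.0.10", "metro_port": 8081},
    env={"metro_host": "127.0.0.1"},
)
```

Here the session supplies the host and the flag overrides only the port, so a one-off command can deviate without rewriting the stored session hints.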

Also required: remote artifact retrieval back to the caller

In the same remote setup, screenshots, recordings, and other generated artifacts must be returned in a way the remote caller can actually consume.

Today, a remote command can end up returning host-local paths like:

  • /Users/.../screenshot.png
  • ~/Library/...
  • host-local log paths

Those are not useful to a caller running in a sandbox or other remote environment.

Requested behavior

When the daemon is remote, artifact-producing commands such as:

  • screenshot
  • snapshot (if it emits files)
  • record
  • any future file-producing output

should return results that are retrievable by the client.

Possible shapes:

  • stream bytes back directly
  • upload artifacts through the daemon transport and return:
    • artifact ID
    • downloadable URL
    • or client-materialized local temp path
  • provide a follow-up download endpoint/command in native remote mode

The important part is that remote callers must not receive unusable host-local paths as the primary artifact output.
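One of the shapes above (an artifact ID plus a client-materialized local path) could look roughly like the following. This is a sketch under assumptions, not the daemon's actual protocol: `ArtifactRegistry`, `materialize`, and the descriptor fields are hypothetical, and the direct `registry.read` call stands in for whatever transport the daemon uses.

```python
import hashlib
import tempfile
import uuid
from pathlib import Path
from typing import Dict


class ArtifactRegistry:
    """Daemon-side map from opaque artifact IDs to host-local files."""

    def __init__(self) -> None:
        self._paths: Dict[str, Path] = {}

    def register(self, host_path: Path) -> Dict[str, object]:
        artifact_id = uuid.uuid4().hex
        self._paths[artifact_id] = host_path
        data = host_path.read_bytes()
        # The descriptor deliberately omits the host-local path.
        return {
            "artifact_id": artifact_id,
            "name": host_path.name,
            "size": len(data),
            "sha256": hashlib.sha256(data).hexdigest(),
        }

    def read(self, artifact_id: str) -> bytes:
        return self._paths[artifact_id].read_bytes()


def materialize(registry: ArtifactRegistry, descriptor: Dict[str, object], dest_dir: Path) -> Path:
    """Client side: fetch bytes by ID, verify them, write a local copy."""
    data = registry.read(str(descriptor["artifact_id"]))  # stands in for a network call
    if hashlib.sha256(data).hexdigest() != descriptor["sha256"]:
        raise ValueError("artifact checksum mismatch")
    local_path = dest_dir / str(descriptor["name"])
    local_path.write_bytes(data)
    return local_path


with tempfile.TemporaryDirectory() as host_dir, tempfile.TemporaryDirectory() as client_dir:
    shot = Path(host_dir) / "screenshot.png"
    shot.write_bytes(b"\x89PNG fake screenshot bytes")  # placeholder content
    registry = ArtifactRegistry()
    descriptor = registry.register(shot)
    local_copy = materialize(registry, descriptor, Path(client_dir))
    copied_bytes = local_copy.read_bytes()
```

The caller only ever sees the descriptor and a path inside its own filesystem, so a host path such as /Users/.../screenshot.png never becomes the primary output.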

Why this matters

This would complete the transparent remote UX for AI agent / sandbox workflows:

  • install app remotely
  • start Metro / provide runtime hints once
  • run iOS and Android UI commands normally through agent-device
  • receive screenshots/recordings back in the caller environment

without requiring:

  • proxy-specific exec helpers
  • custom transport wrappers
  • special-case code paths outside agent-device

Concrete use case

A QA workflow running in a sandbox:

  1. downloads an Android APK or iOS simulator .app artifact from CI
  2. installs it remotely via agent-device
  3. starts Metro separately if needed
  4. needs to launch the app and interact with it using normal agent-device commands
  5. needs screenshots/recordings to come back to the sandbox as usable artifacts

Today, steps 4 and 5 still require custom proxy-specific handling.

This issue is about making native agent-device remote daemon mode fully transparent for that setup on both iOS and Android.
