feat: add exec methods to run commands in a more accessible way #82

Merged: christianalfoni merged 2 commits into main from CSB-1420 on May 22, 2026

@christianalfoni (Collaborator) commented May 21, 2026

Run a sandbox command in one call and get back a guaranteed exit code + joined output

❌ Current behavior

To run a one-shot command and find out whether it succeeded, callers had to:

  1. Call execs.create({ command, args, autostart: true })
  2. Open the SSE stream via execs.streamOutput(id) (or poll getOutput)
  3. Write their own for await loop, partition stdout vs stderr if they cared
  4. Watch each event for exitCode and decide when "done" means done
  5. Decide what to do if the stream ended without an exit code (silent failure by default)

```typescript
// Before — every caller writes this boilerplate
const exec = await sandbox.execs.create({
  command: "sh",
  args: ["-c", "echo hello && exit 3"],
  autostart: true,
});
const stream = await sandbox.execs.streamOutput(exec.id);
const chunks: string[] = [];
let exitCode: number | undefined;
for await (const event of stream) {
  chunks.push(event.output);
  if (typeof event.exitCode === "number") exitCode = event.exitCode;
}
// And… now what if exitCode is still undefined? 🤷
```

✅ New behavior

A single sandbox.execs.exec(command, args, opts?) call creates the exec with autostart: true and interactive: false, consumes the SSE stream to completion, and returns { exitCode, output } — with exitCode statically typed as number because the SDK throws if the stream ends without one.

getOutput() got the same {exitCode, output} treatment with optional exitCode (since polling may catch a still-running process). streamOutput() remains the only way to get per-event metadata.

```typescript
// TypeScript
const result = await sandbox.execs.exec("sh", [
  "-c",
  "echo hello && echo oops >&2 && exit 3",
]);

console.log(result.exitCode); // 3 — guaranteed present
console.log(result.output);   // "hello\noops\n" — stdout + stderr interleaved
```

```python
# Python — symmetric API
result = await sandbox.execs.exec(
    "sh", ["-c", "echo hello && echo oops >&2 && exit 3"]
)

assert result["exit_code"] == 3
assert "hello" in result["output"]
assert "oops" in result["output"]
```

Three-way API surface

The exec-output methods now have distinct, well-differentiated semantics:

┌────────────────┬───────────────────────────────────┬─────────────────────────┬──────────────────────────┐
│ Method         │ Return                            │ Exit code               │ Output                   │
├────────────────┼───────────────────────────────────┼─────────────────────────┼──────────────────────────┤
│ exec()         │ { exitCode, output }              │ guaranteed (throws)     │ joined string            │
│ getOutput()    │ { exitCode?, output }             │ optional (poll)         │ joined string            │
│ streamOutput() │ AsyncGenerator<ExecStdout>        │ per-event field         │ per-event with metadata  │
└────────────────┴───────────────────────────────────┴─────────────────────────┴──────────────────────────┘

Each has a clear purpose:

  • exec() — "I want to run a command and know the result" (run-to-completion convenience)
  • getOutput() — "I want a snapshot of what's been logged so far" (one-shot poll)
  • streamOutput() — "I want events as they arrive, with types and sequence numbers" (live tailing)
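
To illustrate the "snapshot" role of getOutput(), here is a minimal sketch of a caller polling until an exit code appears. It does not use the real SDK: make_fake_poll is a hypothetical stand-in for get_output(), and the Snapshot shape (exit_code is None while the process is still running), the helper names, and the interval/max_polls parameters are all assumptions for illustration.

```python
import asyncio
from typing import Awaitable, Callable, List, Optional, TypedDict


class Snapshot(TypedDict):
    """Assumed shape of a get_output() result: exit_code is None while running."""
    exit_code: Optional[int]
    output: str


def make_fake_poll(snapshots: List[Snapshot]) -> Callable[[], Awaitable[Snapshot]]:
    """Hypothetical stand-in for polling get_output() on a running exec."""
    remaining = list(snapshots)

    async def poll() -> Snapshot:
        # Hand out snapshots in order, then keep repeating the last one.
        return remaining.pop(0) if len(remaining) > 1 else remaining[0]

    return poll


async def wait_for_exit(
    poll: Callable[[], Awaitable[Snapshot]],
    interval: float = 0.01,
    max_polls: int = 50,
) -> Snapshot:
    """Poll until a snapshot carries an exit code, or give up loudly."""
    for _ in range(max_polls):
        snapshot = await poll()
        if snapshot["exit_code"] is not None:
            return snapshot
        await asyncio.sleep(interval)
    raise TimeoutError("process still running after max_polls snapshots")


def run_poll_demo() -> Snapshot:
    poll = make_fake_poll([
        {"exit_code": None, "output": "hel"},
        {"exit_code": None, "output": "hello\n"},
        {"exit_code": 0, "output": "hello\ndone\n"},
    ])
    return asyncio.run(wait_for_exit(poll))
```

Callers who only want the final result would normally reach for exec() instead; a loop like this is only relevant when the exec was started separately and is being checked on later.
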

```mermaid
sequenceDiagram
    participant Caller
    participant Sandbox
    participant Agent

    Caller->>Sandbox: execs.exec(cmd, args, opts?)
    Sandbox->>Agent: createExec(autostart=true, interactive=false)
    Agent-->>Sandbox: exec.id
    Sandbox->>Agent: streamOutput(exec.id) [SSE]
    loop while process running
        Agent-->>Sandbox: ExecStdout {type, output, exitCode?}
        Note over Sandbox: accumulate chunks,<br/>capture exitCode if present
    end
    Agent-->>Sandbox: stream closes
    alt exitCode was seen
        Sandbox-->>Caller: { exitCode, output: joined string }
    else stream ended without exitCode
        Sandbox--xCaller: throws / raises RuntimeError
    end
```
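
The flow in the diagram can be sketched in a few lines of Python. This is not the SDK's implementation: fake_stream is a hypothetical stand-in for the agent's SSE stream, the event dicts (type, output, exit_code keys) are an assumed shape, and exec_from_stream is an illustrative name.

```python
import asyncio
from typing import Any, AsyncIterator, Dict, List, Optional


async def fake_stream(events: List[Dict[str, Any]]) -> AsyncIterator[Dict[str, Any]]:
    """Hypothetical stand-in for the agent's SSE stream of exec events."""
    for event in events:
        yield event


async def exec_from_stream(stream: AsyncIterator[Dict[str, Any]]) -> Dict[str, Any]:
    """Consume the stream to completion, join the output, require an exit code."""
    chunks: List[str] = []
    exit_code: Optional[int] = None
    async for event in stream:
        chunks.append(event.get("output", ""))
        # Assumption: the exit code only appears once the process has exited.
        if isinstance(event.get("exit_code"), int):
            exit_code = event["exit_code"]
    if exit_code is None:
        # The stream closed without reporting completion: fail loudly.
        raise RuntimeError("exec stream ended without an exit code")
    return {"exit_code": exit_code, "output": "".join(chunks)}


def run_exec_demo() -> Dict[str, Any]:
    events = [
        {"type": "stdout", "output": "hello\n"},
        {"type": "stderr", "output": "oops\n"},
        {"type": "stdout", "output": "", "exit_code": 3},
    ]
    return asyncio.run(exec_from_stream(fake_stream(events)))
```

Because the missing-exit-code case raises instead of returning None, the success path can promise an integer exit code to the caller.
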

🤔 Assumptions

  • The SSE stream emits exitCode on the terminating event when the process exits cleanly — confirmed against the generated ExecStdout type, which documents exit_code as "only present when process has exited".
  • The stream completes naturally once the process exits; no explicit cancel/close from exec() needed.
  • All optional create() parameters except autostart and interactive are useful through exec() — exposed via opts (TS: Omit<…, "command" | "args" | "autostart" | "interactive">; Python: explicit kwargs pty, cwd, env, user).
  • Most callers of exec() want the joined output as a string. Callers needing per-chunk metadata (type, sequence, timestamp) can drop down to streamOutput() — the only place that information now surfaces.
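
For callers who do drop down to streamOutput(), separating stdout from stderr might look like the sketch below. It works on plain event dicts rather than the real stream; the "stdout"/"stderr" type values and the partition_by_type helper are assumptions for illustration, not part of the SDK.

```python
from typing import Any, Dict, Iterable, List, Tuple


def partition_by_type(events: Iterable[Dict[str, Any]]) -> Tuple[str, str]:
    """Split raw exec events into (stdout_text, stderr_text).

    Assumes each event has a "type" of "stdout" or "stderr" and an "output" string.
    """
    stdout_chunks: List[str] = []
    stderr_chunks: List[str] = []
    for event in events:
        target = stderr_chunks if event.get("type") == "stderr" else stdout_chunks
        target.append(event.get("output", ""))
    return "".join(stdout_chunks), "".join(stderr_chunks)


# Events as they might be collected from a stream, in arrival order.
collected = [
    {"type": "stdout", "output": "hello\n"},
    {"type": "stderr", "output": "oops\n"},
    {"type": "stdout", "output": "bye\n"},
]
stdout_text, stderr_text = partition_by_type(collected)
```
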

🧠 Decisions

  • interactive is hardcoded to false and excluded from the public opts type. Interactive mode keeps stdin open and would deadlock the convenience method, which has no way to feed input.
  • pty stays available, even though it merges stderr into stdout at the OS level (the child's fds 0/1/2 are all attached to the same PTY slave, so the two streams can no longer be told apart). Documented as the escape hatch for tty-aware tools (colorized output, npm install spinners, programs that refuse to run without a TTY).
  • output: string, not ExecStdout[], in both exec() and getOutput(). This matches subprocess.run().stdout, Node's child_process.exec callback, and Go's cmd.Output(), the conventional shape for "run a command, get its output". Per-event detail is now exclusively a streamOutput() concern.
  • exec() and getOutput() are deliberately symmetric — same {exitCode/exit_code, output: string} shape, differing only in whether exitCode is guaranteed (exec() throws on missing; getOutput() returns undefined/None).
  • Throws / raises RuntimeError on missing exit code in exec() rather than silently returning exitCode: undefined. The convenience method's contract is "I waited for completion"; failing that contract should be loud. Callers needing to handle killed-process scenarios should use streamOutput() directly.
  • Python ExecOutputResult TypedDict exposed for getOutput()'s return type — gives callers static typing without forcing them through dict[str, Any]. exec() still uses dict[str, Any] because its exit_code: int (no None) doesn't quite warrant a separate TypedDict yet — symmetric typing is a small follow-up if you want it.
  • ExecStdout re-exported at package root in both languages so users can import { ExecStdout } from "@together-sandbox/sdk" / from together_sandbox import ExecStdout without reaching into generated-client namespaces.
  • Pre-existing doc fix-ups landed alongside:
    • execs.get_output() Python doc return type corrected from -> str to -> ExecOutputResult.
    • execs.start() TypeScript doc: Python-style return arrow -> Exec corrected to : Exec, and code fence language fixed from ```python to ```typescript.
    • execs.update() TypeScript doc removed — that method no longer exists; replaced in-place by the new execs.exec() section.
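
The symmetric typing mentioned as a follow-up could be sketched as below. The field layout of ExecOutputResult is an assumption (the PR names the TypedDict but does not show its definition), and ExecResult and require_exit_code are hypothetical names that do not exist in the SDK.

```python
from typing import Optional, TypedDict


class ExecOutputResult(TypedDict):
    """Assumed layout of the get_output() result: exit code may be missing."""
    exit_code: Optional[int]
    output: str


class ExecResult(TypedDict):
    """Hypothetical counterpart for exec(): exit code is always an int."""
    exit_code: int
    output: str


def require_exit_code(result: ExecOutputResult) -> ExecResult:
    """Narrow a polled result to the guaranteed shape, or fail loudly."""
    code = result["exit_code"]
    if code is None:
        raise RuntimeError("process has not reported an exit code")
    return {"exit_code": code, "output": result["output"]}
```

With two TypedDicts a type checker can distinguish "may still be running" from "definitely finished" without callers falling back to dict[str, Any].
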

🔄 Discussions

The return shape iterated several times during the session before settling on the final design — worth noting because the journey shaped the final decision:

  1. { exitCode, stdout: string[], stderr: string[] } — ergonomic but threw away per-chunk metadata
  2. list[ExecStdout] — mirror of getOutput, but the most useful field (exit code) got buried in list metadata
  3. { exitCode, output: ExecStdout[] } with throw guarantee — first-class exit code, but list output forced caller-side filtering/joining
  4. { exitCode, output: string }: the final design. exec() joins chunks itself since that's what most callers want. getOutput() aligned to the same shape with optional exitCode.

The convergence point was: strings for the convenience methods, raw events only via streamOutput(). That single principle made the API surface much sharper.

🧪 Testing

  • New test_exec in the Python e2e suite — runs sh -c "echo hello && echo oops >&2 && exit 3", asserts exit_code == 3 and that both stdout and stderr content appear in the joined output string.
  • test_exec_stdout_vs_stderr rewritten to use streamOutput() directly — the joined-string getOutput() can no longer distinguish stream types, so the test now exercises the only API that still can (and asserts on event["type"] == ExecStdoutType.STDOUT.value from raw event dicts).
  • Six other get_output()-based tests (test_get_output, test_exec_exit_code, test_exec_with_cwd, test_exec_with_env, test_exec_with_args, test_send_stdin) simplified from any(... for item in r) patterns to direct substring/equality checks against r["output"] / r["exit_code"].
  • TypeScript: verified no LSP diagnostics across Sandbox.ts, index.ts, both docs. The for await and event.exitCode usages typecheck against the generated sandboxApi.ExecStdout; the structured return is correctly typed as { exitCode: number; output: string } for exec() and { exitCode: number | undefined; output: string } for getOutput().
  • Not yet run end-to-end against a live sandbox in CI.
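
The test simplification described above can be illustrated with plain data. Both result values below are hand-written stand-ins, not captured SDK output, and the event-list shape of the old return value is an assumption based on the any(... for item in r) pattern mentioned in the list.

```python
from typing import Any, Dict, List

# Assumed old shape: get_output() returned a list of per-event dicts.
old_result: List[Dict[str, Any]] = [
    {"type": "stdout", "output": "hello\n"},
    {"type": "stdout", "output": "", "exit_code": 0},
]

# New shape: one dict with a joined output string and an optional exit code.
new_result: Dict[str, Any] = {"exit_code": 0, "output": "hello\n"}


def old_style_check(items: List[Dict[str, Any]]) -> bool:
    """Before: scan every event for the expected text and the exit code."""
    saw_text = any("hello" in item.get("output", "") for item in items)
    saw_exit = any(item.get("exit_code") == 0 for item in items)
    return saw_text and saw_exit


def new_style_check(r: Dict[str, Any]) -> bool:
    """After: direct substring and equality checks on the result."""
    return "hello" in r["output"] and r["exit_code"] == 0
```
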

@christianalfoni christianalfoni force-pushed the CSB-1420 branch 3 times, most recently from 214a85c to 895a66f Compare May 21, 2026 13:02
@mohamedveron (Contributor) commented:

@christianalfoni let's just remove interactive from here

@christianalfoni christianalfoni merged commit 2c4bcee into main May 22, 2026
8 checks passed