73 changes: 73 additions & 0 deletions cookbook/en/sandbox/best_practice.md
@@ -0,0 +1,73 @@
# Sandbox Usage Best Practices

> **Prerequisite Reading**: This document assumes you are familiar with the basic concepts and usage of sandboxes. Before continuing, it is strongly recommended that you complete the previous tutorial on [Sandbox Basics](sandbox.md) so that the advanced deployment strategies discussed here are easier to follow.

In production environments, sandbox deployment and management must be designed around system scale, concurrency requirements, and resource isolation needs. Different Runtime deployment architectures pose different challenges for sandbox lifecycle management, resource reuse, and backend storage. The following sections present sandbox best practices level by level, from single-machine to distributed scenarios.

## Single Machine, Single Runtime Scenario

### Applicable Scenario

Suitable for development and debugging, lightweight services, or single-instance applications where only one Runtime process exists in the system, without the need to share sandboxes across processes.

### Recommended Architecture

* **Sandbox Management**: Use an in-memory `SandboxMap` implementation for managing sandbox state.
* **Container Backend**: Directly interface with the local Docker Daemon, using the basic Docker driver as the container runtime backend.

### Practice Recommendations

In this scenario, since there is no concurrent access or reuse of the same sandbox by multiple Runtime instances, there is no need to introduce external state storage. In-memory level sandbox mapping is sufficient to meet performance and consistency requirements. It is simple to deploy and starts quickly, making it ideal for rapid iteration and local verification.

> **Note**: This mode lacks scalability across processes or nodes and is not suitable for multi-instance deployment environments.
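
As a rough illustration of why in-process state is enough here, the following sketch keeps a session-to-sandbox mapping in a `ConcurrentHashMap`. The class name `InMemorySandboxRegistry`, its methods, and the generated ID format are assumptions made for this example and are not the library's `SandboxMap` API.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for an in-memory sandbox map; not the library's SandboxMap API. */
class InMemorySandboxRegistry {
    // Maps a session key to the ID of the sandbox created for it.
    private final Map<String, String> sandboxIds = new ConcurrentHashMap<>();

    /** Returns the existing sandbox ID for the session, or registers a new one atomically. */
    String getOrCreate(String sessionKey) {
        return sandboxIds.computeIfAbsent(sessionKey, k -> "sandbox-" + UUID.randomUUID());
    }

    /** Forgets the sandbox for the session; returns true if one was registered. */
    boolean release(String sessionKey) {
        return sandboxIds.remove(sessionKey) != null;
    }

    /** Number of sessions that currently hold a sandbox. */
    int size() {
        return sandboxIds.size();
    }
}
```

Because the map lives inside one process, a second Runtime process would see an empty map and create its own sandbox for the same session, which is exactly the limitation the note above describes.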

## Single Machine, Multiple Runtime Scenario

### Applicable Scenario

Suitable for high-concurrency, multi-tenant, or modular architecture deployments on a single machine, where multiple Runtime instances are launched to process tasks in parallel, sharing the resources of the same host.

### Core Challenge

Multiple Runtime instances may simultaneously attempt to access, reuse, or destroy the same sandbox instance. If local memory management is still used, it will lead to state inconsistency, resource contention, or duplicate creation issues.

### Recommended Architecture

* **State Management**: **Redis must be introduced** as a globally shared sandbox metadata storage center to ensure all Runtime instances can consistently read and update sandbox states.
* **Container Backend**: All Runtime instances access the backend through the same `containerClient` to achieve unified scheduling and reuse of sandbox instances.

### Practice Recommendations

Use the `RedisSandboxMap` provided by AgentScope Runtime Java to manage sandbox reference counting and lifecycle, so that concurrent Runtime instances do not race when acquiring or destroying the same sandbox.
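
The idea behind shared reference counting can be sketched as follows. This is not the implementation of `RedisSandboxMap`, whose internals are not shown here; `SharedCounterStore`, `InProcessCounterStore`, and `RuntimeInstance` are hypothetical names, and the in-process store only stands in for Redis (whose `INCR` and `DECR` commands are atomic) so that the example runs without a server.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical shared counter store; in production this role would be played by Redis. */
interface SharedCounterStore {
    long increment(String key);
    long decrement(String key);
}

/** In-process stand-in so the sketch runs without a Redis server. */
class InProcessCounterStore implements SharedCounterStore {
    private final Map<String, AtomicLong> counters = new ConcurrentHashMap<>();

    public long increment(String key) {
        return counters.computeIfAbsent(key, k -> new AtomicLong()).incrementAndGet();
    }

    public long decrement(String key) {
        return counters.computeIfAbsent(key, k -> new AtomicLong()).decrementAndGet();
    }
}

/** One Runtime instance; several instances share a single store. */
class RuntimeInstance {
    private final SharedCounterStore store;

    RuntimeInstance(SharedCounterStore store) {
        this.store = store;
    }

    /** Registers this instance as a user of the sandbox and returns the new reference count. */
    long acquire(String sandboxId) {
        return store.increment("ref:" + sandboxId);
    }

    /** Returns true only for the caller that drops the last reference and may destroy the sandbox. */
    boolean release(String sandboxId) {
        return store.decrement("ref:" + sandboxId) == 0;
    }
}
```

Since every instance updates the same counter atomically, exactly one caller observes the count reaching zero, so the sandbox is destroyed once and only after its last user has released it.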

## Multiple Machines, Multiple Runtime Scenario

### Applicable Scenario

Suitable for distributed systems, elastic scaling clusters, or microservice architectures where multiple Runtime instances are distributed across different hosts and need to collaboratively manage sandbox resources.

### Core Challenge

In addition to state consistency, the reachability of the container backend and network isolation issues must be addressed. Different nodes may not be able to directly access each other's container runtimes, leading to sandboxes that cannot be reused or managed effectively.

### Recommended Solutions

Depending on whether all nodes can reach a single container backend, there are two deployment strategies:

#### 1. All Nodes Can Access the Same Container Backend (Centralized Container Management)

* **Architecture Description**: All Runtime nodes access the same remote container runtime (e.g., a remote Docker Daemon or a Kubernetes cluster) over the network (e.g., through the Docker TCP API or the Kubernetes API server).
* **Management Method**: Same as the "Single Machine, Multiple Runtime" scenario: **use the `RedisSandboxMap` provided by AgentScope Runtime Java to centrally manage sandbox states**. All nodes operate containers through the shared backend.
* **Advantages**: Unified architecture, simple management, and sandboxes can be reused by any node.
* **Note**: Ensure the high availability and network stability of the container backend to avoid single points of failure.

#### 2. Nodes Cannot Access the Same Container Backend (Distributed Container Environment)

* **Architecture Description**: Each node has its own independent container runtime (e.g., running Docker locally) and cannot directly operate containers on other nodes.
* **Recommended Solution**: Adopt a **Remote Runtime architecture**.
* **Implementation Method**:
  * Deploy a dedicated **Runtime agent process** (Sandbox Manager) on a machine that can access the container backend.
  * Configure the remaining nodes in **Remote Mode**. They connect to the agent through the remote connection mode built into **SandboxService** and delegate sandbox creation, management, and reuse to it.
  * All sandbox operation requests are forwarded to a single entry point, giving a management architecture that is logically centralized but physically distributed.
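
The delegation pattern described in the steps above can be sketched as follows. All names here (`SandboxOps`, `ManagerNode`, `RemoteModeNode`) are hypothetical and do not reflect the `SandboxService` API, and a plain `Function` stands in for the network call between a node and the manager.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

/** Hypothetical operations interface; not the SandboxService API. */
interface SandboxOps {
    String create(String sandboxType);
}

/** Runs on the machine that can reach the container backend. */
class ManagerNode implements SandboxOps {
    final List<String> created = new ArrayList<>();

    public String create(String sandboxType) {
        String id = sandboxType + "-" + (created.size() + 1);
        created.add(id); // a real manager would start a container here
        return id;
    }
}

/** Runs on every other node and forwards each request to the manager. */
class RemoteModeNode implements SandboxOps {
    // Stands in for a network call to the manager.
    private final Function<String, String> transport;

    RemoteModeNode(Function<String, String> transport) {
        this.transport = transport;
    }

    public String create(String sandboxType) {
        return transport.apply(sandboxType);
    }
}
```

Both node types expose the same interface, so calling code does not need to know whether a sandbox is created locally or by the manager; only the manager holds the authoritative list of sandboxes.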

Choosing the scheme that matches the deployment balances system complexity, performance, and scalability, and provides a secure, efficient, and reusable sandbox runtime environment for applications of different scales.
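
The decision rules from the three scenarios can be condensed into a small helper. This is only a summary sketch: `DeploymentAdvisor`, its `Strategy` enum, and the parameter names are invented for this example and are not part of AgentScope Runtime Java.

```java
/** Hypothetical helper that encodes the decision rules from this document. */
class DeploymentAdvisor {
    enum Strategy { IN_MEMORY_MAP, REDIS_MAP, REMOTE_RUNTIME }

    /**
     * @param hostCount              number of machines running Runtime instances
     * @param runtimeCount           total number of Runtime instances
     * @param sharedContainerBackend whether every node can reach the same container backend
     */
    static Strategy choose(int hostCount, int runtimeCount, boolean sharedContainerBackend) {
        if (hostCount < 1 || runtimeCount < 1) {
            throw new IllegalArgumentException("hostCount and runtimeCount must be at least 1");
        }
        if (runtimeCount == 1) {
            return Strategy.IN_MEMORY_MAP; // single machine, single Runtime
        }
        if (hostCount == 1 || sharedContainerBackend) {
            return Strategy.REDIS_MAP; // shared state in Redis, shared container backend
        }
        return Strategy.REMOTE_RUNTIME; // isolated backends: delegate to a Sandbox Manager
    }
}
```

For example, `DeploymentAdvisor.choose(3, 6, false)` selects `REMOTE_RUNTIME`, because the six instances are spread over hosts that cannot reach a common backend.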
83 changes: 83 additions & 0 deletions cookbook/zh/sandbox/best_practice.md
@@ -0,0 +1,83 @@
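# Sandbox Usage Best Practices

> **Prerequisite Reading**: This document assumes you are already familiar with the basic concepts and usage of sandboxes. Before continuing, it is strongly recommended that you complete the previous tutorial on [Sandbox Basics](sandbox.md) so that the advanced deployment strategies discussed here are easier to follow.

In production environments, sandbox deployment and management must be designed around system scale, concurrency requirements, and resource isolation needs. Different Runtime deployment architectures pose different challenges for sandbox lifecycle management, resource reuse, and backend storage. The sections below present sandbox best practices level by level, from single-machine to distributed scenarios.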

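## Single Machine, Single Runtime Scenario

### Applicable Scenario

Suitable for development and debugging, lightweight services, or single-instance applications, where only one Runtime process exists in the system and sandboxes do not need to be shared across processes.

### Recommended Architecture

* **Sandbox Management**: Use the in-memory `SandboxMap` to manage sandbox state.

* **Container Backend**: Connect directly to the local Docker Daemon, using the basic Docker driver as the container runtime backend.

### Practice Recommendations

In this scenario, no two Runtime instances access or reuse the same sandbox concurrently, so no external state storage is needed. An in-memory sandbox mapping meets the performance and consistency requirements, is simple to deploy, and starts quickly, which makes it well suited to rapid iteration and local verification.

> **Note**: This mode does not scale across processes or nodes and is not suitable for multi-instance deployments.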

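## Single Machine, Multiple Runtime Scenario

### Applicable Scenario

Suitable for high-concurrency, multi-tenant, or modular deployments on a single machine, where multiple Runtime instances are started to process tasks in parallel while all of them share the resources of one host.

### Core Challenge

Multiple Runtime instances may try to access, reuse, or destroy the same sandbox instance at the same time. If state is still managed in local memory, the result is inconsistent state, resource contention, or duplicate sandbox creation.

### Recommended Architecture

* **State Management**: **Redis must be introduced** as the globally shared store for sandbox metadata, so that all Runtime instances read and update sandbox state consistently.

* **Container Backend**: All Runtime instances access the backend through the same `containerClient`, which gives unified scheduling and reuse of sandbox instances.

### Practice Recommendations

Use the `RedisSandboxMap` provided by AgentScope Runtime Java to manage sandbox reference counting and lifecycle, which avoids race conditions.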

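## Multiple Machines, Multiple Runtime Scenario

### Applicable Scenario

Suitable for distributed systems, elastically scaled clusters, or microservice architectures, where multiple Runtime instances run on different hosts and must manage sandbox resources cooperatively.

### Core Challenge

Besides state consistency, the reachability of the container backend and network isolation must be addressed. Nodes may be unable to access each other's container runtimes directly, so sandboxes cannot be reused or managed effectively.

### Recommended Solutions

Depending on whether all nodes can reach a single container backend, there are two deployment strategies:

#### 1. All Nodes Can Access the Same Container Backend (Centralized Container Management)

* **Architecture Description**: All Runtime nodes access the same remote container runtime (e.g., a remote Docker Daemon or a Kubernetes cluster) over the network (e.g., through the Docker TCP API or the Kubernetes API server).

* **Management Method**: Same as the "Single Machine, Multiple Runtime" scenario: **use the `RedisSandboxMap` provided by AgentScope Runtime Java to centrally manage sandbox states**. All nodes operate containers through the shared backend.

* **Advantages**: Unified architecture, simple management, and sandboxes can be reused by any node.

* **Note**: Ensure high availability and network stability of the container backend to avoid a single point of failure.

#### 2. Nodes Cannot Access the Same Container Backend (Distributed Container Environment)

* **Architecture Description**: Each node has its own independent container runtime (e.g., each runs Docker locally) and cannot directly operate containers on other nodes.

* **Recommended Solution**: Adopt a **Remote Runtime architecture**.

* **Implementation Method**:

  * Deploy a dedicated **Runtime agent process** (Sandbox Manager) on a machine that can access the container backend.

  * Configure the remaining nodes in **Remote Mode**. They connect to the agent through the remote connection mode built into **SandboxService** and delegate sandbox creation, management, and reuse to it.

  * All sandbox operation requests are forwarded to a single entry point, giving a management architecture that is logically centralized but physically distributed.

Choosing the scheme that matches the deployment balances system complexity, performance, and scalability, and provides a secure, efficient, and reusable sandbox runtime environment for applications of different scales.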
@@ -142,6 +142,9 @@ public void stop() {
         if (sessionHistoryService != null) {
             sessionHistoryService.stop();
         }
+        if(sandboxService != null){
+            sandboxService.stop();
+        }
Review comment on lines +145 to +147 (severity: high):

Calling `sandboxService.stop()` will not shut down the scheduled executor service that cleans up expired sandboxes, which can lead to a resource leak. The `close()` method in `SandboxService` is designed for a complete shutdown, including the executor. You should call `sandboxService.close()` to ensure all resources are properly released.

Additionally, for coding style consistency, please add a space after `if`.

Suggested change:
-        if(sandboxService != null){
-            sandboxService.stop();
-        }
+        if (sandboxService != null) {
+            sandboxService.close();
+        }
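
To illustrate the distinction this comment draws, here is a minimal sketch of a service whose `stop()` leaves a scheduled executor running while `close()` shuts it down. `CleanupService` and its methods are hypothetical and only mirror the behavior described above; they are not the actual `SandboxService` code.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical service mirroring the behavior described in the comment; not the real SandboxService. */
class CleanupService implements AutoCloseable {
    private final ScheduledExecutorService cleaner = Executors.newSingleThreadScheduledExecutor();
    private volatile boolean running = true;

    CleanupService() {
        // Periodic task, analogous to removing expired sandboxes.
        cleaner.scheduleAtFixedRate(() -> { /* cleanup work would go here */ }, 1, 1, TimeUnit.SECONDS);
    }

    /** Marks the service as stopped but leaves the executor thread alive. */
    void stop() {
        running = false;
    }

    /** Full shutdown: stops the service and terminates the executor. */
    @Override
    public void close() {
        stop();
        cleaner.shutdownNow();
    }

    boolean isRunning() {
        return running;
    }

    boolean executorShutDown() {
        return cleaner.isShutdown();
    }
}
```

In this sketch, a caller that only invokes `stop()` keeps a non-daemon executor thread alive, which can also prevent the JVM from exiting; calling `close()` releases it.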

     }

     /**
@@ -51,7 +51,6 @@
  * </ul>
  */
 public class MyAgentScopeAgentHandler extends AgentScopeAgentHandler {
-    private SandboxService sandboxService;
     private static final Logger logger = LoggerFactory.getLogger(MyAgentScopeAgentHandler.class);
     private final String apiKey;

@@ -81,7 +81,7 @@ public BrowserSandbox(
      * @throws RuntimeException if sandbox is not healthy
      */
     public String getDesktopUrl() {
-        return GuiMixin.getDesktopUrl(managerApi, sandboxId, baseUrl);
+        return GuiMixin.getDesktopUrl(managerApi, this, baseUrl);
     }

     public String navigate(String url) {
@@ -82,7 +82,7 @@ public FilesystemSandbox(
      * @throws RuntimeException if sandbox is not healthy
      */
     public String getDesktopUrl() {
-        return GuiMixin.getDesktopUrl(managerApi, sandboxId, baseUrl);
+        return GuiMixin.getDesktopUrl(managerApi, this, baseUrl);
     }

     public String readFile(String path) {
@@ -35,27 +35,27 @@ public class GuiMixin {
      * SandboxService and sandboxId.
      *
      * @param managerApi The SandboxService instance
-     * @param sandboxId The sandbox ID
+     * @param sandbox The sandbox instance
      * @param baseUrl Optional base URL (can be null)
      * @return The desktop URL for VNC access
      * @throws RuntimeException if sandbox is not healthy or info cannot be retrieved
      */
-    public static String getDesktopUrl(SandboxService managerApi, String sandboxId, String baseUrl) {
+    public static String getDesktopUrl(SandboxService managerApi, Sandbox sandbox, String baseUrl) {
         // Check if sandbox is healthy by attempting to get info
         ContainerModel info;
         try {
-            info = managerApi.getInfo(sandboxId);
+            info = managerApi.getInfo(sandbox);
         } catch (Exception e) {
-            throw new RuntimeException("Sandbox " + sandboxId + " is not healthy: " + e.getMessage(), e);
+            throw new RuntimeException("Sandbox " + sandbox.getSandboxId() + " is not healthy: " + e.getMessage(), e);
         }

         if (info == null) {
-            throw new RuntimeException("Sandbox " + sandboxId + " is not healthy: cannot retrieve info");
+            throw new RuntimeException("Sandbox " + sandbox.getSandboxId() + " is not healthy: cannot retrieve info");
         }

         String runtimeToken = info.getRuntimeToken();
         if (runtimeToken == null || runtimeToken.isEmpty()) {
-            throw new RuntimeException("Sandbox " + sandboxId + " does not have a runtime token");
+            throw new RuntimeException("Sandbox " + sandbox.getSandboxId() + " does not have a runtime token");
         }

         String path = "/vnc/vnc_lite.html";
@@ -68,7 +68,7 @@ public static String getDesktopUrl(SandboxService managerApi, String sandboxId,
             // Use direct URL from container info
             String containerUrl = info.getBaseUrl();
             if (containerUrl == null || containerUrl.isEmpty()) {
-                throw new RuntimeException("Sandbox " + sandboxId + " does not have a base URL");
+                throw new RuntimeException("Sandbox " + sandbox.getSandboxId() + " does not have a base URL");
             }
             // Ensure URL ends with / if not present
             if (!containerUrl.endsWith("/")) {
@@ -81,7 +81,7 @@ public static String getDesktopUrl(SandboxService managerApi, String sandboxId,
         } else {
             // Use base_url with sandbox ID
             String base = baseUrl.endsWith("/") ? baseUrl.substring(0, baseUrl.length() - 1) : baseUrl;
-            return base + "/desktop/" + sandboxId + remotePath + "?" + params;
+            return base + "/desktop/" + sandbox.getSandboxId() + remotePath + "?" + params;
         }
     }
 }
@@ -86,7 +86,7 @@ public GuiSandbox(
      * @throws RuntimeException if sandbox is not healthy
      */
     public String getDesktopUrl() {
-        return GuiMixin.getDesktopUrl(managerApi, sandboxId, baseUrl);
+        return GuiMixin.getDesktopUrl(managerApi, this, baseUrl);
     }

     /**
@@ -104,6 +104,10 @@ public Sandbox(
         this.environment = new HashMap<>(environment);
     }

+    public void setSandboxId(String sandboxId) {
+        this.sandboxId = sandboxId;
+    }
+
     public String getSandboxId() {
         return sandboxId;
     }
@@ -124,7 +128,7 @@ public Map<String, String> getEnvironment() {
         return environment;
     }

-    public FileSystemConfig getFileSystemStarter() {
+    public FileSystemConfig getFileSystemConfig() {
         return fileSystemConfig;
     }

@@ -149,7 +153,13 @@ private void initializeSandbox(){
     @JsonIgnore
     public ContainerModel getInfo() {
         initializeSandbox();
-        return managerApi.getInfo(sandboxId);
+        try {
+            return managerApi.getInfo(this);
+        }
+        catch (Exception e) {
+            logger.error("Failed to get sandbox info: {}", e.getMessage());
+            throw new RuntimeException("Failed to get sandbox info", e);
+        }
     }

     public Map<String, Object> listTools() {
@@ -158,12 +168,24 @@ public Map<String, Object> listTools() {

     public Map<String, Object> listTools(String toolType) {
         initializeSandbox();
-        return managerApi.listTools(sandboxId, toolType);
+        try{
+            return managerApi.listTools(this, toolType);
+        }
+        catch (Exception e) {
+            logger.error("Failed to list tools: {}", e.getMessage());
+            throw new RuntimeException("Failed to list tools", e);
+        }
     }

     public String callTool(String name, Map<String, Object> arguments) {
         initializeSandbox();
-        return managerApi.callTool(sandboxId, name, arguments);
+        try{
+            return managerApi.callTool(this, name, arguments);
+        }
+        catch (Exception e) {
+            logger.error("Failed to call tool {}: {}", name, e.getMessage());
+            throw new RuntimeException("Failed to call tool " + name, e);
+        }
     }

     public Map<String, Object> addMcpServers(Map<String, Object> serverConfigs) {
@@ -172,7 +194,13 @@ public Map<String, Object> addMcpServers(Map<String, Object> serverConfigs) {

     public Map<String, Object> addMcpServers(Map<String, Object> serverConfigs, boolean overwrite) {
         initializeSandbox();
-        return managerApi.addMcpServers(sandboxId, serverConfigs, overwrite);
+        try{
+            return managerApi.addMcpServers(this, serverConfigs, overwrite);
+        }
+        catch (Exception e) {
+            logger.error("Failed to add MCP servers: {}", e.getMessage());
+            throw new RuntimeException("Failed to add MCP servers", e);
+        }
     }

     @Override