Skip to content

Conversation

@xuehuitian45
Copy link
Collaborator

Description

This PR improves the sandbox lifecycle management:

  1. Redis: Added timeout deletion and container counting (delete only when count is 0).
  2. Sandbox: Added health check before reuse + auto-creation; fixed remote sandbox bugs.
  3. Mounting: Fixed issue by auto-creating non-existent non-copy files during mounting

@gemini-code-assist
Copy link

Summary of Changes

Hello @xuehuitian45, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the lifecycle management of sandboxes by introducing robust mechanisms for Redis-backed sandbox instances, including expiration and reference counting to ensure efficient resource utilization and prevent accidental deletions. It also improves the reliability of sandbox reuse through integrated health checks and addresses a critical issue with file mounting by automatically creating missing host paths. The changes further streamline the API by passing full Sandbox objects to core service methods, allowing for more context-aware operations and better resource management.

Highlights

  • Redis Lifecycle Management: Introduced Time-To-Live (TTL) for Redis entries and implemented a reference counting mechanism for sandboxes. Sandboxes are now only deleted when their reference count reaches zero, preventing premature deletion of actively used sandboxes. A scheduled cleanup task has been added to automatically remove expired sandboxes.
  • Improved Sandbox Reuse with Health Checks: Before reusing an existing sandbox, a health check is performed to ensure it is in a valid and running state. If a sandbox is found to be unhealthy, it is removed, and a new one is created, significantly improving the reliability of sandbox reuse.
  • Automatic File Mounting Fixes: Resolved issues with file mounting by automatically creating non-existent non-copy files or directories on the host system during the mounting process. This ensures that mount operations succeed even if the target host path does not initially exist.
  • API Refactoring for Sandbox Object: Several core SandboxService methods, including getInfo, listTools, callTool, addMcpServers, and establishConnection, have been refactored to accept a full Sandbox object instead of just a sandboxId. This allows for more comprehensive sandbox information to be passed and utilized, especially in remote execution scenarios.
  • Resource Cleanup in AgentScopeAgentHandler: The AgentScopeAgentHandler now properly stops the sandboxService when its own stop() method is called, ensuring that all sandbox-related resources are released gracefully.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces significant improvements to sandbox lifecycle management, including TTL-based expiration with Redis, reference counting for sandbox reuse, and health checks before reuse. The refactoring to pass Sandbox objects instead of just IDs is a good step towards better type safety and code clarity. However, I've identified a few issues that need attention. There is a critical bug in SandboxService.establishConnection that could lead to using stale data after a sandbox is recreated. Additionally, there's a potential resource leak due to improper shutdown of the cleanup task's executor. I've also provided suggestions to improve maintainability by addressing magic numbers and simplifying complex logic.

Comment on lines 581 to 584
ContainerModel containerInfo = getInfo(sandbox);
if(!checkSandboxStatus(sandbox.getSandboxId())){
createContainer(sandbox);
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

There's a critical issue in this block. getInfo(sandbox) is called before the sandbox health is checked and potentially recreated. If checkSandboxStatus returns false, createContainer(sandbox) is called, which creates a new container. However, the containerInfo variable will still hold the information of the old, unhealthy container, leading to subsequent operations using stale data.

To fix this, you should check the status and recreate the container before getting the container info. The createContainer call can also throw a checked JsonProcessingException which is not handled here and will be wrapped in a RuntimeException, which is fine, but the stale data issue is critical.

Suggested change
ContainerModel containerInfo = getInfo(sandbox);
if(!checkSandboxStatus(sandbox.getSandboxId())){
createContainer(sandbox);
}
if(!checkSandboxStatus(sandbox.getSandboxId())){
createContainer(sandbox);
}
ContainerModel containerInfo = getInfo(sandbox);

Comment on lines +145 to +147
if(sandboxService != null){
sandboxService.stop();
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Calling sandboxService.stop() will not shut down the scheduled executor service for cleaning up expired sandboxes, which can lead to a resource leak. The close() method in SandboxService is designed for a complete shutdown, including the executor. You should call sandboxService.close() to ensure all resources are properly released.

Additionally, for coding style consistency, please add a space after if.

Suggested change
if(sandboxService != null){
sandboxService.stop();
}
if (sandboxService != null) {
sandboxService.close();
}

Comment on lines 692 to 694
public void stop(){
cleanupAllSandboxes();
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

This new stop() method is redundant and can cause a resource leak. It only cleans up sandboxes but doesn't shut down the cleanupExecutor thread. The existing close() method already handles both cleanup and executor shutdown. It's recommended to remove this stop() method and have callers (like AgentScopeAgentHandler) use close() for a complete and safe shutdown of the service.

for (Map.Entry<String, ContainerModel> entry : allSandboxes.entrySet()) {
String containerId = entry.getKey();
long ttl = sandboxMap.getTTL(containerId);
if (ttl > 0 && ttl < 10) {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The value 10 used as the upper bound for the TTL check appears to be a magic number. To improve readability and make the code easier to maintain, consider defining it as a named constant, for example: private static final long SANDBOX_EXPIRATION_THRESHOLD_SECONDS = 10;.

Comment on lines +295 to +314
if (hostPath.endsWith(File.separator) || hostPath.endsWith("/") ||
containerPath.endsWith("/") || containerPath.endsWith(File.separator)) {
if (hostFile.mkdirs()) {
logger.info("Successfully created directory: {}", hostPath);
} else {
logger.warn("Failed to create directory: {}", hostPath);
continue;
}
} else {
File parentDir = hostFile.getParentFile();
if (parentDir != null && !parentDir.exists()) {
parentDir.mkdirs();
}
if (hostFile.createNewFile()) {
logger.info("Successfully created file: {}", hostPath);
} else {
logger.warn("Failed to create file: {}", hostPath);
continue;
}
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The logic to determine whether to create a file or a directory by checking the suffix of both hostPath and containerPath is quite complex and could be fragile. This could be simplified by, for example, relying only on the containerPath's suffix or by adding an explicit flag to the mount configuration. Using the java.nio.file.Files API could also make the file and directory creation logic more concise and readable.

Comment on lines 129 to 142
public Set<String> scan(String pattern) {
return Set.copyOf(syncCommands.keys(pattern));
Set<String> keys = new HashSet<>();
ScanArgs scanArgs = ScanArgs.Builder.limit(100).match(pattern);
KeyScanCursor<String> cursor = syncCommands.scan(scanArgs);

while (cursor != null) {
keys.addAll(cursor.getKeys());
if (cursor.isFinished()) {
break;
}
cursor = syncCommands.scan(cursor, scanArgs);
}
return keys;
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The implementation of the scan method is correct, but the while (cursor != null) loop is a bit unconventional since syncCommands.scan() doesn't return null. This could be confusing. A more idiomatic and readable approach would be to use a do-while loop or check !cursor.isFinished() in the loop condition.

    public Set<String> scan(String pattern) {
        Set<String> keys = new HashSet<>();
        ScanArgs scanArgs = ScanArgs.Builder.limit(100).match(pattern);
        KeyScanCursor<String> cursor = syncCommands.scan(scanArgs);
        keys.addAll(cursor.getKeys());

        while (!cursor.isFinished()) {
            cursor = syncCommands.scan(cursor, scanArgs);
            keys.addAll(cursor.getKeys());
        }
        return keys;
    }


private void refreshExpiration(String... keys) {
if (expirationSeconds > 0) {
long ttl = (long) expirationSeconds + 10;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The value 10 added to expirationSeconds is a magic number. It's not immediately clear why this buffer is needed. To improve clarity and maintainability, please add a comment explaining its purpose or define it as a named constant (e.g., private static final long EXPIRATION_BUFFER_SECONDS = 10;).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant