…e bugs

LiteLlmClient tool_calls support (src/llm/litellm.rs):
- Add tools and tool_choice fields (as serde_json::Value) to ApiRequest with skip_serializing_if to avoid sending null to providers
- Add ApiToolCall and ApiToolCallFunction deserialization structs for parsing function call responses from the API
- Add reasoning and reasoning_content fields to ApiMessage for reasoning models
- Make finish_reason optional in ApiChoice (some providers omit it)
- Implement full tool_calls parsing in generate() response conversion, mapping API tool calls to ToolCallInfo/ToolCallFunction types matching OpenRouterProvider
- Add content extraction priority: content > reasoning_content > reasoning, with fallback to tool call arguments when content is empty
- Update test to verify tools/tool_choice are properly excluded when None

Patch extractor fix (src/swe/extractor.rs):
- Replace git show with git diff for extracting PR patches, which produces cleaner unified diffs without commit metadata that confused downstream parsing
- Handle all three cases: base..merge, HEAD..merge, and HEAD~1..HEAD

Test generator clippy fix (src/swe/test_generator.rs):
- Remove unnecessary reference on result.stdout.trim() call

Workspace validator runtime install (src/swe/workspace_validator.rs):
- Add language runtime installation step before running validation tests in Docker containers (Go, Node.js, Rust, Java) since the base image may not include them, causing false setup_error failures

Formatting (docker_sandbox.rs, harness.rs, test_generator.rs, extractor.rs):
- Apply cargo fmt to fix formatting across all modified SWE pipeline files

Gitignore (.gitignore):
- Add mine_test.log and test-easy-output/ to prevent committing test artifacts
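The content extraction priority in the commit message above can be sketched roughly as follows. This is a minimal, std-only illustration and not the code in src/llm/litellm.rs: `MessageParts` and `extract_content` are hypothetical names, and trimming whitespace, using only the first tool call, and returning an empty string when nothing is available are all assumptions.

```rust
/// Hypothetical, simplified stand-in for the response fields named in the commit.
#[derive(Debug, Default)]
pub struct MessageParts {
    pub content: Option<String>,
    pub reasoning_content: Option<String>,
    pub reasoning: Option<String>,
    /// Raw JSON argument strings of any tool calls, in response order.
    pub tool_call_arguments: Vec<String>,
}

/// Returns the first non-empty candidate in the order
/// content > reasoning_content > reasoning, then falls back to the
/// arguments of the first tool call (assumption), else an empty string.
pub fn extract_content(msg: &MessageParts) -> String {
    // Treat missing and whitespace-only fields the same way (assumption).
    let non_empty = |s: &Option<String>| {
        s.as_deref()
            .map(str::trim)
            .filter(|t| !t.is_empty())
            .map(str::to_owned)
    };
    non_empty(&msg.content)
        .or_else(|| non_empty(&msg.reasoning_content))
        .or_else(|| non_empty(&msg.reasoning))
        .or_else(|| msg.tool_call_arguments.first().cloned())
        .unwrap_or_default()
}
```

Chaining `or_else` keeps the priority order readable in one place, so adding another fallback field would be a one-line change.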
… library code

- Add validate_git_ref() and validate_repo_name() to prevent command injection via shell metacharacters in user-controlled inputs
- Validate commit hashes in extractor.rs before interpolating into git fetch/diff commands
- Validate repo name and base_commit in docker_sandbox.rs before interpolating into git clone/checkout commands
- Validate repo name and base_commit in harness.rs before interpolating into git clone/checkout commands
- Replace expect() with Result return types in LiteLlmClient::new() and LiteLlmClient::new_with_defaults() per AGENTS.md rules
- Add comprehensive unit tests for validation functions
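A rough sketch of what validators in the spirit of validate_git_ref() and validate_repo_name() might look like is given below. The real signatures and rules are not shown on this page, so everything here is an assumption: the `Result<(), String>` return type, the allowlist approach (the commit only says shell metacharacters are rejected), and the owner/name shape of a repo name. It covers only the character check described in this commit.

```rust
/// Allowlist used by this sketch (an assumption; the commit only states
/// that shell metacharacters are rejected).
fn is_safe_char(c: char) -> bool {
    c.is_ascii_alphanumeric() || matches!(c, '.' | '_' | '-' | '/')
}

/// Rejects any git ref containing a character outside the allowlist.
pub fn validate_git_ref(git_ref: &str) -> Result<(), String> {
    match git_ref.chars().find(|c| !is_safe_char(*c)) {
        Some(bad) => Err(format!("git ref contains unsafe character {bad:?}")),
        None => Ok(()),
    }
}

/// Requires an `owner/name` shape (assumption) and applies the same allowlist.
pub fn validate_repo_name(repo: &str) -> Result<(), String> {
    let parts: Vec<&str> = repo.split('/').collect();
    if parts.len() != 2 || parts.iter().any(|p| p.is_empty()) {
        return Err(format!("expected owner/name, got {repo:?}"));
    }
    match repo.chars().find(|c| !is_safe_char(*c)) {
        Some(bad) => Err(format!("repo name contains unsafe character {bad:?}")),
        None => Ok(()),
    }
}
```

An allowlist is used here because it fails closed: a character nobody thought to blocklist is still rejected before it reaches a shell command.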
…chBlock.lines field

- Remove #[allow(dead_code)] from ApiToolCall: all fields are either accessed in generate() or suppressed by underscore prefix (_tool_type)
- Remove #[allow(dead_code)] from ApiToolCallFunction: both fields (name, arguments) are accessed in generate()
- Remove unused PatchBlock.lines field in extractor.rs: accumulated but never read (split_solution_and_tests only returns .patch)
…results

Replace 'let _ = sandbox.write_file(...)' with proper error logging using tracing::warn in test_generator.rs and workspace_validator.rs. These were silently swallowing errors during test file restoration in validation cleanup paths.
…th traversal

- validate_git_ref: reject empty refs, '..' sequences, and leading '-' (flag injection)
- validate_repo_name: reject parts starting with '.' or '-' (path traversal, flag injection)
- Add validate_file_path: reject shell metacharacters, null bytes, '..', absolute paths
- docker_sandbox: validate file paths in write_file, write_file_abs, read_file
- docker_sandbox: restrict write_file_abs to /tools/ prefix, validate tool_name
- harness: validate file paths in docker_write_file
- litellm: replace unwrap_or(Null) with proper error propagation for tools serialization
- extractor: guard validate_git_ref calls with empty checks for optional refs
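The validate_file_path rules listed above (shell metacharacters, null bytes, '..', absolute paths) could be sketched like this. It is an illustration under assumptions, not the repository's implementation: the return type, the exact metacharacter set, and the rejection of empty paths are guesses.

```rust
/// Illustrative metacharacter set (assumption; the real list is not shown).
const SHELL_METACHARACTERS: &[char] = &[
    ';', '|', '&', '$', '`', '(', ')', '<', '>', '\'', '"', '\\', '\n', '*', '?', '!', '{', '}',
];

/// Accepts only relative paths that are safe to interpolate into a shell command.
pub fn validate_file_path(path: &str) -> Result<(), String> {
    // Rejecting empty paths is an assumption of this sketch.
    if path.is_empty() {
        return Err("file path is empty".to_string());
    }
    if path.contains('\0') {
        return Err("file path contains a null byte".to_string());
    }
    if path.starts_with('/') {
        return Err(format!("absolute paths are not allowed: {path:?}"));
    }
    // Any '..' is rejected, which is stricter than a per-segment check.
    if path.contains("..") {
        return Err(format!("path traversal is not allowed: {path:?}"));
    }
    if let Some(bad) = path.chars().find(|c| SHELL_METACHARACTERS.contains(c)) {
        return Err(format!("file path contains shell metacharacter {bad:?}"));
    }
    Ok(())
}
```

Calling a check like this at every entry point that builds a shell command (write, read, mkdir) avoids relying on a later call to catch a bad path.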
Add validate_file_path() check before interpolating tf.path into the mkdir shell command in evaluate_task(). Previously, the mkdir command at line 340 used the unvalidated path from deserialized test files, while docker_write_file() on the next line did validate. This created a window where a malicious path could inject shell commands via the mkdir step before validation occurred.
Replace 3 instances of 'let _ =' that silently discard Docker command errors with 'if let Err(e)' + tracing::debug! for proper error observability while maintaining best-effort cleanup semantics.

Files changed:
- src/swe/docker_sandbox.rs: start() stale container removal, destroy()
- src/swe/harness.rs: docker_rm() helper
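The 'if let Err(e)' pattern described above might look roughly like the following. To keep the sketch std-only, `eprintln!` stands in for `tracing::debug!`, and `best_effort` is a hypothetical helper rather than a function from the repository; returning the logged message is only there to make the behavior easy to observe.

```rust
use std::io;

/// Runs a cleanup step whose failure must not abort the caller.
/// The error is reported instead of being discarded with `let _ =`.
/// Returns the logged message, if any (a convenience for this sketch).
pub fn best_effort<F>(what: &str, step: F) -> Option<String>
where
    F: FnOnce() -> io::Result<()>,
{
    if let Err(e) = step() {
        let msg = format!("{what} failed (ignored): {e}");
        // Stand-in for `tracing::debug!(...)` in the real code (assumption).
        eprintln!("{msg}");
        return Some(msg);
    }
    None
}
```

The caller still continues after a failure, so cleanup stays best-effort, but the error now shows up in the logs instead of vanishing.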
Summary

Adds full function-calling (`tool_calls`) support to `LiteLlmClient` and fixes several bugs across the SWE mining pipeline: patch extraction, workspace validation, and test generation.

Changes

LiteLlmClient (`src/llm/litellm.rs`)
- Add `tools` and `tool_choice` fields to the internal `ApiRequest` struct (with `skip_serializing_if`)
- Add `ApiToolCall` and `ApiToolCallFunction` deserialization structs
- Add `reasoning`, `reasoning_content`, and `tool_calls` fields to `ApiMessage`
- Make `finish_reason` optional in `ApiChoice` (some providers omit it)
- Parse `tool_calls` in response conversion, matching `OpenRouterProvider` behavior

Patch Extraction (`src/swe/extractor.rs`)
- Use `git diff` instead of `git show` for proper base-to-merge diffs

Workspace Validator (`src/swe/workspace_validator.rs`)
- Install language runtimes (Go, Node.js, Rust, Java) before running validation tests in Docker containers

Docker Sandbox & Test Generator
- Fix the `serde_json::from_str` call to avoid borrowing a temporary reference

Miscellaneous
- Add `mine_test.log` and `test-easy-output/` to `.gitignore`
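
For the patch extraction change, the commit message names three diff ranges: base..merge, HEAD..merge, and HEAD~1..HEAD. How the extractor chooses between them is not shown on this page, so the selection rule below is purely an assumption for illustration, and `diff_range` is a hypothetical helper.

```rust
/// Treats missing and blank refs alike.
fn non_empty(r: Option<&str>) -> Option<&str> {
    r.map(str::trim).filter(|s| !s.is_empty())
}

/// Picks the range argument for `git diff`.
/// Assumed rule: use both commits when known, diff HEAD against the merge
/// commit when only that is known, otherwise diff the last commit.
pub fn diff_range(base: Option<&str>, merge: Option<&str>) -> String {
    match (non_empty(base), non_empty(merge)) {
        (Some(b), Some(m)) => format!("{b}..{m}"),
        (None, Some(m)) => format!("HEAD..{m}"),
        _ => "HEAD~1..HEAD".to_string(),
    }
}
```

In real code, both refs would be validated before being interpolated into a git command.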