Test Execution Environment for SWE-bench tasks

I have recently been working on SWE-bench, where we built a distributed evaluation setup on top of Modal for faster eval cycles. As a next step, I would like to use that setup to execute the patches generated by LLMs after the localization stage. Is this possible via the `commit0` project?

Test execution feedback and search can improve quality over Best-of-N or majority-voting approaches. As part of this idea, we need to either predict the relevant unit tests that exercise the localized files or generate unit tests using LLMs.
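
To make the comparison with majority voting concrete, here is a minimal sketch of execution-based selection. It is only an illustration, not anything from `commit0` or the SWE-bench harness: `select_patch` and the `run_tests` callable are hypothetical names, and `run_tests` is assumed to apply a candidate patch in a sandbox and return a mapping from test name to pass/fail.

```python
from collections import Counter
from typing import Callable, Dict, List


def select_patch(
    candidates: List[str],
    run_tests: Callable[[str], Dict[str, bool]],
) -> str:
    """Pick the candidate patch that passes the most tests.

    `run_tests` is a hypothetical hook: it should apply the patch in an
    isolated environment and return {test_name: passed}. Ties on the number
    of passing tests are broken by how often the patch was sampled (a
    majority-vote fallback), then by first occurrence.
    """
    if not candidates:
        raise ValueError("no candidate patches to choose from")

    votes = Counter(candidates)
    best_patch = None
    best_key = None
    # dict.fromkeys de-duplicates while keeping the sampling order,
    # so each distinct patch is executed only once.
    for patch in dict.fromkeys(candidates):
        results = run_tests(patch)
        key = (sum(results.values()), votes[patch])
        if best_key is None or key > best_key:
            best_patch, best_key = patch, key
    return best_patch
```

With pure majority voting, a patch sampled twice would beat one sampled once; here the less frequent patch wins if it passes more of the selected tests, which is why the choice of relevant (or generated) unit tests matters.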



cc @wenting-zhao 

Test Execution Environment for SWE-bench tasks #73
