Skip to content

Test Execution Enviroment for SweBench tasks #73

@345ishaan

Description

@345ishaan

I have recently being working on swebench where we built distributed eval on top of Modal for faster eval cycles. As a next step, I was hoping to use that setup to execute the patch generated by LLMs after the localization stage. I was wondering whether it is possible via the commit0 project.

Test execution feedback and search can improve the quality over Best-of-N or majority voting based approaches. Also, as part of this idea, we either need to predict the relevant unittests which affect the localized files or generate unittests using LLMs.

cc @wenting-zhao

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions