Repair Sub-Agent #443
Open
ElliotGestrin wants to merge 7 commits into algorithmicsuperintelligence:main
…fore sampling. Previously, the order of programs in the island could change between runs, leading to non-deterministic behavior in the test. This will also make the sampling process consistent across any runs.
Author
Even if the main PR content (the repair agent) is rejected, the last minor determinism change should probably still be included. Currently, setting a seed does not appear to be enough to guarantee identical behaviour across machines, because the iteration order of sets depends on hashing.
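To illustrate the determinism point, here is a minimal sketch of sampling from a set after sorting it. This is not the code from the PR: `sample_program_ids` and its parameters are hypothetical names, and it assumes program IDs are sortable (e.g. strings).

```python
import random


def sample_program_ids(program_ids, k, seed):
    """Pick k IDs reproducibly from a set of program IDs (hypothetical helper).

    Sorting first gives the sampler a fixed order to work on, so a given
    seed selects the same IDs no matter how the set happens to iterate.
    """
    rng = random.Random(seed)      # local RNG, leaves the global state untouched
    ordered = sorted(program_ids)  # canonical order, independent of hashing
    return rng.sample(ordered, k)
```

With the sort in place, two sets holding the same IDs yield the same sample for the same seed, regardless of how each set was built.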
When generating complex code (such as C++) or when using a more lightweight LLM, the model will likely introduce minor issues, e.g. declaring variables with the wrong type or similar easily corrected flaws. Currently, OpenEvolve simply evaluates the flawed program, likely scores it 0, and adds it to the population. It may then be used as a parent later, in the hope that a subsequent generation fixes the compilation issue.
Instead, it could make sense to attempt to correct the flaws immediately, before evaluating the child and adding it to the population. This PR enables such functionality, through an optional repair sub-agent.
The evaluator can raise an `EvaluatorRepairRequest` (defined in `evaluation_results.py`), which will be caught and trigger a repair agent, if one is available. This can be done anywhere in the evaluator and allows passing error messages and similar context to the repair agent, along with a default score to assign if the repair itself fails (or is disabled).

After each repair attempt, the evaluator is automatically called again and the repaired child is evaluated. This could trigger another repair request if needed, and the cycle can repeat a configurable number of times.
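The evaluate/repair cycle can be sketched roughly as follows. This is a simplified illustration, not the implementation in this PR: the exception class here is a stand-in with an assumed constructor, and `evaluate_with_repair`, `evaluate` and `repair` are hypothetical names.

```python
class EvaluatorRepairRequest(Exception):
    """Stand-in for the PR's exception; the real signature may differ."""

    def __init__(self, message, default_score=0.0):
        super().__init__(message)
        self.message = message              # error details for the repair agent
        self.default_score = default_score  # score used if repair fails or is off


def evaluate_with_repair(code, evaluate, repair, max_repair_attempts):
    """Evaluate code, repairing and re-evaluating when the evaluator asks.

    evaluate(code) returns a score or raises EvaluatorRepairRequest.
    repair(code, message) returns repaired code; None means no repair agent.
    Returns (score, final_code).
    """
    attempts = 0
    while True:
        try:
            return evaluate(code), code
        except EvaluatorRepairRequest as request:
            # No agent, or attempt budget used up: fall back to the default score.
            if repair is None or attempts >= max_repair_attempts:
                return request.default_score, code
            code = repair(code, request.message)
            attempts += 1
```

In this sketch, `max_repair_attempts=0` or a missing repair agent means the child simply receives the default score carried by the request.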
To enable the repair sub-agent, configure the new `config.yaml` keys. Setting `max_repair_attempts: 0` or `repair_on_failure: false` (the defaults) disables the repair sub-agent. A separate (likely smaller) LLM can be used for the repair process, and new prompts (`repair_diff_user.txt` and `repair_full_rewrite_user.txt`) are available to configure the repair behaviour.

The repair changelogs are stored and, if this is enabled, can be visualized in a new tab in the visualizer.
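As a rough illustration of how the two keys interact, the sketch below models them as a small dataclass. `RepairConfig` and its `enabled` property are hypothetical and only mirror the key names and defaults described above; the actual config classes in the PR may be structured differently.

```python
from dataclasses import dataclass


@dataclass
class RepairConfig:
    """Hypothetical mirror of the two repair-related config.yaml keys."""

    max_repair_attempts: int = 0     # default: no repair attempts
    repair_on_failure: bool = False  # default: repair switched off

    @property
    def enabled(self) -> bool:
        # Either default value on its own is enough to disable repair.
        return self.repair_on_failure and self.max_repair_attempts > 0
```

Both keys have to be changed from their defaults before the repair sub-agent runs, e.g. `RepairConfig(max_repair_attempts=3, repair_on_failure=True)`.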
Happy to discuss and adapt this implementation to better fit the project. Thanks for the great work!