Commit e03f147

henrique committed
rm problems implementation, these will be on the main repo only
1 parent 0cef4ad commit e03f147

205 files changed

Lines changed: 3 additions & 15100 deletions

README.md

Lines changed: 3 additions & 50 deletions
@@ -13,18 +13,17 @@
 
 </div>
 
-This is a companion repository to [science-codeevolve](https://github.com/inter-co/science-codeevolve), and contains the complete experimental setup, benchmark implementations, and reproducibility code for the CodeEvolve [paper](https://arxiv.org/abs/2510.14150).
+This is a companion repository to [science-codeevolve](https://github.com/inter-co/science-codeevolve), and contains the complete experimental setup and results for the CodeEvolve [paper](https://arxiv.org/abs/2510.14150).
 
 ## Overview
 
 This repository provides:
 
-- **Complete benchmark problems** used in the paper's evaluation
 - **Experimental configurations** for reproducing all results
 - **Raw experimental data** from paper runs (`.pkl`, `.py`, `.txt` files)
 - **Analysis notebooks** with visualizations and statistical tests
 
-All experiments validate CodeEvolve's performance on algorithmic discovery tasks from mathematics, demonstrating competitive or superior results compared to closed-source systems like Google DeepMind's AlphaEvolve, and other open-source frameworks for algorithmic discovery.
+The benchmark problems themselves are implemented in the main [science-codeevolve](https://github.com/inter-co/science-codeevolve) repository.
 
 ## Repository Structure
 
@@ -34,7 +33,6 @@ science-codeevolve-experiments/
 ├── notebooks/ # Analysis and visualization
 │ ├── experiment_analysis.ipynb # Main analysis notebook
 │ └── figs/ # Generated figures from paper
-├── problems/ # Benchmark problem definitions
 └── README.md
 ```
 
@@ -49,11 +47,6 @@ science-codeevolve-experiments/
 - **`notebooks/`**: Jupyter notebooks for analysis
 - `experiment_analysis.ipynb`: Statistical analysis and comparisons
 
-- **`problems/`**: Problem definitions with:
-- Initial solution (`input/`)
-- Configuration files for different LLMs (`configs/`)
-- Evaluation scripts
-
 ## Prerequisites
 
 ### Install CodeEvolve Framework
@@ -86,47 +79,7 @@ export API_KEY=your_api_key_here
 export API_BASE=your_api_base_url
 ```
 
-## Running a Benchmark Problem
-
-Each problem has configuration files for different LLM providers (Gemini, Qwen, etc.). Here's how to run an experiment:
-
-```bash
-# Example: Circle packing in a square (26 circles) with Qwen
-codeevolve \
---inpt_dir=problems/alphaevolve_math_problems/packing_problems/circle_packing_square/26/input \
---cfg_path=problems/alphaevolve_math_problems/packing_problems/circle_packing_square/26/configs/qwen_config.yaml \
---out_dir=results/circle_packing_26_qwen \
---terminal_logging
-
-# Example: First autocorrelation inequality with Gemini
-codeevolve \
---inpt_dir=problems/alphaevolve_math_problems/autocorrelation_problems/first_autocorr_ineq/input \
---cfg_path=problems/alphaevolve_math_problems/autocorrelation_problems/first_autocorr_ineq/configs/gemini_config.yaml \
---out_dir=results/autocorr_first_gemini \
---terminal_logging
-```
-
-## Resuming from Checkpoints
-
-To resume an interrupted run:
-
-```bash
-codeevolve \
---inpt_dir=problems/alphaevolve_math_problems/packing_problems/circle_packing_square/26/input \
---out_dir=results/circle_packing_26_qwen \
---load_ckpt=-1 # Load latest checkpoint
-```
-
-Or load a specific checkpoint epoch:
-
-```bash
-codeevolve \
---inpt_dir=problems/alphaevolve_math_problems/packing_problems/circle_packing_square/26/input \
---out_dir=results/circle_packing_26_qwen \
---load_ckpt=100 # Load checkpoint from epoch 100
-```
-
-### Reproducibility
+## Reproducibility
 
 This repository supports two distinct notions of reproducibility:
 

problems/alphaevolve_math_problems/autocorrelation_problems/first_autocorr_ineq/configs/ablations_comp/qwen_config.yaml

Lines changed: 0 additions & 89 deletions
This file was deleted.

problems/alphaevolve_math_problems/autocorrelation_problems/first_autocorr_ineq/configs/ablations_comp/qwen_naive.yaml

Lines changed: 0 additions & 89 deletions
This file was deleted.

problems/alphaevolve_math_problems/autocorrelation_problems/first_autocorr_ineq/configs/ablations_comp/qwen_no_evolve.yaml

Lines changed: 0 additions & 89 deletions
This file was deleted.
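
The `problems/` tree deleted by this commit still exists in its parent commit. As a minimal sketch, assuming a local clone of this repository and `git` on the path, the hypothetical helper `restore_deleted` below (not part of the repository) checks a path out from a commit's first parent:

```shell
# Hypothetical helper, not part of the repository.
# Restores a path deleted by <commit> from that commit's first parent.
# Usage: restore_deleted <commit> <path>
restore_deleted() {
  commit="$1"
  path="$2"
  # "<commit>^" names the first parent, where the path still exists.
  git checkout "${commit}^" -- "$path"
}
```

For this commit the call would be `restore_deleted e03f147 problems/`, which restores the files to the working tree and stages them in the index.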
