Thank you for your great work on InternAgent. The project demonstrates strong performance on reasoning benchmarks such as HLE, which is very impressive.
However, I was not able to find the evaluation code or scripts for running InternAgent on HLE (or similar reasoning benchmarks) in the repository. It would be extremely helpful if the team could share the corresponding evaluation pipeline, including data preprocessing, prompt templates, and inference settings, to facilitate reproducibility and further research.
Hi @xiaolin-cs, @VigneshHexo, and everyone,
Thanks for your interest and patience! We’re happy to share that we have now open-sourced the complete code of InternAgent 1.5.
You can find the full implementation in the repository. We hope this makes it easier for the community to reproduce our results, explore the framework, and build upon it.
Please feel free to open new issues or discussions if you have any questions, feedback, or suggestions. Contributions are also very welcome!
Thanks again for your support.