The official GitHub repository of the paper "Recent advances in large language model benchmarks against data contamination: From static to dynamic evaluation"
Updated Sep 13, 2025
EvaLearn is a pioneering benchmark designed to evaluate large language models (LLMs) on their learning capability and efficiency in challenging tasks.