IndiMathBench

An evaluation benchmark for Olympiad competition math in Lean 4, based on Indian Mathematical Olympiads.

IndiMathBench is a benchmark for evaluating automated theorem provers on 312 competition-level Olympiad mathematics problems formalized in Lean 4. All problems are produced through a structured Human–AI collaborative pipeline and are fully human-verified. The dataset is sourced from the Indian National Mathematical Olympiad (INMO) and the Regional Mathematical Olympiad (RMO), two long-running annual competitions in India.
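
Each benchmark entry is a Lean 4 theorem statement to be proved. As a purely hypothetical illustration (the statement below is not taken from the dataset), an entry might look like the following, with the proof left as `sorry` for the prover to complete:

```lean
import Mathlib

/-- Hypothetical example of a benchmark-style entry (not an actual
IndiMathBench problem): show that 3 divides n^3 + 2n for every natural n.
The proof is left as `sorry` for an automated prover to fill in. -/
theorem sample_divisibility (n : ℕ) : 3 ∣ n ^ 3 + 2 * n := by
  sorry
```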

The benchmark contributes 98 new Geometry problems and 45 new Combinatorics problems, two domains that are underrepresented in existing formal-reasoning benchmarks. A public leaderboard for evaluation results will be released soon!

For submissions or questions, please contact parambiyani8@gmail.com. Feedback is welcome; please open an issue if you find errors or would like to suggest improvements.

Problem Domains

| Category | Count |
|---|---|
| Geometry | 98 |
| Algebra | 92 |
| Set Theory & Combinatorics | 45 |
| Number Theory | 77 |

Evaluation Results

| Model | Scaffold | Resolution |
|---|---|---|
| Claude Sonnet 4 | ReAct, 10 turns | 4/312 |
| Gemini 2.5 Pro | ReAct, 10 turns | 12/312 |
| GPT-5 | ReAct, 10 turns | 36/312 |
| DeepSeekProver V2 7B | ReAct, 10 turns | 24/312 |
| GodelProver V2 8B | ReAct, 10 turns | 36/312 |

A detailed description of the 10-turn ReAct scaffold is given in our paper!
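
As a rough, hypothetical sketch only (not the scaffold used for the results above; see the paper for the actual setup), a generic 10-turn propose-check-refine ReAct loop could look like the Python below, where `query_model` and `check_with_lean` are placeholder names standing in for an LLM call and a Lean 4 compilation check:

```python
# Rough, hypothetical sketch of a generic 10-turn ReAct-style proving loop.
# This is NOT the IndiMathBench scaffold; `query_model` and `check_with_lean`
# are placeholders for an LLM call and a Lean 4 type-check, respectively.

MAX_TURNS = 10


def query_model(prompt: str) -> str:
    """Placeholder for an LLM call that returns a candidate Lean 4 proof."""
    raise NotImplementedError


def check_with_lean(candidate: str) -> tuple[bool, str]:
    """Placeholder that compiles the candidate and returns (success, error log)."""
    raise NotImplementedError


def react_attempt(theorem_statement: str) -> bool:
    """Attempt one benchmark problem within MAX_TURNS reason/act cycles."""
    feedback = ""
    for _turn in range(MAX_TURNS):
        # Reason + act: ask the model for a proof, including any prior Lean errors.
        prompt = (
            f"Prove the following Lean 4 theorem:\n{theorem_statement}\n\n"
            f"Errors from the previous attempt:\n{feedback}"
        )
        proof = query_model(prompt)
        # Observe: type-check the candidate proof and collect the error log.
        ok, error_log = check_with_lean(proof)
        if ok:
            return True       # problem counted as resolved
        feedback = error_log  # feed the observation into the next turn
    return False              # unresolved after 10 turns
```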

Citation

The associated paper for IndiMathBench is available at https://arxiv.org/abs/2512.00997. Please consider citing it if you find IndiMathBench useful:

@misc{biyani2025indimathbenchautoformalizingmathematicalreasoning,
      title={IndiMathBench: Autoformalizing Mathematical Reasoning Problems with a Human Touch}, 
      author={Param Biyani and Shashank Kirtania and Yasharth Bajpai and Sumit Gulwani and Ashish Tiwari},
      year={2025},
      eprint={2512.00997},
      archivePrefix={arXiv},
      primaryClass={cs.AI},
      url={https://arxiv.org/abs/2512.00997}, 
}
