OptMATH is a scalable framework for synthesizing high-quality optimization modeling datasets. The framework consists of a bidirectional pipeline that:
- Generates problem data (PD) with controllable complexity from seed mathematical formulations (MF)
- Creates natural language (NL) descriptions through backtranslation
- Validates the correspondence between NL and PD through forward modeling and rejection sampling
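The three pipeline steps above can be sketched as a rejection-sampling loop. The snippet below is a toy illustration only, not the actual OptMATH code: the seed formulation is a one-variable knapsack, and `generate_problem_data`, `backtranslate`, and `forward_model` are hypothetical deterministic stand-ins for the LLM calls (with `forward_model` occasionally injecting an error, to show why rejection sampling is needed).

```python
import random
import re

# Toy sketch of the bidirectional pipeline; all names are hypothetical
# placeholders, NOT the actual OptMATH implementation. The seed mathematical
# formulation (MF) is a one-variable knapsack: maximize v*x s.t. w*x <= cap.

def generate_problem_data(rng):
    """Instantiate problem data (PD) with controllable complexity."""
    return {"v": rng.randint(1, 9), "w": rng.randint(1, 5), "cap": rng.randint(5, 20)}

def solve(pd):
    """Ground-truth solver for the seed formulation."""
    return pd["v"] * (pd["cap"] // pd["w"])

def backtranslate(pd):
    """Stand-in for LLM backtranslation: PD -> natural-language (NL) description."""
    return (f"Each item is worth {pd['v']} and weighs {pd['w']}. "
            f"What is the maximum value that fits in a knapsack of capacity {pd['cap']}?")

def forward_model(nl, rng):
    """Stand-in for LLM forward modeling: NL -> recovered PD (sometimes unfaithful)."""
    v, w, cap = (int(t) for t in re.findall(r"\d+", nl))
    if rng.random() < 0.3:  # simulate an occasional modeling error
        v += 1
    return {"v": v, "w": w, "cap": cap}

def synthesize(rng, max_attempts=5):
    """Rejection sampling: accept an (NL, PD) pair only if the NL, once
    forward-modeled and solved, reproduces the ground-truth optimum."""
    pd = generate_problem_data(rng)
    for _ in range(max_attempts):
        nl = backtranslate(pd)
        if solve(forward_model(nl, rng)) == solve(pd):
            return nl, pd
    return None  # rejected: NL and PD could not be verified to correspond

pair = synthesize(random.Random(0))
```

A faithful forward model always round-trips in this sketch, so rejection sampling only discards pairs where the (simulated) LLM misreads the description, which mirrors the validation step described above.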

- 2026/01/06 - Released OptMATH models on Hugging Face: https://huggingface.co/Aurora-Gem/models
- 2025/05/01 - Our paper has been accepted as a poster presentation at ICML 2025!
- 2025/02/21 - We have released our 200K training dataset (OptMATH-Train) and OptMATH-Bench.
- Scalable data synthesis framework for optimization modeling
- Coverage of 10+ real-world applications through 53 seed generators
- Released OptMATH-Train with over 200K high-quality training instances and OptMATH-Bench, a challenging benchmark pushing the boundaries of LLM capabilities
- State-of-the-art performance on multiple benchmarks
OptMATH consists of two main components:

OptMATH-Train:
- Over 200K high-quality and diverse optimization problems
- Covers diverse optimization scenarios including logistics, supply chain, and manufacturing

OptMATH-Bench:
A challenging benchmark comprising "hard instances" characterized by:
- Extended natural language contexts (2.9× longer than MAMO EasyLP)
- Complex constraints
- Coverage of various problem types (LP, MILP, IP, NLP, SOCP)

We use the LLaMAFactory framework for fine-tuning. For more details, please refer to https://github.com/hiyouga/LLaMA-Factory.
The primary results are presented in Table 1. First, our best-performing model, OptMATH-Qwen2.5-32B, achieves superior performance across all benchmarks, surpassing proprietary large language models such as GPT-3.5-Turbo, GPT-4, and DeepSeek-V3, despite these models having tens of times more parameters. Furthermore, OptMATH-Qwen2.5-7B outperforms ORLM-LLaMA-3-8B, a model of comparable size, on all benchmarks, and trails DeepSeek-V3 only marginally. Collectively, these results demonstrate that training on OptMATH-Train significantly enhances a model's optimization modeling capabilities.

As shown in the figure below, the performance of Qwen2.5-1.5B across benchmarks varies with the amount of training data. The model shows notable improvements in optimization modeling even with a small portion of the OptMATH-Train dataset, and the gains gradually level off as more data is added, a typical diminishing-returns pattern.

@inproceedings{Lu2025OptMATHAS,
  title={OptMATH: A Scalable Bidirectional Data Synthesis Framework for Optimization Modeling},
  author={Hongliang Lu and Zhonglin Xie and Yaoyu Wu and Can Ren and Yuxuan Chen and Zaiwen Wen},
  year={2025},
  url={https://api.semanticscholar.org/CorpusID:276407996}
}
We hope that the package is useful for your application. If you have any bug reports or comments, please feel free to email one of the authors:
- Hongliang Lu, lhl@pku.edu.cn
- Zhonglin Xie, zlxie@pku.edu.cn
- Zaiwen Wen, wenzw@pku.edu.cn
