New open-source math model Light-R1-32B surpasses the equivalent DeepSeek performance with just $1,000 in training costs
A team of researchers has introduced Light-R1-32B, a new open-source AI model optimized for solving sophisticated math problems, making it available on Hugging Face under the permissive Apache 2.0 license, free for enterprises and researchers to take, deploy, fine-tune or modify as they wish, even for commercial purposes.
The 32-billion-parameter model (the number of model settings) surpasses the performance of similarly sized (and even larger) open-source models such as DeepSeek-R1-Distill-Llama-70B and DeepSeek-R1-Distill-Qwen-32B on the third-party benchmark American Invitational Mathematics Examination (AIME), which contains 15 math problems designed for extremely advanced students and allots human test-takers 3 hours to complete them.

Developed by Liang Wen, Fenrui Xiao, Xin He, Yunke Cai, Qi An, Zhenyu Duan, Yimin Du, Junchen Liu, Lifu Tang, Xiaowei Lv, Haosheng Zou and others, the model surpasses previous open-source alternatives on competitive math benchmarks.
Remarkably, the researchers completed the model's training in fewer than six hours on 12 Nvidia H800 GPUs at an estimated total cost of about $1,000. This makes Light-R1-32B one of the most accessible and practical approaches to developing high-performing math-specialized AI models. However, it is important to remember that the model was trained on a variant of Alibaba's open-source Qwen 2.5-32B-Instruct, which itself is presumed to have carried much higher upfront training costs.
Along with the model, the team released its training datasets, training scripts and evaluation tools, providing a transparent and accessible framework for building math-focused AI models.
The arrival of Light-R1-32B follows similar efforts from rivals, such as Microsoft with its Orca-Math series.
A new math king arises
Light-R1-32B is built to tackle complex mathematical reasoning, particularly on the AIME (American Invitational Mathematics Examination) benchmarks.
It was trained from Qwen2.5-32B-Instruct, starting from a model without long chain-of-thought (COT) reasoning. The team applied curriculum-based supervised fine-tuning (SFT) and direct preference optimization (DPO) to refine its problem-solving capabilities.
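For readers curious about the mechanics, DPO optimizes a model directly on pairs of preferred and rejected answers, without a separate reward model. The following is a minimal, illustrative sketch of the standard DPO loss in PyTorch; it is not the team's actual training code, and the function and tensor names are assumptions:

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps: torch.Tensor,
             policy_rejected_logps: torch.Tensor,
             ref_chosen_logps: torch.Tensor,
             ref_rejected_logps: torch.Tensor,
             beta: float = 0.1) -> torch.Tensor:
    """Standard DPO loss (Rafailov et al., 2023), illustrative only.

    Each tensor holds summed log-probabilities that the trainable policy
    or the frozen reference model assigns to the preferred ("chosen")
    and dispreferred ("rejected") responses in a batch.
    """
    # Implicit reward: how much the policy diverges from the reference
    chosen_rewards = policy_chosen_logps - ref_chosen_logps
    rejected_rewards = policy_rejected_logps - ref_rejected_logps
    # Push apart the margin between chosen and rejected responses
    return -F.logsigmoid(beta * (chosen_rewards - rejected_rewards)).mean()
```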
When evaluated, Light-R1-32B achieved 76.6 on AIME24 and 64.6 on AIME25, surpassing DeepSeek-R1-Distill-Qwen-32B, which scored 72.6 and 54.9, respectively.
This improvement suggests that the curriculum-based training approach effectively enhances mathematical reasoning, even when training from models that initially lack long COT.
Fair comparison
To ensure fair benchmarking, the team decontaminated its training data against common reasoning benchmarks, including AIME24/25, MATH-500 and GPQA Diamond, preventing data leakage.
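Decontamination typically means removing any training example that overlaps textually with a benchmark question. One common recipe, sketched below purely as an assumption (the team's exact procedure and thresholds are not described in this article), filters on shared word n-grams:

```python
def ngrams(text: str, n: int = 13) -> set[str]:
    """Return the set of word-level n-grams in a string."""
    words = text.lower().split()
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}

def decontaminate(train_examples: list[str],
                  benchmark_problems: list[str],
                  n: int = 13) -> list[str]:
    """Drop training examples sharing any n-gram with a benchmark item."""
    benchmark_grams: set[str] = set()
    for problem in benchmark_problems:
        benchmark_grams |= ngrams(problem, n)
    return [ex for ex in train_examples
            if not (ngrams(ex, n) & benchmark_grams)]
```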
They also applied difficulty-based filtering using DeepScaleR-1.5B-Preview, ultimately forming a 76,000-example dataset for the first stage of supervised fine-tuning. A second, more difficult dataset of 3,000 examples further improved performance.
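Difficulty filtering of this kind usually works by sampling a weaker model several times per problem and keeping only the problems it tends to fail. The sketch below is a hypothetical illustration; `solve` and `check` are caller-supplied stand-ins, and the thresholds are not the team's published settings:

```python
def difficulty_filter(problems: list[dict], solve, check,
                      num_samples: int = 8,
                      max_pass_rate: float = 0.5) -> list[dict]:
    """Keep only problems a weaker reference model usually fails.

    `solve(question)` samples one answer from the small model and
    `check(answer, reference)` returns True if it is correct; both are
    assumed helpers, not part of the Light-R1 release.
    """
    hard = []
    for problem in problems:
        passes = sum(
            check(solve(problem["question"]), problem["answer"])
            for _ in range(num_samples)
        )
        if passes / num_samples <= max_pass_rate:
            hard.append(problem)
    return hard
```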
After training, the team merged multiple trained versions of Light-R1-32B, leading to additional gains. Notably, the model maintains strong generalization abilities on scientific reasoning tasks (GPQA), despite being math-specialized.
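The article does not specify the merging method, but a common recipe for combining fine-tuned checkpoints is plain weighted parameter averaging (a "model soup"). A minimal sketch under that assumption:

```python
import torch

def merge_checkpoints(state_dicts: list[dict],
                      weights: list[float] | None = None) -> dict:
    """Merge fine-tuned checkpoints by weighted parameter averaging.

    Assumes all checkpoints share the same architecture and keys; this
    is one common merging recipe, not necessarily the one the Light-R1
    team used.
    """
    if weights is None:
        weights = [1.0 / len(state_dicts)] * len(state_dicts)
    merged = {}
    for key in state_dicts[0]:
        merged[key] = sum(w * sd[key].float()
                          for w, sd in zip(weights, state_dicts))
    return merged
```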
How businesses can benefit
Light-R1-32B is released under the Apache License 2.0, a permissive open-source license that allows free use, modification and commercial deployment without requiring that derivative works be open-sourced.
This makes it an attractive option for enterprises, AI developers and software engineers looking to integrate or customize the model for proprietary applications.
The license also includes a royalty-free, worldwide patent grant, reducing legal risks for businesses while discouraging patent disputes. Companies can freely deploy Light-R1-32B in commercial products, maintaining full control over their innovations while benefiting from an open and transparent AI ecosystem.
For CEOs, CTOs and IT leaders, Apache 2.0 ensures cost efficiency and vendor independence, eliminating licensing fees and restrictive dependencies on proprietary AI solutions. AI developers and engineers gain the flexibility to fine-tune, integrate and extend the model without restrictions, making it ideal for specialized math reasoning, research and enterprise AI applications. However, since the license provides no warranty or liability coverage, organizations should conduct their own security, compliance and performance assessments before deploying Light-R1-32B in critical environments.
Transparency in low-cost training and optimization for math problem-solving
The researchers emphasize that Light-R1-32B provides a validated, cost-effective way to train strong long chain-of-thought (COT) models in specialized domains.
By sharing their methodology, training data and code, they aim to lower the cost barriers to high-performance AI development.
Future work includes exploring reinforcement learning (RL) to further improve the model's reasoning capabilities.