Microsoft Research unveiled rStar-Math, a framework demonstrating that small language models (SLMs) can achieve mathematical reasoning comparable to, and in some cases exceeding, that of larger models such as OpenAI’s o1-mini. It does so without distillation from more capable models, a novel approach to improving the reasoning capabilities of small models.
At the core of rStar-Math is a method known as Monte Carlo Tree Search (MCTS), which enables SLMs to engage in iterative, step-by-step reasoning. This process is guided by a reward model, also based on an SLM, that evaluates the quality of intermediate steps and refines reasoning paths. Through a self-evolutionary process, rStar-Math continuously improves both its models and the…
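To make the idea concrete, here is a minimal, self-contained sketch of MCTS-guided step-by-step search. It is not rStar-Math's implementation: the real system uses an SLM policy to propose reasoning steps and an SLM-based process reward model to score them, whereas this toy replaces both with stand-in functions (`propose_steps` and `reward`) over a trivial arithmetic domain. Only the search loop itself, with its selection, expansion, rollout, and backpropagation phases, reflects standard MCTS.

```python
import math
import random

class Node:
    """One node in the search tree; `steps` is a partial reasoning trace."""
    def __init__(self, steps, parent=None):
        self.steps = steps
        self.parent = parent
        self.children = []
        self.visits = 0
        self.value = 0.0  # accumulated reward from rollouts through this node

    def ucb(self, c=1.4):
        # Upper Confidence Bound: balance exploitation vs. exploration.
        if self.visits == 0:
            return float("inf")
        return self.value / self.visits + c * math.sqrt(
            math.log(self.parent.visits) / self.visits)

TARGET = 7      # toy goal: pick steps from {1,2,3} whose sum hits TARGET
MAX_DEPTH = 4   # fixed trace length for this toy domain

def propose_steps(steps):
    # Stand-in for the SLM policy: enumerate candidate next steps.
    return [steps + [k] for k in (1, 2, 3)]

def reward(steps):
    # Stand-in for the SLM-based reward model: 1.0 for an exact hit,
    # partial credit otherwise.
    total = sum(steps)
    return 1.0 if total == TARGET else max(0.0, 1 - abs(TARGET - total) / TARGET)

def mcts(iterations=200, seed=0):
    random.seed(seed)
    root = Node([])
    for _ in range(iterations):
        # 1. Selection: descend via UCB until a leaf.
        node = root
        while node.children:
            node = max(node.children, key=Node.ucb)
        # 2. Expansion: grow the leaf if the trace is incomplete.
        if len(node.steps) < MAX_DEPTH:
            node.children = [Node(s, node) for s in propose_steps(node.steps)]
            node = random.choice(node.children)
        # 3. Rollout: randomly complete the trace, then score it.
        steps = list(node.steps)
        while len(steps) < MAX_DEPTH:
            steps = random.choice(propose_steps(steps))
        r = reward(steps)
        # 4. Backpropagation: credit every node on the path.
        while node:
            node.visits += 1
            node.value += r
            node = node.parent
    # Read out the most-visited path as the final reasoning trace.
    node = root
    while node.children:
        node = max(node.children, key=lambda n: n.visits)
    return node.steps

best = mcts()
print(best, sum(best))
```

In rStar-Math, the reward signal is applied per intermediate step rather than only at the end, which is what lets the search prune weak reasoning paths early; this sketch scores completed traces only, for brevity.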