Researchers from Microsoft Research Asia, Peking University, and Xi’an Jiaotong University have developed a new technique to improve large language models’ (LLMs) ability to solve math problems by having them learn from their mistakes, akin to how humans learn.
The technique, called Learning from Mistakes (LeMa), fine-tunes models on corrections of their own mistakes, which improves their reasoning, according to a research paper published this week.
The researchers drew inspiration from human learning processes, where a student learns from their mistakes to improve future performance.
“Consider a human student who failed to solve a math problem, he will learn from what mistake he has made and how to correct it,” the authors explained. They then applied this concept to LLMs, using mistake-correction data pairs generated by GPT-4 to fine-tune them.
The researchers first had models like LLaMA-2 generate flawed reasoning paths for math word problems. GPT-4 then identified the errors in each path, explained them, and provided a corrected reasoning path. The researchers used the resulting mistake-correction data to further train the original models.
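The data-construction step described above can be sketched in Python as follows. This is a minimal illustration, not the authors' released code: the `Correction` fields, the helper names, the prompt wording, and the rule of discarding corrections whose final answer does not match the reference are all assumptions made for the example, and the corrector's output is passed in directly rather than fetched from any real API.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class Correction:
    """One corrector output (GPT-4 in the article). Field names are hypothetical."""
    mistake_step: int     # 1-based index of the first incorrect step
    explanation: str      # why that step is wrong
    corrected_path: str   # full corrected reasoning, ending in the answer


def final_answer(path: str) -> Optional[str]:
    """Return the last number that appears in a reasoning path, if any."""
    # Commas are dropped so that "1,200" is read as 1200 (a simplification).
    numbers = re.findall(r"-?\d+(?:\.\d+)?", path.replace(",", ""))
    return numbers[-1] if numbers else None


def is_flawed(path: str, gold: str) -> bool:
    """Treat a path as flawed when its final answer differs from the reference."""
    answer = final_answer(path)
    return answer is None or float(answer) != float(gold)


def to_training_pair(question: str, wrong_path: str,
                     correction: Correction, gold: str) -> Optional[dict]:
    """Build one mistake-correction pair, or None if the correction is also wrong."""
    if is_flawed(correction.corrected_path, gold):
        return None  # assumed filter: keep only corrections that reach the reference
    prompt = (
        f"Question: {question}\n"
        f"Incorrect solution: {wrong_path}\n"
        "Identify the first incorrect step, explain it, and give a corrected solution."
    )
    completion = (
        f"Incorrect step: {correction.mistake_step}\n"
        f"Explanation: {correction.explanation}\n"
        f"Corrected solution: {correction.corrected_path}"
    )
    return {"prompt": prompt, "completion": completion}


# Toy usage with hand-written data standing in for model and corrector outputs.
question = "Tom has 3 boxes with 4 apples each. How many apples does he have?"
wrong_path = "Step 1: 3 + 4 = 7. The answer is 7."
correction = Correction(
    mistake_step=1,
    explanation="The boxes each hold 4 apples, so the counts should be multiplied.",
    corrected_path="Step 1: 3 * 4 = 12. The answer is 12.",
)
pair = to_training_pair(question, wrong_path, correction, gold="12")
```

Pairs built this way would then be written out as fine-tuning examples. Checking the corrected answer against the reference is a cheap guard against adding a faulty correction to the training set.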
The approach yields consistent gains. “Across five backbone LLMs and two mathematical reasoning tasks, LeMa consistently improves the performance compared with fine-tuning on CoT data alone,” the researchers explain, where CoT refers to chain-of-thought reasoning data.
What’s more, specialized LLMs like WizardMath and MetaMath also benefited from LeMa, achieving 85.4% pass@1 accuracy on GSM8K and 27.1% on MATH. These results surpass the state-of-the-art performance achieved on these challenging tasks by non-execution open-source models, meaning open-source models that do not run code to compute their answers.
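For readers unfamiliar with the metric, pass@1 can be read here as the fraction of problems a model solves with a single generated answer per problem. The sketch below assumes final answers have already been extracted as strings and are compared by exact match. The function name and the sample data are illustrative, not taken from the paper.

```python
from typing import Sequence


def pass_at_1(predictions: Sequence[str], references: Sequence[str]) -> float:
    """Fraction of problems whose single predicted answer matches the reference."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        raise ValueError("at least one problem is required")
    # Exact string match after trimming whitespace (an assumed comparison rule).
    correct = sum(p.strip() == r.strip() for p, r in zip(predictions, references))
    return correct / len(references)


# Toy data: three of four answers match the references.
score = pass_at_1(["12", "7", "5", "9"], ["12", "8", "5", "9"])
print(f"pass@1 = {score:.1%}")
```

Under this reading, a reported 85.4% on GSM8K means the model's one answer was correct on roughly 85 of every 100 test problems.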
This breakthrough signifies more than just an enhancement in the reasoning capability of AI models. It also marks a significant step towards AI systems that can learn and improve from their mistakes, much like humans do.
The team’s research, including their code, data, and models, is now publicly available on GitHub. This open-source approach encourages the broader AI community to continue this line of exploration, potentially leading to further advancements in machine learning.
The advent of LeMa represents a major milestone in AI, suggesting that machine learning (ML) processes can be made more akin to human learning. This development could revolutionize sectors heavily reliant on AI, such as healthcare, finance, and autonomous vehicles, where error correction and continuous learning are critical.
As the AI field continues to evolve rapidly, the integration of human-like learning processes, such as learning from mistakes, appears to be an essential factor in developing more efficient and effective AI systems.