Training LLMs to know when to use tools and when to think on their own
https://arxiv.org/abs/2411.00412
🤖 Original Problem:
LLMs struggle with complex scientific problems and often hallucinate answers. Integrating them with external tools helps, but it induces over-reliance: models invoke tools even for simple problems that could be solved through basic reasoning.
-----
🔧 Solution in this Paper:
→ Introduces a two-component training method: World Knowledge Distillation (WKD) and Tool Usage Adaptation (TUA)
→ WKD fine-tunes the LLM on solutions generated with tool assistance, distilling that domain knowledge into the model's own weights
→ TUA classifies problems as easy/hard based on model accuracy, then trains the model to use direct reasoning for easy problems while leveraging tools for hard ones
→ Implements a mixed loss function to maintain knowledge consistency across different prompting strategies
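The TUA step above can be sketched in a few lines. This is a minimal illustration of the idea, not the paper's implementation: the function names, data layout, accuracy threshold, and the exact form of the mixed loss are all assumptions for demonstration.

```python
# Hedged sketch of Tool Usage Adaptation (TUA): split problems by the
# model's no-tool accuracy, then train easy ones toward direct reasoning
# and hard ones toward tool-use traces. All names/values are illustrative.

def split_by_difficulty(problems, accuracy, threshold=0.5):
    """Label problems easy/hard from the base model's direct-answer accuracy."""
    easy = [p for p in problems if accuracy[p["id"]] >= threshold]
    hard = [p for p in problems if accuracy[p["id"]] < threshold]
    return easy, hard

def build_targets(easy, hard):
    """Easy problems target direct reasoning; hard ones target tool traces."""
    return ([{"prompt": p["q"], "target": p["direct"]} for p in easy]
            + [{"prompt": p["q"], "target": p["tool_trace"]} for p in hard])

def mixed_loss(loss_direct, loss_tool, alpha=0.5):
    """Assumed weighted-sum form of the mixed loss that keeps knowledge
    consistent across the two prompting strategies."""
    return alpha * loss_direct + (1.0 - alpha) * loss_tool
```

In this framing, the threshold and the mixing weight `alpha` are hyperparameters; the paper's actual difficulty criterion and loss weighting may differ.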
-----
💡 Key Insights:
→ LLMs can be trained to make intelligent decisions about tool usage similar to human experts
→ A two-stage training approach enables adaptive problem-solving based on complexity
→ The method shows robustness against noisy training data
→ The approach works across diverse scientific domains from math to climate science
-----
📊 Results:
→ 28.18% improvement in answer accuracy across all datasets
→ 13.89% increase in tool usage precision
→ Outperformed state-of-the-art models including GPT-4 and Claude-3.5
→ Showed 61.80% to 75.50% tool usage accuracy across different domains