A new reasoning strategy that enhances small language models' problem-solving abilities while requiring just 3% of the training data used by conventional Chain-of-Thought fine-tuning.
https://arxiv.org/abs/2412.09906
Original Problem 🤔:
→ Small Language Models (SLMs) under 10B parameters struggle with complex reasoning tasks, and improving them typically demands extensive Chain-of-Thought (CoT) training data that is costly to obtain.
→ Current CoT methods entangle reasoning logic with numerical computation in a single generation, so early mistakes accumulate through later steps and reduce effectiveness.
Solution in this Paper 💡:
→ Introduces Solution Guidance (SG) - a reasoning strategy focusing on problem understanding and decomposition at semantic/logical levels.
→ Implements Solution-Guidance Fine-Tuning (SGFT), requiring only 3,000 SG samples versus 30,000 CoT samples.
→ Uses Layer-wise Importance Sampling for memory-efficient training on consumer GPUs.
→ Employs collaborative inference between SG-trained and base models.
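The collaborative inference described above can be sketched as a two-stage pipeline: the SG-tuned model only decomposes the problem, and the base model follows that guidance to do the actual computation. This is a minimal illustration with stubbed model calls; the function names and prompt format are hypothetical, not from the paper, and in practice both stubs would be LLM `generate()` calls (e.g. an SGFT-tuned Qwen2-7B for guidance and the untuned base model for computation).

```python
# Hypothetical sketch of two-stage collaborative inference (stubbed models).

def sg_model(problem: str) -> str:
    """Stub for the SG-fine-tuned model: decomposes the problem at the
    semantic/logical level without performing any arithmetic."""
    return (
        "Step 1: Find the total number of apples bought.\n"
        "Step 2: Subtract the apples eaten to get the remainder."
    )

def base_model(prompt: str) -> str:
    """Stub for the base model: carries out the computation by
    following the guidance embedded in the prompt."""
    return "5 + 3 = 8; 8 - 2 = 6. Answer: 6"

def collaborative_inference(problem: str) -> str:
    # Stage 1: the SG model produces solution guidance only (no math).
    guidance = sg_model(problem)
    # Stage 2: the base model receives problem + guidance and computes.
    prompt = f"Problem: {problem}\nGuidance:\n{guidance}\nSolve step by step."
    return base_model(prompt)

answer = collaborative_inference(
    "Tom buys 5 apples, then 3 more, and eats 2. How many are left?"
)
print(answer)
```

Keeping guidance generation and computation in separate model calls is what lets a small SG model be trained on only ~3,000 samples: it never has to learn the arithmetic itself.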
Key Insights 🔍:
→ Problem decomposition at semantic level reduces error propagation
→ Context-aware prompting significantly improves model performance
→ Separating guidance and computation tasks enhances accuracy
Results 📊:
→ Achieves 48.3% accuracy on GSM8K using Qwen2-7B
→ Improves MultiArith performance by 72.9%
→ Requires only 3% of traditional CoT training data
→ Shows 79.8% accuracy on StrategyQA tasks