A new reasoning strategy that enhances small language models' problem-solving abilities while requiring just 3% of the training data used by conventional Chain-of-Thought fine-tuning.
https://arxiv.org/abs/2412.09906
Original Problem 🤔:
→ Small Language Models (SLMs) under 10B parameters struggle with complex reasoning tasks, and improving them typically demands extensive Chain-of-Thought (CoT) training data that is costly to obtain.
→ Current CoT methods entangle reasoning logic with numerical computation in a single generation, so early mistakes accumulate through later steps and reduce effectiveness.
Solution in this Paper 💡:
→ Introduces Solution Guidance (SG) - a reasoning strategy focusing on problem understanding and decomposition at semantic/logical levels.
→ Implements Solution-Guidance Fine-Tuning (SGFT), requiring only 3,000 SG samples versus 30,000 CoT samples.
→ Uses Layer-wise Importance Sampling for memory-efficient training on consumer GPUs.
→ Employs collaborative inference between SG-trained and base models.
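The collaborative inference described above can be sketched as a two-stage pipeline: the SG-tuned model only decomposes the problem, and the base model follows that guidance to do the actual computation. This is a minimal illustration with stubbed model calls; the function names and prompt format are hypothetical, not from the paper, and in practice both stubs would be LLM `generate()` calls (e.g. an SGFT-tuned Qwen2-7B for guidance and the untuned base model for computation).

```python
# Hypothetical sketch of two-stage collaborative inference (stubbed models).

def sg_model(problem: str) -> str:
    """Stub for the SG-fine-tuned model: decomposes the problem at the
    semantic/logical level without performing any arithmetic."""
    return (
        "Step 1: Find the total number of apples bought.\n"
        "Step 2: Subtract the apples eaten to get the remainder."
    )

def base_model(prompt: str) -> str:
    """Stub for the base model: carries out the computation by
    following the guidance embedded in the prompt."""
    return "5 + 3 = 8; 8 - 2 = 6. Answer: 6"

def collaborative_inference(problem: str) -> str:
    # Stage 1: the SG model produces solution guidance only (no math).
    guidance = sg_model(problem)
    # Stage 2: the base model receives problem + guidance and computes.
    prompt = f"Problem: {problem}\nGuidance:\n{guidance}\nSolve step by step."
    return base_model(prompt)

answer = collaborative_inference(
    "Tom buys 5 apples, then 3 more, and eats 2. How many are left?"
)
print(answer)
```

Keeping guidance generation and computation in separate model calls is what lets a small SG model be trained on only ~3,000 samples: it never has to learn the arithmetic itself.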
Key Insights 🔍:
→ Problem decomposition at semantic level reduces error propagation
→ Context-aware prompting significantly improves model performance
→ Separating guidance and computation tasks enhances accuracy
Results 📊:
→ Achieves 48.3% accuracy on GSM8K using Qwen2-7B
→ Improves MultiArith performance by 72.9%
→ Requires only 3% of traditional CoT training data
→ Shows 79.8% accuracy on StrategyQA tasks