Teaching small LLMs to reason like big ones.
AutoReason enhances LLM reasoning by automatically generating step-by-step rationales, eliminating the need for hand-crafted few-shot exemplars in Chain of Thought prompting.
-----
https://arxiv.org/abs/2412.06975v1
🤔 Original Problem:
→ Chain of Thought (CoT) prompting requires manually crafted few-shot examples, making it time-consuming and limiting its scalability across different tasks.
→ Current CoT methods reuse the same fixed exemplars for every query, which reduces effectiveness when a problem's characteristics differ from those exemplars.
-----
🔧 Solution in this Paper:
→ AutoReason introduces a two-tier model approach where a stronger LLM (like GPT-4) generates reasoning rationales for a weaker LLM (like GPT-3.5).
→ The system automatically decomposes implicit queries into explicit questions, improving interpretability.
→ It transforms zero-shot queries into few-shot reasoning traces without relying on hand-crafted exemplars.
→ The framework uses query-specific rationales instead of fixed CoT prompts, enhancing reasoning relevance (see the sketch after this list).
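
A minimal sketch of the two-tier flow, assuming the OpenAI chat API; the prompt wording, model choices, and function names here are illustrative assumptions, not the paper's exact configuration:

```python
# Two-tier AutoReason-style pipeline (illustrative sketch, not the paper's exact prompts).
from openai import OpenAI

client = OpenAI()

RATIONALE_PROMPT = (
    "Decompose the question below into explicit, step-by-step "
    "reasoning traces (rationales), but do not answer it.\n\n"
    "Question: {question}\nRationales:"
)

def generate_rationale(question: str) -> str:
    """Tier 1: a stronger LLM turns the implicit query into explicit reasoning steps."""
    resp = client.chat.completions.create(
        model="gpt-4",  # stronger model
        messages=[{"role": "user", "content": RATIONALE_PROMPT.format(question=question)}],
    )
    return resp.choices[0].message.content

def answer_with_rationale(question: str, rationale: str) -> str:
    """Tier 2: a weaker LLM answers, conditioned on the generated query-specific rationale."""
    prompt = f"{rationale}\n\nUsing the steps above, answer: {question}"
    resp = client.chat.completions.create(
        model="gpt-3.5-turbo",  # weaker model
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

# StrategyQA-style multi-hop question
question = "Could a llama birth twice during the War in Vietnam (1945-46)?"
print(answer_with_rationale(question, generate_rationale(question)))
```

Note the zero-shot query is converted into a few-shot-style reasoning trace on the fly, so no hand-crafted exemplars are needed.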
-----
💡 Key Insights:
→ Two-tier model hierarchy enables weaker LLMs to leverage stronger models' reasoning capabilities
→ Query-specific rationale generation improves over fixed CoT exemplars
→ Automatic decomposition of complex queries enhances interpretability (illustrated below)
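
To make the decomposition idea concrete, here is a hypothetical example of how an implicit query might be broken into explicit sub-questions; the wording is illustrative, not output from the paper:

```python
# Illustrative decomposition of an implicit multi-hop query into explicit steps.
query = "Did Aristotle use a laptop?"
rationales = [
    "1. When did Aristotle live?",
    "2. When was the laptop invented?",
    "3. Do these two time periods overlap?",
]
```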
-----
📊 Results:
→ Substantial accuracy gains on the StrategyQA dataset over both the baseline and standard CoT prompting
→ GPT-3.5-Turbo accuracy increased from 55% to 76.6% using AutoReason
→ GPT-4 accuracy improved from 71.6% to 91.6% with AutoReason