A really nice paper on boosting the reasoning power of LLMs
RATIONALYST extracts implicit rationales to enhance LLM reasoning across diverse tasks.
📚 https://arxiv.org/pdf/2410.01044
Original Problem 🔍:
LLMs often produce incomplete reasoning chains because, like human communication, they leave crucial rationales implicit and unstated.
-----
Solution in this Paper 🧠:
• RATIONALYST: Model pre-trained on implicit rationales extracted from unlabeled text
• Extraction process: Pre-filtering, generation, filtration of rationales from web-scale data and reasoning datasets
• Inference: Provides step-by-step supervision to an "agent" LLM during reasoning tasks
• Two supervision methods: implicit (probability-based) and explicit (context augmentation); a code sketch of both follows this list
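To make the two supervision modes concrete, here is a minimal Python sketch. Everything in it (function names, the toy scoring heuristic, the stubbed model calls) is an illustrative assumption, not the paper's actual code: implicit supervision scores the agent's candidate next steps by how likely they are given the generated rationale, while explicit supervision appends the rationale to the agent's context.

```python
# Minimal sketch of RATIONALYST-style inference-time supervision.
# All function names and interfaces are illustrative assumptions;
# the stubs below stand in for real model calls.

from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    text: str        # a candidate next reasoning step proposed by the agent LLM
    score: float = 0.0


def generate_rationale(trajectory: str) -> str:
    """RATIONALYST reads the reasoning so far and produces the implicit
    rationale that should connect this step to the next one (stubbed)."""
    return f"Rationale for: {trajectory[-40:]}"


def rationale_conditioned_logprob(rationale: str, candidate: str) -> float:
    """Implicit supervision signal: how probable the candidate next step is
    given the generated rationale. Stubbed with a toy word-overlap heuristic;
    a real implementation would query the model for a log-probability."""
    overlap = len(set(rationale.lower().split()) & set(candidate.lower().split()))
    return float(overlap)


def implicit_supervision(trajectory: str, candidates: List[Candidate]) -> Candidate:
    """Score each candidate step against the rationale and let the agent
    continue with the highest-scoring one."""
    rationale = generate_rationale(trajectory)
    for c in candidates:
        c.score = rationale_conditioned_logprob(rationale, c.text)
    return max(candidates, key=lambda c: c.score)


def explicit_supervision(trajectory: str) -> str:
    """Context augmentation: append the generated rationale to the agent's
    prompt so the next step is generated with the rationale visible."""
    rationale = generate_rationale(trajectory)
    return trajectory + "\n[Rationale] " + rationale


if __name__ == "__main__":
    traj = "Q: Tom has 3 apples and buys 2 more. Step 1: Count the apples he starts with."
    cands = [
        Candidate("Step 2: Add the 2 new apples to the 3 he had, giving 5."),
        Candidate("Step 2: Multiply 3 by 2 to get 6."),
    ]
    best = implicit_supervision(traj, cands)
    print("Chosen step:", best.text)
    print("Augmented context:", explicit_supervision(traj))
```

One design note: because implicit supervision only uses the rationale to rank candidates rather than injecting it verbatim into the prompt, a noisy rationale degrades the ranking gracefully instead of polluting the agent's context, which lines up with the paper's finding that implicit supervision is the more robust of the two.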
-----
Key Insights from this Paper 💡:
• Leveraging implicit rationales improves reasoning across diverse tasks
• Web-scale data enhances performance compared to task-specific datasets
• Implicit supervision outperforms explicit supervision because it is more robust to imperfect rationales
• RATIONALYST surpasses larger models like GPT-4 in process supervision
-----
Results 📊:
• Average accuracy improvement: 3.9% across 7 reasoning benchmarks
• Outperforms:
- LLaMa-3-8B process supervision
- GPT-4 process supervision
- Fine-tuned outcome-based verifiers
• GSM8K: 81.6% accuracy (4.0% improvement)
• MMLU-Pro: 45.3% accuracy (5.7% improvement)
------
Are you into AI and LLMs❓ Join me on Twitter with 39.4K others to stay on the bleeding edge every day.
𝕏/🐦 https://x.com/rohanpaul_ai