
"RobustFT: Robust Supervised Fine-tuning for Large Language Models under Noisy Response"

The podcast on this paper was generated with Google's Illuminate.

Multiple LLMs team up to spot and fix noisy training samples before fine-tuning.

Basically teaching LLMs to be their own data quality inspectors.

RobustFT introduces a framework to handle noisy data during LLM fine-tuning by using multi-expert detection and context-enhanced denoising, improving downstream task performance.

-----

https://arxiv.org/abs/2412.14922

🤔 Original Problem:

→ Supervised fine-tuning of LLMs suffers from noisy training data, which causes significant performance drops: 30% noise leads to an 8.9% accuracy decline.

→ Existing denoising methods don't work well for open-ended text generation tasks.

-----

🛠️ Solution in this Paper:

→ RobustFT uses multiple expert LLMs working together to detect noisy samples through a consistency checker.

→ It employs a reasoning-enhanced LLM that combines step-by-step reasoning with self-reflection for better noise detection.

→ For denoising, it uses similar clean samples as context to relabel noisy data.

→ A Review Agent examines and synthesizes responses from different sources.

→ It filters samples based on response entropy, keeping only high-confidence data for fine-tuning.
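
To make the steps above concrete, here is a minimal sketch of the detect, retrieve, and filter stages. It is not the paper's implementation: the experts are hypothetical callables standing in for LLM calls, token-overlap (Jaccard) similarity stands in for whatever retrieval the authors use, entropy is estimated from repeated sampled answers rather than model probabilities, and the Review Agent step is left out. All function names are made up for illustration.

```python
import math
from collections import Counter
from typing import Callable, List, Sequence, Tuple

Sample = Tuple[str, str]          # (question, response)
Expert = Callable[[str], str]     # hypothetical stand-in for an LLM call


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivially different answers match."""
    return " ".join(text.lower().split())


def split_by_consistency(samples: Sequence[Sample], experts: Sequence[Expert]):
    """Mark a sample clean only if every expert agrees with its given response."""
    clean: List[Sample] = []
    suspect: List[Sample] = []
    for question, response in samples:
        target = normalize(response)
        agrees = all(normalize(expert(question)) == target for expert in experts)
        (clean if agrees else suspect).append((question, response))
    return clean, suspect


def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity (assumption: a real system would likely use embeddings)."""
    ta, tb = set(normalize(a).split()), set(normalize(b).split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def top_k_similar(question: str, clean: Sequence[Sample], k: int = 2) -> List[Sample]:
    """Return the k clean samples whose questions look most like `question`."""
    return sorted(clean, key=lambda s: jaccard(question, s[0]), reverse=True)[:k]


def build_relabel_prompt(question: str, context: Sequence[Sample]) -> str:
    """Format retrieved clean samples as in-context examples for relabeling."""
    shots = "\n".join(f"Q: {q}\nA: {a}" for q, a in context)
    return f"{shots}\nQ: {question}\nA:"


def response_entropy(responses: Sequence[str]) -> float:
    """Shannon entropy (nats) of the empirical distribution over sampled answers."""
    counts = Counter(normalize(r) for r in responses)
    total = sum(counts.values())
    return sum(-(c / total) * math.log(c / total) for c in counts.values())


def keep_confident(candidates, max_entropy: float):
    """Keep (sample, sampled_responses) pairs whose entropy is at most max_entropy."""
    return [s for s, responses in candidates if response_entropy(responses) <= max_entropy]
```

In this sketch a sample counts as clean only when all experts reproduce its response exactly, which is a strict rule that suits short answers; open-ended responses would need a softer agreement check. The `max_entropy` threshold is a tunable assumption, not a value from the paper.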

-----

💡 Key Insights:

→ Larger models aren't inherently more noise-resistant

→ Domain-specific tasks need specialized noise handling

→ Multi-expert collaboration improves noise detection accuracy

→ Context-enhanced relabeling works better than direct correction

-----

📊 Results:

→ Outperforms baselines across 5 datasets with 30-70% noise levels

→ Improves MMLU accuracy by 14.6% compared to standard fine-tuning

→ Maintains consistent performance even with 70% noise

→ Shows 81.2% relative improvement in high-noise scenarios
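
The numbers above mix plain and "relative" improvements, so a quick sketch of the difference may help. The summary does not say how each figure was computed; the functions below use the standard definitions, and the accuracy values in the usage note are hypothetical, not taken from the paper.

```python
def absolute_gain(baseline: float, improved: float) -> float:
    """Difference in accuracy, in percentage points."""
    return improved - baseline


def relative_gain(baseline: float, improved: float) -> float:
    """Gain as a fraction of the baseline accuracy."""
    if baseline == 0:
        raise ValueError("relative gain is undefined for a zero baseline")
    return (improved - baseline) / baseline
```

For example, moving from a hypothetical 40.0 to 60.0 accuracy is a 20-point absolute gain but a 50% relative gain, which is why relative figures look large when the baseline is low, as it tends to be under heavy noise.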

