Multiple LLMs team up to spot and fix noisy training samples before fine-tuning.
Basically, it teaches LLMs to be their own data quality inspectors.
RobustFT introduces a framework to handle noisy data during LLM fine-tuning by using multi-expert detection and context-enhanced denoising, improving downstream task performance.
-----
https://arxiv.org/abs/2412.14922
🤔 Original Problem:
→ Noisy training data degrades supervised fine-tuning of LLMs: 30% noise leads to an 8.9% accuracy decline.
→ Existing denoising methods don't work well for open-ended text generation tasks.
-----
🛠️ Solution in this Paper:
→ RobustFT uses multiple expert LLMs working together to detect noisy samples through a consistency checker.
→ It employs a reasoning-enhanced LLM that combines step-by-step reasoning with self-reflection for better noise detection.
→ For denoising, it uses similar clean samples as context to relabel noisy data.
→ A Review Agent examines and synthesizes responses from different sources.
→ It filters samples based on response entropy, keeping only high-confidence (low-entropy) data for fine-tuning.
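
To make the detection and filtering steps above concrete, here is a minimal Python sketch. It is not the paper's implementation: the function names, the exact-match agreement rule, and the use of the empirical distribution over repeated sampled answers as a stand-in for response entropy are all assumptions made for illustration.

```python
import math
from collections import Counter


def normalize(answer: str) -> str:
    """Lowercase and collapse whitespace so trivially different strings match."""
    return " ".join(answer.lower().split())


def is_noisy(label: str, expert_answers: list[str]) -> bool:
    """Hypothetical consistency check: the sample counts as clean only if
    the given label and every expert answer agree after normalization."""
    answers = {normalize(a) for a in expert_answers}
    answers.add(normalize(label))
    return len(answers) > 1


def response_entropy(sampled_answers: list[str]) -> float:
    """Shannon entropy (in nats) of the empirical answer distribution.
    Assumption: repeated sampling approximates the model's confidence."""
    counts = Counter(normalize(a) for a in sampled_answers)
    total = sum(counts.values())
    return sum((c / total) * math.log(total / c) for c in counts.values())


def keep_confident(samples: list[dict], max_entropy: float) -> list[dict]:
    """Keep samples whose sampled answers have entropy at or below the threshold."""
    return [
        s for s in samples
        if response_entropy(s["sampled_answers"]) <= max_entropy
    ]
```

A sample whose label disagrees with the experts would be routed to the denoising step, and only relabeled samples with low response entropy would be kept for fine-tuning. The threshold `max_entropy` is a hypothetical tuning knob.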
-----
💡 Key Insights:
→ Larger models aren't inherently more noise-resistant
→ Domain-specific tasks need specialized noise handling
→ Multi-expert collaboration improves noise detection accuracy
→ Context-enhanced relabeling works better than direct correction
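
The context-enhanced relabeling idea can be sketched as a retrieval step followed by few-shot prompt construction. This is an illustrative sketch only: the token-overlap (Jaccard) similarity is a simple stand-in for whatever retrieval the paper actually uses, and the prompt template, field names, and function names are hypothetical.

```python
def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity between two strings (stand-in for embeddings)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def top_k_clean(query: str, clean_pool: list[dict], k: int = 2) -> list[dict]:
    """Return the k clean samples whose questions are most similar to the query."""
    ranked = sorted(
        clean_pool,
        key=lambda s: jaccard(query, s["question"]),
        reverse=True,
    )
    return ranked[:k]


def build_relabel_prompt(query: str, clean_pool: list[dict], k: int = 2) -> str:
    """Put similar clean question/answer pairs in front of the noisy question,
    so the relabeling model answers with relevant context instead of cold."""
    shots = "\n\n".join(
        f"Question: {s['question']}\nAnswer: {s['answer']}"
        for s in top_k_clean(query, clean_pool, k)
    )
    return f"{shots}\n\nQuestion: {query}\nAnswer:"
```

The returned prompt would then be sent to the relabeling model. Direct correction, by contrast, would pass only the noisy question with no retrieved examples.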
-----
📊 Results:
→ Outperforms baselines across 5 datasets with 30-70% noise levels
→ Improves MMLU accuracy by 14.6% compared to standard fine-tuning
→ Maintains consistent performance even with 70% noise
→ Shows 81.2% relative improvement in high-noise scenarios