"RobustFT: Robust Supervised Fine-tuning for Large Language Models under Noisy Response"

The podcast below on this paper was generated with Google's Illuminate.

Rohan Paul

Jan 13, 2025

Multiple LLMs team up to spot and fix noisy training samples before fine-tuning.

Basically teaching LLMs to be their own data quality inspectors.

RobustFT introduces a framework to handle noisy data during LLM fine-tuning by using multi-expert detection and context-enhanced denoising, improving downstream task performance.

-----

https://arxiv.org/abs/2412.14922

🤔 Original Problem:

→ Supervised fine-tuning of LLMs suffers from noisy training data, causing significant performance drops: 30% noise leads to an 8.9% accuracy decline.

→ Existing denoising methods don't work well for open-ended text generation tasks.

-----

🛠️ Solution in this Paper:

→ RobustFT uses multiple expert LLMs working together to detect noisy samples through a consistency checker.

→ It employs a reasoning-enhanced LLM that combines step-by-step reasoning with self-reflection for better noise detection.

→ For denoising, it uses similar clean samples as context to relabel noisy data.

→ A Review Agent examines and synthesizes responses from different sources.

→ It filters samples based on response entropy, keeping only high-confidence data for fine-tuning.
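
The steps above can be sketched in a few lines of Python. This is a minimal illustration, not the paper's implementation: the function names (`is_noisy`, `retrieve_context`, `response_entropy`, `keep_confident`), the word-overlap similarity, and the 0.5 threshold are all assumptions made for this example, and expert answers are passed in as plain strings instead of coming from real LLM calls. Entropy is computed here over repeated sampled answers, which is only a simple proxy for the response entropy mentioned above.

```python
import math
from collections import Counter


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivially different answers compare equal."""
    return " ".join(text.lower().split())


def is_noisy(label: str, expert_answers: list[str]) -> bool:
    """Consistency check: flag the sample unless the label and all expert answers agree."""
    answers = {normalize(label)} | {normalize(a) for a in expert_answers}
    return len(answers) > 1


def token_overlap(a: str, b: str) -> float:
    """Jaccard similarity over word sets (a stand-in for embedding similarity)."""
    sa, sb = set(normalize(a).split()), set(normalize(b).split())
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def retrieve_context(query: str, clean: list[tuple[str, str]], k: int = 2) -> list[tuple[str, str]]:
    """Return the k clean (question, answer) pairs whose questions best match the query."""
    return sorted(clean, key=lambda qa: token_overlap(query, qa[0]), reverse=True)[:k]


def response_entropy(samples: list[str]) -> float:
    """Shannon entropy (in nats) of the empirical distribution over sampled answers."""
    counts = Counter(normalize(s) for s in samples)
    n = len(samples)
    return -sum((c / n) * math.log(c / n) for c in counts.values())


def keep_confident(candidates: dict[str, list[str]], threshold: float = 0.5) -> dict[str, str]:
    """Keep only questions whose sampled answers have low entropy; use the majority answer."""
    kept = {}
    for question, samples in candidates.items():
        if not samples:
            continue  # nothing to judge, so drop the sample
        if response_entropy(samples) <= threshold:
            kept[question] = Counter(normalize(s) for s in samples).most_common(1)[0][0]
    return kept


if __name__ == "__main__":
    # Hypothetical toy data, for illustration only.
    clean_pool = [("What is the capital of France?", "Paris"), ("What is 2+2?", "4")]
    print(is_noisy("Lyon", ["Paris", "paris"]))            # True: label disagrees with experts
    print(retrieve_context("capital of France", clean_pool, k=1))
    print(keep_confident({"q1": ["Paris", "paris", "Paris"], "q2": ["4", "5", "6"]}))
```

In a real pipeline, the retrieved clean pairs would be placed in the prompt when relabeling a flagged sample, and the strict all-agree rule could be relaxed to a majority vote; both choices are left open here.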

-----

💡 Key Insights:

→ Larger models aren't inherently more noise-resistant

→ Domain-specific tasks need specialized noise handling

→ Multi-expert collaboration improves noise detection accuracy

→ Context-enhanced relabeling works better than direct correction

-----

📊 Results:

→ Outperforms baselines across 5 datasets with 30-70% noise levels

→ Improves MMLU accuracy by 14.6% compared to standard fine-tuning

→ Maintains consistent performance even with 70% noise

→ Shows 81.2% relative improvement in high-noise scenarios
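
For readers comparing the percentages above, a relative improvement is conventionally measured against the baseline score, not as a difference in percentage points. The symbols below ($a_{\text{method}}$, $a_{\text{base}}$) are notation assumed for this note, and the post does not state which convention each reported figure uses.

```latex
\text{absolute improvement} = a_{\text{method}} - a_{\text{base}},
\qquad
\text{relative improvement} = \frac{a_{\text{method}} - a_{\text{base}}}{a_{\text{base}}} \times 100\%
```

Under this definition, a low baseline score in a high-noise setting can yield a large relative improvement even when the gain in percentage points is modest.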
