Data preference-based curriculum learning improves LLM efficiency and accuracy.
This paper introduces a novel training paradigm for LLMs, where training data is dynamically selected based on the model's evolving preferences, leading to significant performance gains. LLMs are typically pretrained on a uniform data distribution, ignoring the fact that a model's data preference changes as its capabilities evolve during training.
Paper - https://arxiv.org/abs/2501.13126
Original Problem 🤔:
→ Current LLMs are pretrained on static data distributions, which is suboptimal because the model's learning capacity changes during training.
Solution in this Paper 🛠️:
→ The Perplexity Difference-based Preference Curriculum learning (PDPC) framework arranges the training data according to the model's evolving preference.
→ PD is calculated offline using reference models, keeping the computational overhead low (a minimal sketch follows this list).
→ An S-shaped preference function controls how the concentration of low-PD versus high-PD data shifts over training, ensuring smooth curriculum progression.
→ The training data is arranged offline, so pretraining proceeds continuously without pauses for data re-selection.
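The sketch below shows how the offline PD score could be computed with Hugging Face-style causal LMs. The weak/strong reference-model pairing and the normalization of the perplexity gap by the weak model's perplexity are assumptions made for illustration, not necessarily the paper's exact formula.

```python
import math
import torch

def perplexity(model, tokenizer, text, device="cpu"):
    # Token-level perplexity of `text` under a causal LM (lower = better fit).
    ids = tokenizer(text, return_tensors="pt").input_ids.to(device)
    with torch.no_grad():
        loss = model(ids, labels=ids).loss  # mean cross-entropy per token
    return math.exp(loss.item())

def perplexity_difference(ppl_weak, ppl_strong):
    # Normalized gap between a weak reference model (e.g., an early checkpoint
    # or smaller model) and a strong one (a later checkpoint or larger model).
    # Low PD: both models already fit the sample; high PD: only the stronger
    # model does. Normalizing by ppl_weak is an assumption for this sketch.
    return (ppl_weak - ppl_strong) / ppl_weak
```

Because the reference models are fixed, every sample can be scored once before pretraining starts, which is what keeps the curriculum's extra compute small.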
Key Insights 💡:
→ The model's perplexity difference (PD) between early and late checkpoints reflects sample difficulty and how its data preference shifts during training.
→ High-PD data is more beneficial in later training stages, while low-PD data suits earlier stages, creating a natural curriculum (see the scheduling sketch after this list).
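To make the curriculum concrete, here is a small sketch of an S-shaped (logistic) schedule and of an offline arrangement that front-loads low-PD data and back-loads high-PD data. The logistic form, the median split into low/high-PD pools, and the hyperparameters `k` and `midpoint` are illustrative assumptions, not the paper's exact design.

```python
import math
from collections import deque

def high_pd_fraction(progress, k=10.0, midpoint=0.5):
    # S-shaped schedule: fraction of high-PD samples at training progress in
    # [0, 1]. Starts near 0 (mostly low-PD data) and rises toward 1.
    # k (steepness) and midpoint are illustrative, not values from the paper.
    return 1.0 / (1.0 + math.exp(-k * (progress - midpoint)))

def arrange_offline(samples, pd_scores, num_chunks=100):
    # Fix the data order before training: split the corpus at the median PD,
    # then fill successive chunks with a low-/high-PD mix given by the S-curve,
    # so early chunks are mostly low-PD and later chunks mostly high-PD.
    ranked = sorted(zip(pd_scores, samples), key=lambda pair: pair[0])
    cut = len(ranked) // 2
    low_pool = deque(s for _, s in ranked[:cut])
    high_pool = deque(s for _, s in ranked[cut:])
    chunk_size = max(1, len(samples) // num_chunks)
    ordered = []
    for c in range(num_chunks):
        frac = high_pd_fraction((c + 0.5) / num_chunks)
        n_high = min(round(frac * chunk_size), len(high_pool))
        n_low = min(chunk_size - n_high, len(low_pool))
        ordered += [low_pool.popleft() for _ in range(n_low)]
        ordered += [high_pool.popleft() for _ in range(n_high)]
    ordered += list(low_pool) + list(high_pool)  # leftovers from rounding
    return ordered
```

The resulting list can be streamed by an ordinary dataloader, which is why the arrangement does not interrupt continuous pretraining.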
Results 🎯:
→ A 3B model trained with PDPC on 1 trillion tokens achieves an average accuracy gain of 4.1% across benchmarks, and of 8.1% on MMLU and CMMLU.
→ The 1.3B model also shows consistent improvements across all benchmarks.