Want faster LLM training? Give more bits to the exponent, fewer to the mantissa.
The paper finds not all bits are equal - exponent bits matter more than mantissa bits in LLM training.
Floating-point quantization needs different bit ratios at different precision levels for optimal LLM training.
There's a sweet spot for data size in low-precision training - more isn't always better.
This paper develops a unified scaling law for floating-point quantization in LLM training, revealing optimal bit allocation between exponent and mantissa bits for different precision levels.
https://arxiv.org/abs/2501.02423
🤔 Original Problem:
→ Existing scaling laws focus on integer quantization but don't account for floating-point quantization parameters like exponent bits, mantissa bits, and block size
→ No clear understanding of how these parameters affect LLM training performance
-----
🔧 Solution in this Paper:
→ The researchers developed a unified scaling law that considers model size (N), data size (D), exponent bits (E), mantissa bits (M), and block size (B).
→ The law predicts model loss as L(N, D, E, M, B) = n/N^α + d/D^β + ε + (D^β/N^α) · log2(B) / (γ · (E+0.5)^δ · (M+0.5)^ν), where n, d, α, β, ε, γ, δ, ν are fitted constants (sketched in code after this list).
→ They trained 366 models with different configurations to validate the law.
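To make the shape of that law concrete, here's a minimal Python sketch. The functional form follows the formula quoted above, but the constants (n, d, α, β, ε, γ, δ, ν) are placeholders chosen for illustration, not the values actually fitted in the paper.

```python
import math

# Hypothetical fitted constants -- placeholders, NOT the paper's fitted values.
FIT = dict(n=1.0, d=1.0, alpha=0.5, beta=0.5, eps=1.7,
           gamma=1.0, delta=1.1, nu=1.0)

def predicted_loss(N, D, E, M, B, p=FIT):
    """Predicted training loss for N parameters, D tokens, E exponent bits,
    M mantissa bits, and quantization block size B."""
    # Classic Chinchilla-style terms plus an irreducible loss floor.
    classic = p["n"] / N ** p["alpha"] + p["d"] / D ** p["beta"] + p["eps"]
    # Extra loss from low-precision floating-point quantization:
    # grows with data size and block size, shrinks with model size
    # and with the number of exponent/mantissa bits.
    quant = (D ** p["beta"] / N ** p["alpha"]) * math.log2(B) / (
        p["gamma"] * (E + 0.5) ** p["delta"] * (M + 0.5) ** p["nu"]
    )
    return classic + quant

# Example: 1B parameters, 100B tokens, FP8 (E4M3), block size 32.
print(predicted_loss(N=1e9, D=1e11, E=4, M=3, B=32))
```

The quantization term is the interesting part: unlike the classic terms, it grows with data size D, which is what later produces a "critical data size" in low precision.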
-----
💡 Key Insights:
→ Exponent bits contribute slightly more to model performance than mantissa bits
→ Optimal exponent/mantissa layouts: E2M1 for FP4, E4M3 for FP8, E8M7 for FP16 (compared in the sketch below)
→ A critical data size exists in low precision - training on more tokens beyond that point actually degrades performance
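Why E2M1 rather than, say, E1M2 for FP4? For a fixed bit budget, everything in the law except the (E+0.5)^δ · (M+0.5)^ν factor is constant, so the best split is whichever maximizes that product. A tiny sketch, assuming a hypothetical δ slightly larger than ν to mirror the "exponent bits matter slightly more" finding (the actual fitted exponents are in the paper):

```python
# Placeholder exponents; delta > nu encodes "exponent bits help a bit more".
delta, nu = 1.1, 1.0

def layout_score(E, M):
    """Larger score -> smaller predicted quantization penalty in the law above."""
    return (E + 0.5) ** delta * (M + 0.5) ** nu

# FP4 = 1 sign bit + 3 bits to split between exponent (E) and mantissa (M).
for E in range(0, 4):
    M = 3 - E
    print(f"E{E}M{M}: score = {layout_score(E, M):.2f}")
# With delta > nu the best FP4 split lands at E2M1; with delta == nu,
# E1M2 and E2M1 would tie.
```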
-----
📊 Results:
→ For a 1B-parameter model, the critical data size is about 1730T tokens in BF16, 27T tokens in FP8-E4M3, and 0.4T tokens in FP4-E2M1 (derivation sketched below)
→ The best cost-performance precision lies between 4 and 8 bits across a wide range of compute budgets
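The critical data size follows from the same law: setting ∂L/∂D = 0 gives D_crit = (d · γ · N^α · (E+0.5)^δ · (M+0.5)^ν / log2 B)^(1/(2β)). A sketch with the same placeholder constants as before, so the printed number is illustrative only and will not reproduce the paper's token counts:

```python
import math

# Hypothetical constants -- placeholders, not the paper's fitted values.
d, alpha, beta, gamma, delta, nu = 1.0, 0.5, 0.5, 1.0, 1.1, 1.0

def critical_data_size(N, E, M, B):
    """Token count beyond which more low-precision training data stops helping,
    obtained by solving dL/dD = 0 for the scaling law sketched earlier."""
    numerator = d * gamma * N ** alpha * (E + 0.5) ** delta * (M + 0.5) ** nu
    return (numerator / math.log2(B)) ** (1.0 / (2.0 * beta))

# Example: 1B parameters, FP8-E4M3, block size 32.
print(f"D_crit ≈ {critical_data_size(N=1e9, E=4, M=3, B=32):.3e} tokens")
```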
-----
Are you into AI and LLMs❓ Join my daily AI newsletter. I will send you 7 emails a week analyzing the highest signal AI developments. ↓↓
🎉 https://rohanpaul.substack.com/