ADDITION IS ALL YOU NEED FOR ENERGY-EFFICIENT LANGUAGE MODELS
MASSIVE claim in this Paper for LLM Training 🤯
This new Linear-complexity Multiplication (L-Mul) algorithm can reduce energy costs by 95% for element-wise tensor multiplications and 80% for dot products in large language models, while maintaining or even improving precision compared to 8-bit floating-point operations. 🤯
It does this by replacing costly floating-point multiplication with integer addition.
Solution in this Paper 🧠:
Approximates floating-point multiplication using integer addition (see the sketch after this list)
Linear O(n) complexity in operand bit width, vs O(m^2) for standard floating-point multiplication with m-bit mantissas
Replaces tensor multiplications in attention mechanisms and linear transformations
Implements L-Mul-based attention mechanism in transformer models
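A minimal Python sketch of the core idea, assuming the paper's formula Mul(x, y) ≈ (1 + fx + fy + 2^(-l(m))) * 2^(ex + ey): mantissa fractions and exponents are added instead of multiplied. The function name l_mul, the float decomposition via math.frexp, and the explicit mantissa quantization are illustrative choices, not the paper's bit-level reference implementation.

```python
import math

def l_mul(x: float, y: float, mantissa_bits: int = 4) -> float:
    """Approximate x * y by adding mantissa fractions and exponents (L-Mul sketch)."""
    if x == 0.0 or y == 0.0:
        return 0.0
    sign = math.copysign(1.0, x) * math.copysign(1.0, y)

    # Decompose |x| = (1 + fx) * 2**ex with fx in [0, 1)
    mx, ex = math.frexp(abs(x))          # mx in [0.5, 1), abs(x) = mx * 2**ex
    fx, ex = 2.0 * mx - 1.0, ex - 1
    my, ey = math.frexp(abs(y))
    fy, ey = 2.0 * my - 1.0, ey - 1

    # Quantize mantissa fractions to the given bit width (mimicking fp8-style formats)
    scale = 1 << mantissa_bits
    fx = math.floor(fx * scale) / scale
    fy = math.floor(fy * scale) / scale

    # Offset exponent l(m), following the paper's definition: m for m<=3, 3 for m=4, 4 for m>4
    l_m = mantissa_bits if mantissa_bits <= 3 else (3 if mantissa_bits == 4 else 4)

    # L-Mul: (1 + fx) * (1 + fy) ~= 1 + fx + fy + 2**(-l(m)); exponents simply add,
    # so no mantissa multiplication is needed, only additions.
    return sign * (1.0 + fx + fy + 2.0 ** (-l_m)) * 2.0 ** (ex + ey)

# Example: the approximation of 3.0 * 5.0 lands close to the exact product 15.0
print(l_mul(3.0, 5.0), 3.0 * 5.0)
```

The claimed energy savings come from running this arithmetic on integer adders instead of floating-point multiplier units in hardware; the Python version only illustrates the math.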
Key Insights from this Paper 💡:
L-Mul achieves higher precision than 8-bit float operations with less computation
Potential 95% energy reduction for element-wise tensor multiplications
80% energy reduction for dot products compared to 8-bit float operations
Can be integrated into existing models without additional training (a drop-in sketch follows this list)
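A hedged sketch of what that drop-in replacement could look like: the matmuls that form attention scores and outputs use an L-Mul-based dot product built on the l_mul helper above. The names l_mul_matmul and l_mul_attention are hypothetical, and a practical kernel would operate directly on fp8 bit patterns rather than Python loops.

```python
import math

def l_mul_matmul(A, B, mantissa_bits=4):
    """Naive (M,K) x (K,N) matmul where each scalar product uses l_mul instead of *."""
    M, K, N = len(A), len(A[0]), len(B[0])
    out = [[0.0] * N for _ in range(M)]
    for i in range(M):
        for j in range(N):
            # Dot product: L-Mul for the multiplications, ordinary additions to accumulate
            out[i][j] = sum(l_mul(A[i][k], B[k][j], mantissa_bits) for k in range(K))
    return out

def softmax(row):
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    s = sum(exps)
    return [e / s for e in exps]

def l_mul_attention(Q, K, V, mantissa_bits=4):
    """Attention(Q, K, V) with the QK^T and probs*V matmuls replaced by L-Mul matmuls."""
    d = len(Q[0])
    K_T = [list(col) for col in zip(*K)]
    scores = l_mul_matmul(Q, K_T, mantissa_bits)
    probs = [softmax([s / math.sqrt(d) for s in row]) for row in scores]
    return l_mul_matmul(probs, V, mantissa_bits)
```

Swapping the attention multiplications this way is the setting the paper reports as costing only ~0.07% average performance on NLP tasks (see Results below).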
Results 📊:
L-Mul with 4-bit mantissa: comparable precision to float8 e4m3
L-Mul with 3-bit mantissa: outperforms float8 e5m2
Attention mechanism replacement: 0.07% average performance loss across NLP tasks
Vision tasks: 0.12% accuracy improvement
Full model fine-tuning: equivalent results to float8 e4m3 accumulation precision
Multiplication is generally more complex than addition, and floating-point operations are more costly than integer operations.
👇 Podcast
New algorithm cuts the energy cost of tensor multiplications by up to 95% by replacing complex floating-point multiplication with integer addition
Paper - "Addition is All You Need for Energy-Efficient Language Models"
Generated this podcast with Google's Illuminate.