ADDITION IS ALL YOU NEED FOR ENERGY-EFFICIENT LANGUAGE MODELS
MASSIVE claim in this Paper for LLM Training 🤯
This new Linear-complexity Multiplication (L-Mul) algorithm can reduce energy costs by 95% for element-wise tensor multiplications and by 80% for dot products in large language models, while maintaining or even improving precision compared to 8-bit floating-point operations. 🤯
It achieves this by replacing complex floating-point multiplication with integer addition.
Solution in this Paper 🧠:
Approximates floating-point multiplication using integer addition (see the sketch after this list)
Linear O(n) complexity in the operand bit width, versus O(m^2) mantissa-multiplication cost for standard floating-point multiplication
Replaces tensor multiplications in attention mechanisms and linear transformations
Implements L-Mul-based attention mechanism in transformer models
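A minimal Python sketch of the idea, not the paper's bit-level integer implementation: writing x = (1 + x_m) * 2^x_e and y = (1 + y_m) * 2^y_e, an exact multiply gives (1 + x_m + y_m + x_m*y_m) * 2^(x_e + y_e), and L-Mul drops the x_m*y_m cross term in favor of a constant offset 2^(-l(m)), so only additions remain. The function and parameter names are mine, the offset schedule is quoted from my reading of the paper, and the arithmetic runs on ordinary Python floats rather than low-bit operands.

```python
import math

def lmul(x: float, y: float, mantissa_bits: int = 4) -> float:
    """Approximate x * y with the L-Mul rule: add mantissas and exponents,
    and replace the x_m * y_m cross term with the constant 2**-l(m)."""
    if x == 0.0 or y == 0.0:
        return 0.0
    sign = math.copysign(1.0, x) * math.copysign(1.0, y)

    # Decompose |x| as (1 + xm) * 2**xe with xm in [0, 1); likewise for |y|.
    fx, ex = math.frexp(abs(x))          # frexp: |x| = fx * 2**ex, fx in [0.5, 1)
    xm, xe = 2.0 * fx - 1.0, ex - 1
    fy, ey = math.frexp(abs(y))
    ym, ye = 2.0 * fy - 1.0, ey - 1

    # Offset l(m) for an m-bit mantissa (per the paper: m if m <= 3, 3 if m = 4, else 4).
    if mantissa_bits <= 3:
        l = mantissa_bits
    elif mantissa_bits == 4:
        l = 3
    else:
        l = 4

    # Exact multiply: (1 + xm + ym + xm*ym) * 2**(xe + ye)
    # L-Mul:          (1 + xm + ym + 2**-l) * 2**(xe + ye)  -> no mantissa multiplication
    return sign * (1.0 + xm + ym + 2.0 ** -l) * 2.0 ** (xe + ye)
```

For example, lmul(1.5, 2.25) returns 3.5 where the exact product is 3.375: the cross term 0.5 * 0.125 = 0.0625 is replaced by the constant 0.125. On hardware, the same rule reduces to adding the operands' bit patterns as integers plus a fixed offset, which is where the energy savings come from.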
Key Insights from this Paper 💡:
L-Mul achieves higher precision than 8-bit float operations with less computation
Potential 95% energy reduction for element-wise tensor multiplications
80% energy reduction for dot products compared to 8-bit float operations
Can be integrated into existing models without additional training, for example by swapping the multiplications inside attention (see the sketch after this list)
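As a rough illustration of that drop-in integration, here is a NumPy sketch of scaled dot-product attention in which every scalar multiplication inside the Q·K^T and weights·V products is routed through the lmul() helper sketched above, while softmax stays in regular precision. This is my own emulation of the numerics, not the paper's code; a real kernel or hardware implementation would look very different.

```python
import numpy as np

def lmul_matmul(A: np.ndarray, B: np.ndarray, mantissa_bits: int = 4) -> np.ndarray:
    """Matrix product in which every scalar multiply goes through lmul().
    A slow reference emulation; the energy savings only appear on hardware
    that implements L-Mul natively."""
    n, k = A.shape
    k2, m = B.shape
    assert k == k2, "inner dimensions must match"
    out = np.zeros((n, m))
    for i in range(n):
        for j in range(m):
            out[i, j] = sum(lmul(float(A[i, p]), float(B[p, j]), mantissa_bits)
                            for p in range(k))
    return out

def lmul_attention(Q: np.ndarray, K: np.ndarray, V: np.ndarray,
                   mantissa_bits: int = 4) -> np.ndarray:
    """Scaled dot-product attention with L-Mul replacing the Q @ K.T and
    weights @ V multiplications; softmax is computed normally."""
    d = Q.shape[-1]
    scores = lmul_matmul(Q, K.T, mantissa_bits) / np.sqrt(d)
    scores -= scores.max(axis=-1, keepdims=True)          # softmax stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return lmul_matmul(weights, V, mantissa_bits)
```

Running lmul_attention on small random Q, K, V and comparing the output against standard softmax(QK^T/√d)V is a quick way to see how closely the approximation tracks regular attention.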
Results 📊:
L-Mul with 4-bit mantissa: comparable precision to float8 e4m3 (a toy error check is sketched after this list)
L-Mul with 3-bit mantissa: outperforms float8 e5m2
Attention mechanism replacement: 0.07% average performance loss across NLP tasks
Vision tasks: 0.12% accuracy improvement
Full model fine-tuning: equivalent results to float8 e4m3 accumulation precision
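A toy check, not the paper's evaluation protocol (which compares against fp8 formats on real model tensors), to get a feel for L-Mul's approximation error using the lmul() sketch above; the function name and operand range are mine.

```python
import random

def mean_relative_error(trials: int = 10_000, mantissa_bits: int = 4) -> float:
    """Average relative error of lmul() against exact multiplication
    on random operands."""
    total, count = 0.0, 0
    for _ in range(trials):
        x = random.uniform(-4.0, 4.0)
        y = random.uniform(-4.0, 4.0)
        exact = x * y
        if exact == 0.0:
            continue
        total += abs(lmul(x, y, mantissa_bits) - exact) / abs(exact)
        count += 1
    return total / count

print(mean_relative_error())   # average relative error of L-Mul vs. exact multiply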
Generated this podcast with Google's Illuminate.
Multiplication operations are generally more complicated than additions, and floating-point operations are more costly than integer ones.
New algorithm reduces AI power consumption by 95%: replaces complex floating-point multiplication with integer addition
Paper - "Addition is All You Need"



