AQLM is a really great paper for the GPU-constrained LLM setup: "Extreme Compression of Large Language Models via Additive Quantization" 💡