Rohan's Bytes

Not All Heads Matter: A Head-Level KV Cache Compression Method with Integrated Retrieval and Reasoning

Rohan Paul
Nov 6, 2024

Not all brain cells are equal - same goes for LLM attention heads! 💡

© 2025 Rohan Paul