Nice paper from @GoogleDeepMind
When models share work, they accidentally share your secrets too.
MoE models can leak user prompts through expert routing vulnerabilities in batched processing.
Expert-Choice Routing and token dropping in MoE create a side channel that lets an attacker steal user inputs
📚 https://arxiv.org/abs/2410.22884
🎯 Original Problem:
Mixture-of-Experts (MoE) models make LLMs more efficient, but their token routing mechanism has a critical vulnerability: it can expose one user's data to another when queries are batched together.
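To see why batching matters, here's a minimal toy sketch of Expert-Choice Routing (my own illustration, not the paper's code; the shapes, scores, and capacity value are assumptions): each expert picks its top-scoring tokens across the whole flattened batch, so whether your token gets processed or dropped depends on what everyone else in the batch submitted.

```python
import torch

def expert_choice_route(router_logits: torch.Tensor, capacity: int) -> torch.Tensor:
    """Toy Expert-Choice Routing. router_logits: [num_tokens, num_experts]
    for the whole flattened batch; each expert keeps its top-`capacity` tokens."""
    scores = router_logits.softmax(dim=-1)                      # [tokens, experts]
    picked = torch.topk(scores.T, k=capacity, dim=-1).indices   # [experts, capacity]
    kept = torch.zeros_like(scores, dtype=torch.bool)
    kept[picked, torch.arange(scores.shape[1]).unsqueeze(1)] = True
    return kept  # kept[t, e] = True if expert e processes token t

# Same victim token, two different batches: whether it reaches its preferred
# expert depends on the *other* (attacker-controlled) rows in the batch.
victim = torch.tensor([[3.0, 0.0]])                  # strongly prefers expert 0
weak   = torch.tensor([[0.5, 0.0], [0.5, 0.0]])      # filler tokens, low scores
strong = torch.tensor([[5.0, 0.0], [5.0, 0.0]])      # tokens that crowd out expert 0

print(expert_choice_route(torch.cat([victim, weak]),   capacity=2)[0])  # victim kept by expert 0
print(expert_choice_route(torch.cat([victim, strong]), capacity=2)[0])  # victim dropped from expert 0
```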
-----
🛠️ Solution in this Paper:
→ Introduces the "MoE Tiebreak Leakage" attack, which exploits Expert-Choice Routing to leak a victim's prompt
→ Crafts batches strategically to manipulate expert routing and force specific tokens to be dropped
→ Exploits the tie-handling behavior of the torch.topk CUDA implementation (toy illustration after this list)
→ Requires white-box access to the model and the ability to control batch placement
→ Implements the attack in two variants: an Oracle Attack (2 queries) and a Leakage Attack (iterative extraction)
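The tie-handling detail in a toy form (again my own illustration with made-up scores, not the paper's code; torch.topk's tie order is not part of its documented contract, and the paper targets the specific behavior of the CUDA kernel):

```python
import torch

# The attacker places a guess of the victim's next token in their own part of
# the batch. If the guess is right, the two identical tokens get identical
# routing scores and tie for the expert's last capacity slot; which copy gets
# dropped is then decided by torch.topk's internal (implementation-dependent)
# tie ordering, i.e. by batch position -- and the resulting change in the
# attacker's own output reveals whether the guess was correct.
scores = torch.tensor([0.70, 0.70, 0.10])        # tokens 0 and 1 tie for one slot
winner = torch.topk(scores, k=1).indices.item()  # survivor chosen by the kernel's tie order
print(f"tied token kept by the expert: {winner}")
```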
-----
💡 Key Insights:
→ Cross-batch dependencies in MoE create exploitable side channels
→ Optimizing for efficiency can introduce security vulnerabilities
→ Token dropping, meant for efficiency, becomes a security risk
→ Architectural optimizations need rigorous security testing
-----
📊 Results:
→ Successfully extracted 996 out of 1000 secret messages
→ Recovered 4,833 out of 4,838 total secret tokens
→ Requires ~100 queries per token on average
→ Works optimally with a padding sequence length of 40
→ Shows a 99.9% success rate when using all 8 experts