Ever wondered how LLMs juggle different languages? Under the hood, they pull everything into an English-anchored semantic space first!
LLMs process diverse inputs by mapping them to a shared semantic space anchored in their dominant training language
https://arxiv.org/abs/2411.04986
🎯 Original Problem:
Modern LLMs can process diverse inputs (different languages, code, math, images, audio) but we don't understand how they handle these different data types with a single set of parameters.
This paper posits that LLMs don't just translate between languages: they build a universal semantic map.
-----
🔍 Solution in this Paper:
→ The paper introduces the "semantic hub hypothesis" - LLMs develop a shared representation space that integrates information from different languages and modalities
→ This hub places semantically similar inputs close together in the model's intermediate layers, even when they come from different sources (a simple probe for this is sketched below)
→ For English-dominant models, the representations often align closest to English tokens even when processing non-English inputs
→ The model actively uses this shared space during processing, not just as a byproduct of training
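A minimal sketch of how one could probe this convergence: compare per-layer hidden states for a translation pair and look for a similarity peak in the middle layers. The model name, last-token pooling, and the example sentence pair are illustrative assumptions, not the paper's exact protocol.

```python
# Hedged sketch: do translation pairs converge in the middle layers?
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "meta-llama/Llama-2-7b-hf"  # any English-dominant causal LM
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, output_hidden_states=True)
model.eval()

def layer_states(text):
    """Return one vector per layer: the last-token hidden state."""
    ids = tok(text, return_tensors="pt")
    with torch.no_grad():
        out = model(**ids)
    # out.hidden_states: one (1, seq_len, d_model) tensor per layer (plus embeddings)
    return [h[0, -1] for h in out.hidden_states]

en = layer_states("The cat sat on the mat.")
zh = layer_states("猫坐在垫子上。")  # Chinese translation of the same sentence

for layer, (a, b) in enumerate(zip(en, zh)):
    sim = torch.nn.functional.cosine_similarity(a, b, dim=0).item()
    print(f"layer {layer:2d}: cos = {sim:.3f}")
# The semantic-hub claim predicts similarity peaks in the middle layers,
# well above what unrelated sentence pairs would show.
```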
-----
💡 Key Insights:
→ LLMs naturally develop an integrated shared space during training without requiring explicit alignment
→ The shared representation space is actively used during processing, not just a structural similarity
→ The dominant training language (like English) serves as an anchor for processing other types of inputs
→ This finding enables better interpretation and control of model behavior across languages and modalities (a toy intervention along these lines is sketched below)
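One way to test that the shared space is actively used, not merely present, is to perturb a middle-layer residual stream and check whether generations in another language shift accordingly. The sketch below reuses `model` and `tok` from the previous snippet; the layer index, steering scale, and sentiment-direction recipe are placeholders, not values from the paper.

```python
# Hedged sketch: steer a middle layer and observe cross-lingual effects.
import torch

def make_steering_hook(direction, scale=4.0):
    """Add `scale * direction` to every position's hidden state."""
    direction = direction / direction.norm()
    def hook(module, inputs, output):
        hidden = output[0] if isinstance(output, tuple) else output
        hidden = hidden + scale * direction.to(hidden.dtype)
        if isinstance(output, tuple):
            return (hidden,) + output[1:]
        return hidden
    return hook

# Example (Llama-style module paths assumed): derive a crude "positive sentiment"
# direction from English prompts, then apply it while continuing a Chinese prompt.
# direction = layer_states("great wonderful")[15] - layer_states("awful terrible")[15]
# handle = model.model.layers[15].register_forward_hook(make_steering_hook(direction))
# ... generate from a Chinese prompt, compare with/without the hook ...
# handle.remove()
```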
-----
📊 Results:
→ High cosine similarity between translations in middle layers across multiple models (Llama-2, Llama-3, Baichuan-2, BLOOM)
→ For English-dominant models, intermediate-layer representations of Chinese inputs assign higher probability to English tokens than to Chinese ones when decoded through the output embedding (a logit-lens-style check is sketched below)
→ Similar patterns emerge for arithmetic expressions, code, and multimodal inputs
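A logit-lens-style sketch of the second result: decode an intermediate hidden state with the output embedding and inspect which tokens rank highest. It reuses `model` and `tok` from the first snippet; the layer index, prompt, and Llama-style attribute paths are illustrative assumptions.

```python
# Hedged sketch: which vocabulary items does a middle layer "think of"
# while the model reads a Chinese prompt?
import torch

def top_tokens_at_layer(text, layer, k=5):
    ids = tok(text, return_tensors="pt")
    with torch.no_grad():
        out = model(**ids)
    hidden = out.hidden_states[layer][0, -1]   # last-token state at `layer`
    hidden = model.model.norm(hidden)          # final RMSNorm before unembedding (Llama-style)
    logits = model.lm_head(hidden)             # project into vocabulary space
    probs = torch.softmax(logits, dim=-1)
    top = torch.topk(probs, k)
    return [(tok.decode(int(i)), p.item()) for i, p in zip(top.indices, top.values)]

# For an English-dominant model, the semantic-hub result predicts that the top
# tokens for a Chinese prompt at a middle layer are often English words related
# to the input, before the final layers return to Chinese.
print(top_tokens_at_layer("今天天气很好，我们去", layer=16))
```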