"The Semantic Hub Hypothesis: Language Models Share Semantic Representations Across Languages and Modalities"

The podcast on this paper is generated with Google's Illuminate.

Ever wondered how LLMs juggle different languages? They're not secretly translating everything to English first, but they do route it all through a shared space that sits closest to English!

LLMs process diverse inputs by mapping them to a shared semantic space anchored in their dominant training language

https://arxiv.org/abs/2411.04986

🎯 Original Problem:

Modern LLMs can process diverse inputs (different languages, code, math, images, audio) but we don't understand how they handle these different data types with a single set of parameters.

This paper posits that LLMs don't just translate between languages: they build a universal semantic map

-----

🔍 Solution in this Paper:

→ The paper introduces the "semantic hub hypothesis" - LLMs develop a shared representation space that integrates information from different languages and modalities

→ This hub places semantically similar inputs close together in the model's intermediate layers, even when they come from different sources

→ For English-dominant models, the representations often align closest to English tokens even when processing non-English inputs

→ The model actively uses this shared space during processing, not just as a byproduct of training
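The alignment-with-English claim above is typically probed with a "logit lens" style readout: project an intermediate hidden state through the model's unembedding matrix and see which token it lands nearest. Here is a toy numpy sketch of that idea; the vocabulary, unembedding matrix, and hidden state are all fabricated for illustration (real experiments use the model's actual weights):

```python
import numpy as np

# Toy vocabulary: an English word, its romanized Chinese counterpart
# ("hua" standing in for 花), and an unrelated word.
vocab = ["flower", "hua", "book"]

# Fabricated unembedding matrix: one row of logit weights per token.
unembed = np.array([
    [1.0, 0.0, 0.0],   # "flower"
    [0.8, 0.6, 0.0],   # "hua"
    [0.0, 0.0, 1.0],   # "book"
])

def logit_lens(hidden):
    """Decode a hidden state to its highest-logit vocabulary token."""
    logits = unembed @ hidden          # one logit per vocab entry
    return vocab[int(np.argmax(logits))]

# A fabricated middle-layer state for the Chinese input 花, placed so
# that it decodes to the English token, mimicking what the paper
# reports for English-dominant models.
middle_state = np.array([0.9, 0.1, 0.0])
print(logit_lens(middle_state))        # prints "flower"
```

The point of the sketch is only the mechanism: nothing forces a non-English input's intermediate state to decode to its own language's tokens, and in English-dominant models it often doesn't.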

-----

💡 Key Insights:

→ LLMs naturally develop an integrated shared space during training without requiring explicit alignment

→ The shared representation space is causally used during processing: intervening on it changes model outputs, so it isn't just a structural coincidence

→ The dominant training language (like English) serves as an anchor for processing other types of inputs

→ This finding enables better interpretation and control of model behavior across languages/modalities

-----

📊 Results:

→ High cosine similarity between translation pairs in middle layers, consistent across Llama-2, Llama-3, Baichuan-2, and BLOOM

→ For English-dominant models, intermediate-layer representations of Chinese inputs assign higher probability to English tokens than to Chinese ones

→ Similar patterns emerge for arithmetic expressions, code, and multimodal inputs
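The translation-pair result can be illustrated with a small sketch of the measurement itself: collect one hidden state per layer for each input and compare them layer by layer with cosine similarity. The per-layer states below are fabricated (a shared "meaning" component weighted most heavily in the middle layer, plus input-specific noise) purely to mimic the reported mid-layer peak:

```python
import numpy as np

def layerwise_cosine(h_a, h_b):
    """Cosine similarity between two inputs' hidden states at each layer.

    h_a, h_b: arrays of shape (num_layers, hidden_dim), holding one
    representation per layer (e.g. the last-token hidden state).
    """
    dots = np.sum(h_a * h_b, axis=1)
    norms = np.linalg.norm(h_a, axis=1) * np.linalg.norm(h_b, axis=1)
    return dots / norms

# Fabricated 3-layer, 8-dim states for a sentence and its translation.
rng = np.random.default_rng(0)
shared = rng.normal(size=(3, 8))             # common meaning per layer
noise_en = rng.normal(size=(3, 8))           # input-specific variation
noise_zh = rng.normal(size=(3, 8))
weight = np.array([[0.2], [3.0], [0.5]])     # middle layer most shared
h_en = weight * shared + noise_en
h_zh = weight * shared + noise_zh

sims = layerwise_cosine(h_en, h_zh)
print(np.round(sims, 2))                     # similarity per layer
```

With real models, `h_en` and `h_zh` would come from a forward pass (e.g. Hugging Face Transformers with `output_hidden_states=True`); the paper's finding is that the resulting similarity curve peaks in the middle layers.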
