"Frame Representation Hypothesis: Multi-Token LLM Interpretability and Concept-Guided Text Generation"

The podcast on this paper is generated with Google's Illuminate.

Frame Representation helps decode the black box of LLM word understanding.

By treating words as frames, we can peek inside an LLM's thought process.

This paper extends the Linear Representation Hypothesis to multi-token words in LLMs by introducing the Frame Representation Hypothesis, enabling better concept understanding and controlled text generation.

-----

https://arxiv.org/abs/2412.07334v1

🤔 Original Problem:

LLMs lack interpretability because existing analysis methods operate on single tokens, while most words span multiple tokens. This mismatch makes it difficult to understand and control how models represent linguistic concepts.

-----

🔍 Solution in this Paper:

→ Introduces Frame Representation Hypothesis (FRH) to model multi-token words as ordered sequences of vectors called frames.

→ Proposes Word Frames, representing each multi-token word as an ordered sequence of linearly independent token vectors in high-dimensional space.
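A minimal sketch of the word-frame idea: stack a word's token embedding vectors into a matrix and check linear independence via matrix rank. The embeddings below are random stand-ins, not real LLM unembedding rows, and `d_model` is a hypothetical dimension.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model = 64  # hypothetical embedding dimension (stand-in)

def word_frame(token_vectors):
    """Stack a word's token vectors into a (num_tokens, d_model) frame."""
    return np.stack(token_vectors)

# A 3-token word: the frame has full rank iff its token vectors are
# linearly independent, as the paper reports for over 99% of words.
tokens = [rng.standard_normal(d_model) for _ in range(3)]
frame = word_frame(tokens)
print(frame.shape)                   # (3, 64)
print(np.linalg.matrix_rank(frame))  # 3 -> linearly independent
```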

→ Develops Concept Frames as centroids of word sets sharing common meanings.
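The centroid idea can be sketched as an elementwise average of word frames that share a meaning. For simplicity this assumes all frames have the same number of tokens; the synthetic frames are placeholders for real ones.

```python
import numpy as np

rng = np.random.default_rng(1)

def concept_frame(word_frames):
    """Average a list of (num_tokens, d_model) frames elementwise."""
    return np.mean(np.stack(word_frames), axis=0)

# Three synthetic 2-token word frames belonging to one concept.
frames = [rng.standard_normal((2, 16)) for _ in range(3)]
centroid = concept_frame(frames)
print(centroid.shape)  # (2, 16)
```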

→ Creates Top-k Concept-Guided Decoding to steer text generation using chosen concepts.
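A rough sketch of top-k concept guidance: among the model's k most likely next tokens, pick the one whose embedding aligns best with a target concept direction. Logits, embeddings, and the concept vector here are all synthetic stand-ins for the real model quantities.

```python
import numpy as np

rng = np.random.default_rng(2)
vocab, d = 100, 16
embeddings = rng.standard_normal((vocab, d))  # stand-in unembedding matrix
logits = rng.standard_normal(vocab)           # stand-in next-token logits
concept = rng.standard_normal(d)              # stand-in concept direction

def concept_guided_pick(logits, embeddings, concept, k=10):
    topk = np.argsort(logits)[-k:]      # k most likely candidate tokens
    sims = embeddings[topk] @ concept   # alignment with the concept
    return topk[np.argmax(sims)]        # steer generation toward the concept

tok = concept_guided_pick(logits, embeddings, concept)
print(0 <= tok < vocab)  # True
```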

→ Leverages WordNet to build over 100,000 Concept Frames across multiple languages.

-----

💡 Key Insights:

→ Over 99% of words show linear independence among their token vectors

→ Hindi and Thai demonstrate higher susceptibility to concept guidance

→ Model biases can be exposed and potentially remediated through concept-guided generation

→ Concept Frames preserve semantic relationships while handling multi-token structures

-----

📊 Results:

→ Validated on Llama 3.1, Gemma 2, and Phi 3 families

→ Successfully represents words up to 3-4 tokens with near-maximum matrix ranks

→ Demonstrates gender and language biases through controlled text generation

→ Shows potential for safer and more transparent LLM operations
