"Improving Contextual Faithfulness of Large Language Models via Retrieval Heads-Induced Optimization"

The podcast below on this paper was generated with Google's Illuminate.

Retrieval-augmented LLMs hallucinate in long-form question answering (LFQA).

This work introduces Retrieval Heads-Induced Optimization (RHIO) to improve contextual faithfulness. RHIO uses retrieval heads to generate realistic unfaithful training examples and teaches LLMs to distinguish between faithful and unfaithful generations.

-----

Paper - https://arxiv.org/abs/2501.13573

Original Problem 🙁:

→ Retrieval-augmented large language models (LLMs) often generate unfaithful responses in LFQA, eroding user trust.

-----

Solution in this Paper 💡:

→ RHIO augments unfaithful training data by masking retrieval heads, the attention heads responsible for retrieving information from the context (see the first sketch after this list).

→ RHIO utilizes special control tokens ([POS], [NEG]) to fine-tune LLMs, teaching them to discriminate between faithful and unfaithful responses.

→ At inference, RHIO uses contrastive decoding to amplify the difference between outputs induced by these control tokens, further enhancing faithfulness (see the second sketch below).
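
To make the head-masking step concrete, here is a minimal PyTorch sketch of how one might zero out selected retrieval heads to sample realistic unfaithful responses. The head indices, the LLaMA-style module names (`self_attn`, `o_proj`), and the helper names are illustrative assumptions, not the paper's exact implementation.

```python
import torch

# Hypothetical (layer, head) pairs identified as retrieval heads, e.g. heads
# whose attention tends to copy tokens from the retrieved context.
# Indices below are illustrative only.
RETRIEVAL_HEADS = {12: [3, 7], 17: [9], 20: [1]}

def mask_heads(attn_module, head_ids, num_heads, head_dim):
    """Zero the listed heads' outputs just before the attention output
    projection, so their contribution never reaches the residual stream.
    Assumes a LLaMA-style module with `o_proj` whose input is laid out as
    [batch, seq, num_heads * head_dim]."""
    def pre_hook(module, args):
        x = args[0].clone()
        x = x.view(*x.shape[:-1], num_heads, head_dim)
        x[..., head_ids, :] = 0.0          # knock out the retrieval heads
        return (x.flatten(-2),)            # restore [batch, seq, hidden]
    return attn_module.o_proj.register_forward_pre_hook(pre_hook)

def sample_unfaithful(model, input_ids, num_heads, head_dim, **gen_kwargs):
    """Attach the masking hooks, sample a context-unfaithful response, detach."""
    handles = [
        mask_heads(model.model.layers[layer].self_attn, heads, num_heads, head_dim)
        for layer, heads in RETRIEVAL_HEADS.items()
    ]
    try:
        return model.generate(input_ids, **gen_kwargs)
    finally:
        for h in handles:
            h.remove()
```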
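
And a similarly hedged sketch of the control-token plus contrastive-decoding idea: the fine-tuned model generates faithful text after [POS] and unfaithful text after [NEG], and at inference the two conditional logit streams are contrasted. The logit-combination rule and the `alpha` weight below are assumed forms for illustration; the paper's exact formulation may differ.

```python
import torch

@torch.no_grad()
def rhio_contrastive_decode(model, tokenizer, prompt, max_new_tokens=256, alpha=0.5):
    """Greedy decoding that amplifies the gap between [POS]- and [NEG]-conditioned
    predictions. Assumes [POS]/[NEG] were added as special tokens during fine-tuning."""
    ids_pos = tokenizer("[POS] " + prompt, return_tensors="pt").input_ids
    ids_neg = tokenizer("[NEG] " + prompt, return_tensors="pt").input_ids
    out = []
    for _ in range(max_new_tokens):
        logits_pos = model(ids_pos).logits[:, -1, :]  # faithful-mode logits
        logits_neg = model(ids_neg).logits[:, -1, :]  # unfaithful-mode logits
        # Assumed contrastive rule: boost tokens the faithful mode prefers,
        # penalize tokens the unfaithful mode prefers.
        next_id = ((1 + alpha) * logits_pos - alpha * logits_neg).argmax(-1, keepdim=True)
        if next_id.item() == tokenizer.eos_token_id:
            break
        out.append(next_id.item())
        ids_pos = torch.cat([ids_pos, next_id], dim=-1)
        ids_neg = torch.cat([ids_neg, next_id], dim=-1)
    return tokenizer.decode(out)
```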

-----

Key Insights from this Paper 🤔:

→ Retrieval heads in LLMs are crucial for maintaining contextual faithfulness in LFQA.

→ Masking retrieval heads produces realistic unfaithful examples, mimicking model-intrinsic errors.

→ Explicitly teaching LLMs to distinguish faithful and unfaithful generations enhances contextual faithfulness.

-----

Results ✅:

→ RHIO improves faithfulness on GroundBench, with gains of 12.84% and 12.59% for the 7B and 13B models, respectively.

→ RHIO even outperforms GPT-4o.
