"Inference Scaling for Bridging Retrieval and Augmented Generation"

A podcast on this paper was generated with Google's Illuminate.

Smart passage shuffling reveals true importance, making RAG more reliable.

MoI (Mixture-of-Intervention) fixes position bias in RAG systems by reordering retrieved passages based on their true utility, improving answer quality by 7 points on benchmarks[1].

By modeling position bias, MoI helps LLMs judge passages on their content rather than their order.

-----

https://arxiv.org/abs/2412.10684

🤔 Original Problem:

→ RAG systems suffer from a generator bias in which better retrieval can actually hurt performance[1].

→ Current reranking approaches like RankGPT fail to improve RAG despite better retrieval quality[1].

-----

🔧 Solution in this Paper:

→ MoI observes how passages perform in different positions through multiple forward passes[1].

→ It separates passage utility from position bias using parallel observations[1].

→ The method aggregates outcomes from different permutations to estimate each passage's true importance (see the sketch after this list)[1].

→ MoI leverages the retriever's prior knowledge to reduce computational cost[1].

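A minimal Python sketch of this permute-and-aggregate idea, under a few assumptions: generate_fn (an LLM call that answers from a list of passages) and score_fn (an answer-quality metric such as ROUGE-L or EM) are hypothetical callables, and the fixed "probe position" plus mean aggregation are one simple way to control for position, not the paper's exact estimator.

```python
# Illustrative sketch, not the paper's implementation.
# generate_fn(query, passages) -> answer text, score_fn(query, answer) -> float
# are assumed callables supplied by the user.
import random
from statistics import mean

def estimate_passage_utility(query, passages, generate_fn, score_fn,
                             probe_position=0, num_trials=4, seed=0):
    """Estimate each passage's utility with its position held fixed.

    For every passage, pin it at probe_position, shuffle the remaining
    passages, generate an answer, and score it. Averaging over trials gives
    a utility estimate in which position effects are controlled, so order
    no longer masks which passages actually help the generator.
    """
    rng = random.Random(seed)
    utilities = []
    for idx, passage in enumerate(passages):
        others = [p for j, p in enumerate(passages) if j != idx]
        trial_scores = []
        for _ in range(num_trials):
            rng.shuffle(others)
            ordered = list(others)
            ordered.insert(probe_position, passage)
            answer = generate_fn(query, ordered)
            trial_scores.append(score_fn(query, answer))
        utilities.append(mean(trial_scores))
    return utilities

def reorder_by_utility(passages, utilities):
    """Feed the generator the passages it actually benefits from first."""
    ranked = sorted(range(len(passages)), key=lambda i: utilities[i], reverse=True)
    return [passages[i] for i in ranked]
```

This brute-force version spends num_trials x len(passages) extra forward passes per query; the paper's use of the retriever's prior to cut that cost is not implemented here.
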
-----

💡 Key Insights:

→ Position bias makes LLMs weigh passages differently based on their order (see the probe sketch after this list)[1]

→ Sometimes downranking relevant passages can improve overall performance[1]

→ Larger models like LLaMA-3 70B still suffer from position bias[1]

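A quick way to check this on your own setup: score answers for the same retrieved passages under several random orderings and look at the spread. The probe below reuses the assumed generate_fn/score_fn callables from the sketch above and is likewise illustrative.

```python
import random

def position_bias_probe(query, passages, generate_fn, score_fn,
                        num_orders=5, seed=0):
    """Score answers for several random orderings of the same passages.

    Returns the spread (max - min) of answer quality across orderings; a
    large spread means order, rather than content, is driving the output,
    i.e. the generator exhibits position bias.
    """
    rng = random.Random(seed)
    qualities = []
    for _ in range(num_orders):
        order = list(passages)
        rng.shuffle(order)
        qualities.append(score_fn(query, generate_fn(query, order)))
    return max(qualities) - min(qualities)
```
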
-----

📊 Results:

→ Improved ROUGE-L on MS MARCO by about 6.5 points (44.30 vs. 37.75 baseline)[1]

→ Boosted HotpotQA Exact Match score by 7 points (55.67 vs 48.54)[1]

→ Cut inference cost by about 90% while still retaining roughly half of the performance gains[1]
