"GeAR: Generation Augmented Retrieval"

The podcast below was generated from this paper with Google's Illuminate.

GeAR enhances document retrieval by adding generation capabilities that locate and explain relevant information, making search results more interpretable and fine-grained.

https://arxiv.org/abs/2501.02772

🔍 Original Problem:

Traditional bi-encoder retrieval systems compress complex query-document relationships into a single similarity score, making it hard to understand why a document matches and to locate the specific relevant sections.

⚡ Solution in this Paper:

→ GeAR introduces a novel architecture combining bi-encoder retrieval with generation capabilities through a fusion encoder and text decoder

→ The system processes query-document-information triples using contrastive learning to optimize similarity matching

→ A text decoder generates relevant snippets from documents based on fused query-document representations

→ High-quality training data is synthesized with LLMs to support these added capabilities
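
To make the contrastive-learning step more concrete, here is a minimal sketch of an in-batch contrastive loss over paired query and document embeddings. The summary above does not spell out the paper's exact objective, so the InfoNCE form, the function name `info_nce_loss`, and the `temperature` value are all assumptions for illustration, not GeAR's actual implementation.

```python
import numpy as np

def info_nce_loss(query_emb, doc_emb, temperature=0.05):
    """In-batch contrastive loss (illustrative, not the paper's code).

    Row i of query_emb is assumed to match row i of doc_emb;
    every other row in the batch serves as a negative.
    """
    # L2-normalize so the dot product equals cosine similarity.
    q = query_emb / np.linalg.norm(query_emb, axis=1, keepdims=True)
    d = doc_emb / np.linalg.norm(doc_emb, axis=1, keepdims=True)

    # (batch, batch) similarity matrix, sharpened by the temperature.
    logits = q @ d.T / temperature

    # Numerically stable log-softmax over each row.
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

    # The positive for query i sits on the diagonal.
    return float(-np.mean(np.diag(log_probs)))


rng = np.random.default_rng(0)
emb = rng.normal(size=(4, 8))
loss_matched = info_nce_loss(emb, emb)         # each query paired with its own document
loss_shuffled = info_nce_loss(emb, emb[::-1])  # pairs deliberately misaligned
```

With correctly aligned pairs the loss is small, and it grows when the pairing is scrambled, which is the signal that pushes matching queries and documents together in embedding space.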

💡 Key Insights:

→ Generation and localization capabilities are synergistic: better generation leads to better information localization

→ Peak localization performance occurs in intermediate layers rather than the final layer

→ The approach maintains retrieval efficiency while adding interpretability

📊 Results:

→ Achieves 0.961 Recall@5 and 0.903 MAP@5 on information retrieval tasks

→ Demonstrates 0.885 Recall@1 and 0.965 MAP@1 for fine-grained localization

→ Generation quality reaches 87.4 ROUGE-1 and 87.1 ROUGE-L scores
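
For readers unfamiliar with the retrieval metrics, the sketch below shows one common way to compute Recall@k and MAP@k. Definitions vary between papers (in particular the normalizer used for average precision), so this may not match the exact formulas behind the numbers above; the function names and the toy document IDs are hypothetical.

```python
def recall_at_k(ranked, relevant, k):
    """Fraction of the relevant items that appear in the top-k results."""
    if not relevant:
        return 0.0
    hits = sum(1 for item in ranked[:k] if item in relevant)
    return hits / len(relevant)


def average_precision_at_k(ranked, relevant, k):
    """Average of precision values at each rank where a relevant item appears.

    Normalized by min(len(relevant), k), one common convention.
    """
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, item in enumerate(ranked[:k], start=1):
        if item in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / min(len(relevant), k)


def mean_average_precision_at_k(all_ranked, all_relevant, k):
    """MAP@k: mean of AP@k over all queries."""
    scores = [
        average_precision_at_k(ranked, relevant, k)
        for ranked, relevant in zip(all_ranked, all_relevant)
    ]
    return sum(scores) / len(scores)


# Toy example: one query, three retrieved documents, two relevant ones.
ranked = ["d3", "d1", "d7"]
relevant = {"d1", "d9"}
recall = recall_at_k(ranked, relevant, k=3)
ap = average_precision_at_k(ranked, relevant, k=3)
```

In the toy example only one of the two relevant documents is retrieved, and it appears at rank 2, so both metrics are penalized: recall for the missing document, average precision for the late rank.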
