Dynamic retrieval, guided by uncertainty, improves efficiency in retrieval-augmented generation.
This paper makes Retrieval-Augmented Generation (RAG) more efficient by invoking retrieval only when the LLM is uncertain. This cuts unnecessary retrievals, especially in multi-hop question answering, where multiple retrievals are often needed.
-----
https://arxiv.org/abs/2501.09292
Original Problem 🤔:
→ Existing RAG systems mostly retrieve deterministically, whether or not the model actually needs the extra context, which wastes retrieval calls.
-----
Solution in this Paper 💡:
→ This paper explores uncertainty detection methods to trigger retrieval dynamically.
→ It evaluates various uncertainty metrics to decide "to retrieve or not to retrieve".
→ The system generates a temporary sentence and assesses its uncertainty.
→ If uncertainty exceeds a threshold, a subquery is generated for retrieval.
→ Retrieved information is then added to the LLM context for improved generation.
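
The steps above can be sketched as a small control loop. This is a minimal illustration, not the paper's implementation: the callables `generate_sentence`, `uncertainty`, `make_subquery`, and `retrieve` are hypothetical placeholders for an LLM, an uncertainty metric, and a retriever, and the default `threshold`, the `max_sentences` cap, and the choice to regenerate the sentence after retrieval are all assumptions.

```python
from typing import Callable, List, Tuple


def dynamic_rag(
    question: str,
    generate_sentence: Callable[[str], str],      # hypothetical: LLM call returning one sentence
    uncertainty: Callable[[str, str], float],     # hypothetical: score for (context, draft)
    make_subquery: Callable[[str, str], str],     # hypothetical: builds a retrieval query
    retrieve: Callable[[str], List[str]],         # hypothetical: returns retrieved passages
    threshold: float = 0.5,                       # assumed value, tune per metric
    max_sentences: int = 5,                       # assumed cap on answer length
) -> Tuple[str, int]:
    """Return the answer text and the number of retrieval calls made."""
    context = question
    answer: List[str] = []
    retrievals = 0

    for _ in range(max_sentences):
        # Step 1: generate a temporary sentence from the current context.
        draft = generate_sentence(context)
        if not draft:
            break

        # Step 2: retrieve only if the draft looks uncertain.
        if uncertainty(context, draft) > threshold:
            subquery = make_subquery(context, draft)
            passages = retrieve(subquery)
            retrievals += 1
            # Step 3: add the retrieved text to the context.
            context = context + "\n" + "\n".join(passages)
            # Assumption: the uncertain draft is regenerated with the new evidence.
            draft = generate_sentence(context)

        answer.append(draft)
        context = context + "\n" + draft

    return " ".join(answer), retrievals
```

Returning the retrieval count alongside the answer makes it easy to compare a metric's cost against an always-retrieve setup, which is the trade-off the paper measures.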
-----
Key Insights from this Paper 💎:
→ Uncertainty detection can effectively reduce retrieval calls without significantly sacrificing accuracy.
→ The Eccentricity metric balances retrieval efficiency and task performance well.
→ Simpler methods like Degree Matrix (Jaccard) minimize retrievals while maintaining reasonable performance.
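
To make the Degree Matrix (Jaccard) idea concrete, here is a rough sketch of one common formulation: sample several responses, build a pairwise Jaccard similarity matrix W, and score uncertainty as trace(m·I − D) / m², where D is the degree matrix of W and m is the number of samples. The whitespace tokenization and the function names are assumptions for illustration, and the paper's exact setup may differ.

```python
from typing import List


def jaccard(a: str, b: str) -> float:
    """Jaccard similarity between the word sets of two responses."""
    set_a, set_b = set(a.lower().split()), set(b.lower().split())
    if not set_a and not set_b:
        return 1.0  # two empty responses are treated as identical
    return len(set_a & set_b) / len(set_a | set_b)


def degree_uncertainty(samples: List[str]) -> float:
    """Uncertainty from the degree matrix of pairwise Jaccard similarities.

    Returns 0.0 when all samples agree and approaches 1.0 as they diverge.
    """
    m = len(samples)
    if m == 0:
        raise ValueError("need at least one sampled response")
    # Degree of response i = sum of its similarities to every sample (itself included).
    degrees = [sum(jaccard(s_i, s_j) for s_j in samples) for s_i in samples]
    # trace(m*I - D) / m^2
    return sum(m - d for d in degrees) / (m * m)


if __name__ == "__main__":
    agreeing = ["Paris is the capital", "Paris is the capital", "Paris is the capital"]
    diverging = ["Paris", "Lyon", "Marseille"]
    print(degree_uncertainty(agreeing))   # identical samples give zero uncertainty
    print(degree_uncertainty(diverging))  # disjoint samples give a high score
```

Because the score only needs set overlaps between sampled responses, it is cheap to compute, which fits the paper's description of this metric as a simpler method.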
-----
Results 📊:
→ Eccentricity achieves the highest F1 score (0.605) with fewer retrievals than the Always Retrieve baseline.
→ Degree Matrix (Jaccard) achieves an F1 score of 0.524 with the fewest retrievals.
→ The Always Retrieve baseline achieves an F1 score of 0.552 with significantly more retrievals.