"Sliding Windows Are Not the End: Exploring Full Ranking with Long-Context Large Language Models"

A podcast on this paper, generated with Google's Illuminate.

Single-shot passage ranking with LLMs reduces API costs while improving accuracy

With proper fine-tuning, LLMs can now rank all passages in a single shot instead of using sliding windows, making search ranking both faster and more accurate.

-----

https://arxiv.org/abs/2412.14574v1

🔍 Original Problem:

Current LLM rerankers use a sliding window strategy for passage ranking, repeatedly processing overlapping groups of passages, which causes high API costs and latency. The inefficiency stems from the models' limited input lengths.
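
To make the cost structure concrete, here is a minimal Python sketch of the sliding window procedure. The `llm_rank` helper is a hypothetical stand-in for a single listwise LLM ranking call, and the window/stride values (20/10) are common defaults rather than the paper's exact settings.

```python
from typing import List

def llm_rank(query: str, passages: List[str]) -> List[str]:
    # Hypothetical placeholder for one listwise LLM call: a real version would
    # prompt the model with the query plus numbered passages and parse the
    # returned permutation. A trivial lexical-overlap score stands in here.
    score = lambda p: sum(w in p.lower() for w in query.lower().split())
    return sorted(passages, key=score, reverse=True)

def sliding_window_rank(query: str, passages: List[str],
                        window: int = 20, stride: int = 10) -> List[str]:
    # Slide a window from the bottom of the candidate list to the top; each
    # step costs one LLM call, and overlapping passages are re-processed.
    ranked = list(passages)
    end = len(ranked)
    while end > 0:
        start = max(0, end - window)
        ranked[start:end] = llm_rank(query, ranked[start:end])
        if start == 0:
            break
        end -= stride
    return ranked

# Full ranking collapses all of the above into a single long-context call:
# ranked = llm_rank(query, passages)
```

With 100 passages, window 20, and stride 10, this makes nine LLM calls covering roughly 180 passage slots; full ranking replaces all of them with one long-context call.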

-----

🛠️ Solution in this Paper:

→ The paper introduces a full ranking strategy using long-context LLMs that process all candidate passages simultaneously

→ A multi-pass sliding window approach generates complete ranking lists for training labels

→ An importance-aware loss function weights passage IDs based on their rank position, ensuring top-ranked passages receive more attention
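
The snippet below is a minimal PyTorch sketch of what such rank-position weighting could look like. The 1/rank schedule and the function signature are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def importance_aware_loss(logits: torch.Tensor,
                          target_ids: torch.Tensor,
                          ranks: torch.Tensor) -> torch.Tensor:
    # logits:     (L, V) model logits at the L positions that emit passage IDs
    # target_ids: (L,)   gold passage-ID tokens, ordered from rank 1 downward
    # ranks:      (L,)   1-based rank of each emitted passage ID
    per_token = F.cross_entropy(logits, target_ids, reduction="none")  # (L,)
    weights = 1.0 / ranks.float()  # top-ranked passages get the largest weights
    return (weights * per_token).sum() / weights.sum()
```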

-----

💡 Key Insights:

→ Full ranking is more efficient but less effective in zero-shot settings

→ With supervised fine-tuning, full ranking outperforms the sliding window approach

→ Initial passage order significantly impacts ranking performance

→ Increasing the number of ranking passes improves performance, but gains converge after 3-4 passes
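
Tying the last insight back to the multi-pass labeling step, label generation could look like this sketch, reusing the hypothetical `sliding_window_rank` from the earlier snippet:

```python
def multi_pass_rank(query: str, passages: List[str], passes: int = 3) -> List[str]:
    # Repeat the sliding-window pass so the ordering stabilizes; since gains
    # converge after roughly 3-4 passes, a small fixed budget suffices for
    # generating complete ranking lists as training labels.
    for _ in range(passes):
        passages = sliding_window_rank(query, passages)
    return passages
```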

-----

📊 Results:

→ 2.2-point improvement in NDCG@10 on TREC DL19

→ 29.3% reduction in latency compared to the sliding window approach

→ 50% reduction in API costs

→ Consistent gains across candidate set sizes from 20 to 100 passages

-----

Are you into AI and LLMs❓ Join my daily AI newsletter. I will send you 7 emails a week analyzing the highest signal AI developments. ↓↓

🎉 https://rohanpaul.substack.com/
