"Drowning in Documents: Consequences of Scaling Reranker Inference"

The podcast on this paper is generated with Google's Illuminate.

Rerankers get worse at finding relevant documents as they process more data

Current rerankers drown in noise when searching through large document collections

Scaling up reranker inference leads to unexpected quality degradation in document search

This paper challenges the common assumption that rerankers improve retrieval quality as they score more documents. Through extensive experiments, the researchers find that modern rerankers show diminishing returns and then outright quality degradation as the candidate pool grows, often performing worse than the first-stage retriever alone when reranking full datasets.

-----

https://arxiv.org/abs/2411.11767

🔍 Original Problem:

Modern information retrieval systems use two-stage pipelines: a cheap first-stage retriever selects candidate documents, and a more expensive reranker rescores them. The working assumption is that letting the reranker score more documents can only improve final quality.
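
A minimal sketch of such a two-stage pipeline, assuming the sentence-transformers library; the model names and tiny corpus are illustrative, not the paper's setup:

```python
# Two-stage retrieval sketch (illustrative; not the paper's exact setup).
# Assumes the `sentence-transformers` package; model names are common
# public checkpoints chosen for illustration.
import numpy as np
from sentence_transformers import SentenceTransformer, CrossEncoder

corpus = [
    "Rerankers rescore candidates from a first-stage retriever.",
    "Dense retrievers embed queries and documents into one vector space.",
    "Recall@10 measures how many relevant documents appear in the top 10.",
]
query = "How do two-stage retrieval pipelines work?"

# Stage 1: cheap first-stage retrieval with a bi-encoder.
retriever = SentenceTransformer("all-MiniLM-L6-v2")
doc_emb = retriever.encode(corpus, normalize_embeddings=True)
query_emb = retriever.encode(query, normalize_embeddings=True)
scores = doc_emb @ query_emb          # cosine similarity per document
top_k = np.argsort(-scores)[:2]       # keep the k best candidates

# Stage 2: expensive reranking with a cross-encoder over the k candidates only.
reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")
pairs = [(query, corpus[i]) for i in top_k]
rerank_scores = reranker.predict(pairs)
reranked = [corpus[top_k[j]] for j in np.argsort(-rerank_scores)]
print(reranked)
```

The paper's question is what happens as k grows: the reranker sees more candidates, but also far more noise.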

-----

🛠️ Solution in this Paper:

→ The researchers conducted comprehensive experiments testing rerankers' performance when scoring progressively larger document sets.

→ They evaluated multiple state-of-the-art rerankers across 8 diverse datasets, both academic and enterprise.

→ They introduced listwise reranking using LLMs as an alternative approach that shows better robustness.
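
Listwise reranking feeds the query together with a batch of candidate passages into a single prompt and asks the LLM for an ordering, instead of scoring each (query, document) pair independently. A minimal sketch of that prompt-and-parse loop; the prompt format and the `llm` callable are assumptions for illustration, not the paper's exact protocol:

```python
# Sketch of listwise reranking with an LLM (illustrative prompt format;
# `llm` is any text-in/text-out callable you supply, e.g. an API client).
import re

def listwise_rerank(llm, query: str, docs: list[str]) -> list[str]:
    # Present all candidates at once, identified by index.
    numbered = "\n".join(f"[{i}] {d}" for i, d in enumerate(docs))
    prompt = (
        f"Query: {query}\n\nPassages:\n{numbered}\n\n"
        "Rank the passages from most to least relevant to the query. "
        "Answer with the bracketed indices only, e.g. [2] > [0] > [1]."
    )
    answer = llm(prompt)
    order = [int(i) for i in re.findall(r"\[(\d+)\]", answer)]
    # Keep valid first-occurrence indices; append anything the model dropped.
    seen, ranking = set(), []
    for i in order:
        if 0 <= i < len(docs) and i not in seen:
            seen.add(i)
            ranking.append(i)
    ranking += [i for i in range(len(docs)) if i not in seen]
    return [docs[i] for i in ranking]
```

Because the model compares candidates against each other rather than scoring them in isolation, it is harder for a single noisy document to earn a spuriously high score.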

-----

💡 Key Insights:

→ Rerankers show diminishing returns beyond 100 documents and often degrade quality with larger sets

→ Reranking the full dataset often performs worse than using the first-stage retriever alone

→ Rerankers frequently assign high scores to irrelevant documents with minimal query overlap

→ Listwise reranking using LLMs shows promise for more robust performance

-----

📊 Results:

→ Recall@10 drops significantly when reranking more than 100 documents (see the sweep sketch after this list)

→ LLM-based listwise reranking maintains consistent performance even with larger document sets

→ Error rates in listwise reranking remain under 10% for most datasets
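
A sketch of the kind of depth sweep behind these curves: rerank the top-k first-stage candidates for increasing k and measure Recall@10 at each depth. The `rerank` callable and relevance judgments are placeholders, not the paper's evaluation code:

```python
# Sketch of the scaling sweep: Recall@10 as a function of reranking depth k.
# `first_stage_ranking` is a list of doc ids ordered by the retriever;
# `rerank(query, doc_ids)` reorders them (e.g. via a cross-encoder);
# `relevant` is the set of judged-relevant doc ids. All placeholders.

def recall_at_10(ranked_ids, relevant) -> float:
    hits = sum(1 for d in ranked_ids[:10] if d in relevant)
    return hits / max(1, len(relevant))

def depth_sweep(query, first_stage_ranking, relevant, rerank):
    results = {}
    for k in (10, 50, 100, 500, 1000):
        candidates = first_stage_ranking[:k]  # hand the reranker more documents
        results[k] = recall_at_10(rerank(query, candidates), relevant)
    return results  # the paper reports this curve flattening, then degrading
```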
