Generate questions from contexts first, then match them against user queries for better retrieval
HyQE, proposed in this paper, flips the script: instead of scoring query-context similarity directly, it generates hypothetical queries from contexts and scores query-query similarity
Let contexts tell you what questions they can answer before trying to match them
📚 https://arxiv.org/abs/2410.15262
🎯 Original Problem:
Context ranking in retrieval systems often fails when it relies on simple embedding similarity between queries and contexts. Existing LLM-based reranking solutions face scalability issues and typically require fine-tuning.
-----
🛠️ Solution in this Paper:
→ Introduces HyQE - a framework that uses LLMs to generate hypothetical queries from contexts
→ Ranks contexts by the similarity between the user query and these hypothetical queries (a minimal sketch follows this list)
→ Works offline - generates and stores hypothetical queries beforehand for reuse
→ Requires no LLM fine-tuning and works with both open-source and proprietary LLMs
→ Uses variational inference to preserve causal relationships between queries and contexts
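To make the pipeline concrete, here's a minimal sketch of the idea. The prompt wording, the model choices (gpt-3.5-turbo, all-MiniLM-L6-v2), the lam weight, and the mean pooling over hypothetical queries are illustrative assumptions, not the paper's exact formulation; any LLM and embedding model can be swapped in.

```python
import numpy as np
from openai import OpenAI
from sentence_transformers import SentenceTransformer

client = OpenAI()  # works with GPT-4, GPT-3.5, or an open-source LLM; no fine-tuning
embedder = SentenceTransformer("all-MiniLM-L6-v2")  # any embedding model works

def hypothetical_queries(context: str, n: int = 3) -> list[str]:
    """Ask the LLM which questions this context can answer, one per line."""
    prompt = (f"Write {n} short questions that the following passage answers, "
              f"one per line:\n\n{context}")
    resp = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
    )
    return [q.strip() for q in resp.choices[0].message.content.splitlines() if q.strip()]

def rank_contexts(user_query: str, contexts: list[str], lam: float = 0.5) -> list[str]:
    """Score = base query-context similarity + lam * mean similarity between
    the user query and that context's hypothetical queries (one possible
    combination rule, assumed here for illustration)."""
    q = embedder.encode(user_query, normalize_embeddings=True)
    scores = []
    for ctx in contexts:
        c = embedder.encode(ctx, normalize_embeddings=True)
        h = embedder.encode(hypothetical_queries(ctx), normalize_embeddings=True)
        scores.append(float(q @ c) + lam * float((h @ q).mean()))
    return [contexts[i] for i in np.argsort(scores)[::-1]]
```

Because the hypothetical queries depend only on the contexts, the LLM call can be moved entirely offline, which is where the scalability gain comes from.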
-----
💡 Key Insights:
→ Similarity between queries is more reliable than similarity between queries and contexts
→ Offline query generation makes it more scalable than existing LLM-based methods (see the offline/online split sketched after this list)
→ Hypothetical queries are constrained by context information, reducing hallucination risk
→ Compatible with existing retrieval methods like HyDE for additive improvements
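Reusing `embedder` and `hypothetical_queries` from the sketch above, this hypothetical two-phase split shows why the offline design scales: the LLM runs once per context at index time, and serving only embeds the user query. For brevity the online scorer uses the query-query term alone; the lam-weighted combination from the first sketch can be cached the same way.

```python
import numpy as np

# Offline phase: run the LLM once per context and persist the embeddings.
def build_hyqe_index(contexts: list[str]) -> dict[int, np.ndarray]:
    return {
        i: embedder.encode(hypothetical_queries(ctx), normalize_embeddings=True)
        for i, ctx in enumerate(contexts)
    }  # in practice, store these in a vector DB and reuse across all queries

# Online phase: no LLM call on the hot path, just one query embedding.
def rank_with_index(user_query: str, index: dict[int, np.ndarray]) -> list[int]:
    q = embedder.encode(user_query, normalize_embeddings=True)
    scores = {i: float((h @ q).mean()) for i, h in index.items()}
    return sorted(scores, key=scores.get, reverse=True)  # best context ids first
```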
-----
📊 Results:
→ Improved NDCG@10 scores across multiple benchmarks (DL19, DL20, COVID, NEWS, Touche)
→ Works effectively with different LLMs (GPT-4, GPT-3.5, Mistral) and embedding models
→ Shows consistent performance gains when combined with other retrieval methods