
"CORAL: Benchmarking Multi-turn Conversational Retrieval-Augmentation Generation"

The podcast on this paper is generated with Google's Illuminate.

CORAL, proposed in this paper, bridges the gap between single-turn and multi-turn RAG with Wikipedia-derived conversations and conversation-compression strategies

Wikipedia's hierarchical structure is transformed into natural conversations for better RAG evaluation

📚 https://arxiv.org/abs/2410.23090

🎯 Original Problem:

Current academic research focuses mainly on single-turn Retrieval-Augmented Generation (RAG), while real-world applications require handling multi-turn conversations. This gap creates challenges in managing conversation history, handling topic shifts, and maintaining response quality across extended dialogues.

-----

🔧 Solution in this Paper:

→ Introduced CORAL: A benchmark with 8,000 diverse conversations derived from Wikipedia

→ Developed a three-stage construction approach:

- Extract title trees from Wikipedia pages

- Sample conversation flows using 4 strategies (Linear Descent, Sibling-Inclusive, Single-Tree Random Walk, Dual-Tree Random Walk)

- Use GPT-4 to rewrite the sampled titles into contextualized, natural-language questions
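The flow-sampling step above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the toy title tree, node layout, and the two strategy functions (`linear_descent`, `sibling_inclusive`) are hypothetical stand-ins for the paper's Wikipedia title trees and four sampling strategies.

```python
import random

# Hypothetical title tree: each node is (title, [children]),
# standing in for a Wikipedia page's section hierarchy.
tree = ("Solar energy", [
    ("History", []),
    ("Technologies", [
        ("Photovoltaics", []),
        ("Concentrated solar power", []),
    ]),
    ("Economics", []),
])

def linear_descent(node, rng):
    """Linear Descent: follow one random root-to-leaf path of titles."""
    path = [node[0]]
    while node[1]:                      # while the node has children
        node = rng.choice(node[1])      # pick one child at random
        path.append(node[0])
    return path

def sibling_inclusive(node, rng):
    """Sibling-Inclusive: at each level, visit all siblings in a random
    order before descending into the last one visited."""
    flow = [node[0]]
    while node[1]:
        children = node[1][:]
        rng.shuffle(children)
        flow.extend(c[0] for c in children)  # cover siblings at this level
        node = children[-1]                  # then descend
    return flow

print(linear_descent(tree, random.Random(0)))
```

Each resulting title sequence is then handed to GPT-4 to be rewritten into a coherent question flow; the random-walk variants differ only in how the tree (or a pair of trees) is traversed.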

→ Created unified framework supporting 3 core tasks:

- Conversational passage retrieval

- Response generation

- Citation labeling
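The three tasks share one conversational structure, which can be sketched as a per-turn record. The field names below are illustrative, not the benchmark's actual schema:

```python
from dataclasses import dataclass, field

@dataclass
class Turn:
    question: str
    retrieved_passages: list[str] = field(default_factory=list)  # task 1: retrieval output
    response: str = ""                                           # task 2: generated answer
    citations: list[int] = field(default_factory=list)           # task 3: indices into retrieved_passages

@dataclass
class Conversation:
    turns: list[Turn] = field(default_factory=list)

    def history(self):
        """Flatten prior turns into the context fed to the next turn."""
        return [(t.question, t.response) for t in self.turns]
```

The point of the shared structure is that each task consumes the same history: retrieval conditions the search query on it, generation conditions the answer on it, and citation labeling links the answer back to the retrieved passages.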

→ Implemented conversation compression strategies:

- Last Response Strategy

- Rewrite Strategy

- LLM Summarization Strategy
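The three compression strategies can be sketched as alternative ways of building the model input from the history. This is a minimal sketch: `rewriter` and `summarizer` stand in for LLM calls, and the toy lambdas exist only so the sketch runs without a model.

```python
def last_response(history, question):
    """Last Response: condition only on the most recent answer."""
    prev = history[-1][1] if history else ""
    return f"{prev}\n{question}".strip()

def rewrite(history, question, rewriter):
    """Rewrite: turn the context-dependent question into a standalone one.
    `rewriter` is a stand-in for an LLM call."""
    return rewriter(history, question)

def llm_summarize(history, question, summarizer):
    """LLM Summarization: compress the whole history into a summary,
    then append the current question. `summarizer` is a stand-in."""
    return f"{summarizer(history)}\n{question}"

# Toy stand-ins (hypothetical) so the sketch is runnable:
toy_rewriter = lambda h, q: q
toy_summarizer = lambda h: " ".join(f"Q: {q} A: {a}" for q, a in h)

hist = [("What is CORAL?", "A multi-turn RAG benchmark.")]
print(last_response(hist, "How was it built?"))
```

All three trade context for input length: Last Response is cheapest but drops earlier turns, while summarization keeps a compressed view of the whole conversation.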

-----

💡 Key Insights:

→ Fine-tuned open-source LLMs outperform commercial closed-source LLMs in retrieval tasks

→ Shortening input length maintains response quality while improving citation accuracy

→ Performance gains plateau after 3B parameters for generation tasks

→ Citation labeling improves with larger models (3B to 7B parameters)

-----

📊 Results:

→ Achieved 23.2 MRR and 33.6 MAP in retrieval using KD-ANCE-C

→ Obtained 26.3 BLEU-1 score with Llama-3.1-8B-SFT for response generation

→ Reached 31.1% Citation Precision using Qwen2.5-7B-SFT with LLM Summarization
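The retrieval metrics reported above are standard ranking measures, computable as follows (a generic sketch; each inner list marks the relevance, 0 or 1, of a ranked passage list for one query):

```python
def mrr(rankings):
    """Mean Reciprocal Rank: average of 1/rank of the first relevant hit."""
    total = 0.0
    for relevant in rankings:
        for i, rel in enumerate(relevant, start=1):
            if rel:
                total += 1.0 / i
                break
    return total / len(rankings)

def average_precision(relevant):
    """Precision averaged over the ranks where relevant passages appear."""
    hits, score = 0, 0.0
    for i, rel in enumerate(relevant, start=1):
        if rel:
            hits += 1
            score += hits / i
    return score / max(hits, 1)

def map_score(rankings):
    """Mean Average Precision over all queries."""
    return sum(average_precision(r) for r in rankings) / len(rankings)
```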
