LongCite: Enabling LLMs to Generate Fine-grained Citations in Long-context QA

Generated this podcast with Google's Illuminate.

This paper's method teaches LLMs to pinpoint exact sentences, instead of chunks, when citing sources in long documents, making fact-checking easier and responses more accurate.

Results 📊:

• LongCite-8B/9B outperform GPT-4o by 6.4%/3.6% in citation F1 score

• 2x finer citation granularity vs proprietary models

• 7-9% improvement in response correctness over vanilla long-context SFT

• High agreement between human evaluation and automated metrics
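The citation F1 above is the harmonic mean of citation recall (how many gold supporting sentences the model cited) and citation precision (how many of its citations actually support the statement). A minimal sketch, assuming simple set overlap between cited and gold sentence indices — a simplification, since LongBench-Cite judges support with an LLM rather than exact index matching:

```python
def citation_f1(cited: set, gold: set) -> float:
    """Harmonic mean of citation recall and precision.

    cited: sentence indices the model cited for a statement
    gold:  sentence indices that actually support it
    (Set overlap is a toy stand-in for the benchmark's LLM judge.)
    """
    if not cited or not gold:
        return 0.0
    correct = len(cited & gold)
    precision = correct / len(cited)
    recall = correct / len(gold)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

print(citation_f1({3, 4, 7}, {3, 4, 5}))  # ≈ 0.667
```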

📚 https://arxiv.org/abs/2409.02897

Original Problem 🔍:

Current long-context LLMs lack citation capabilities, making it difficult for users to verify information and raising concerns about hallucinations.

-----

Key Insights from this Paper 💡:

• LongBench-Cite: Automated benchmark for LQAC (Long-Context Question Answering with Citations)

• "Coarse to Fine" (CoF): Pipeline for generating high-quality LQAC data

• SFT with citations improves response correctness and citation quality

• Sentence-level citations are more user-friendly than chunk-level
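Sentence-level citations are easy to verify mechanically. Here is a hedged sketch that resolves inline citations of the form [i-j] — a hypothetical bracketed sentence-range convention, not necessarily LongCite's exact output format — back to the source sentences they point at:

```python
import re

def cited_sentences(answer: str, source: list) -> list:
    """Resolve every [i-j] citation (1-based, inclusive) in an answer
    to the exact source sentences it references."""
    out = []
    for start, end in re.findall(r'\[(\d+)-(\d+)\]', answer):
        out.extend(source[int(start) - 1:int(end)])
    return out

source = ["Pandas eat bamboo.", "They live in China.", "They are bears."]
print(cited_sentences("Pandas are bamboo-eating bears.[1-1][3-3]", source))
# ['Pandas eat bamboo.', 'They are bears.']
```

With chunk-level citations the reader would instead have to scan an entire multi-sentence chunk to find the supporting evidence.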

-----

Solution in this Paper 🧠:

• CoF pipeline for LongCite-45k dataset creation:

- Generate QA pairs via self-instruct

- Retrieve chunks and add coarse citations

- Extract fine-grained sentence-level citations

- Filter low-quality instances

• Fine-tune LongCite-8B and LongCite-9B on LongCite-45k

• One-pass generation of responses with sentence-level citations
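The "coarse to fine" refinement step can be illustrated with a toy version: given chunk-level citations, narrow each cited chunk down to the specific sentences that support the statement. A word-overlap heuristic stands in here for the LLM judge the paper actually uses; the function and threshold are illustrative assumptions, not the paper's implementation:

```python
def refine_citations(statement, chunks, chunk_cites, min_overlap=2):
    """Coarse-to-fine (toy version): turn chunk citations into
    sentence spans.

    chunks:      list of chunks, each a list of sentences; sentences
                 are numbered 1-based globally across the context
    chunk_cites: 0-based indices of chunks cited for this statement
    Word overlap stands in for the paper's LLM-based support check.
    """
    words = set(statement.lower().split())
    # Global 1-based start index of each chunk.
    starts, n = [], 1
    for c in chunks:
        starts.append(n)
        n += len(c)
    # Keep sentences in cited chunks that share enough words.
    supported = []
    for ci in chunk_cites:
        for si, sent in enumerate(chunks[ci]):
            if len(words & set(sent.lower().split())) >= min_overlap:
                supported.append(starts[ci] + si)
    # Merge consecutive sentence numbers into [start, end] spans.
    spans = []
    for s in sorted(supported):
        if spans and s == spans[-1][1] + 1:
            spans[-1][1] = s
        else:
            spans.append([s, s])
    return [tuple(sp) for sp in spans]

chunks = [["The Eiffel Tower is in Paris.", "It was built in 1889."],
          ["Paris is the capital of France.", "France is in Europe."]]
statement = "The Eiffel Tower was built in 1889 in Paris."
print(refine_citations(statement, chunks, [0]))  # [(1, 2)]
```

In the real pipeline the filtering step then drops instances whose refined citations are too few or fail the quality check.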

------

Are you into AI and LLMs❓ Join me on Twitter with 31.7K others to stay on the bleeding edge every day.

𝕏/🐦 https://x.com/rohanpaul_ai
