Attention weights + dependency parsing = Better AI content attribution
Making LLMs show their work: A new way to trace generated text back to its source documents
This paper introduces a method to precisely trace generated content back to source documents by using attention weights and dependency parsing, making attribution more accurate and efficient.
https://arxiv.org/abs/2412.11404
Original Problem 🔍:
→ Existing approaches to attributing LLM-generated content to source documents are either coarse-grained or computationally expensive
→ Existing methods can't effectively incorporate contextual information after the target span, limiting their understanding of complete semantic relationships
Solution in this Paper 🛠️:
→ Instead of averaging hidden states, the method aggregates token-level evidence through set-union operations, preserving fine-grained information for each token (see the sketch after this list)
→ It enhances attribution by integrating dependency parsing to capture complete semantic relationships between tokens
→ The system uses attention weights as similarity metrics between response and source document tokens
→ For practical deployment, it reduces GPU memory usage and falls back to approximation when attention weights are not directly accessible
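A minimal sketch of the attention-plus-set-union idea, assuming per-token attention weights between response and source tokens are available. The function name `attribute_span`, the `top_k` cutoff, and the head/layer averaging are illustrative assumptions, not the paper's exact procedure.

```python
# Sketch: attention-based fine-grained attribution with set-union
# evidence aggregation. Assumes `attn` holds attention weights from the
# generating LLM, already averaged over heads/layers (an assumption).
import torch

def attribute_span(attn: torch.Tensor, span: range, top_k: int = 5) -> set[int]:
    """Return the union of the most-attended source-token indices
    for every response token in `span`.

    attn: (num_response_tokens, num_source_tokens) attention weights.
    span: indices of the response tokens whose evidence we want.
    """
    evidence: set[int] = set()
    for t in span:
        # Top-k source tokens for this response token (token-level evidence).
        top = torch.topk(attn[t], k=min(top_k, attn.size(1))).indices.tolist()
        # Set union keeps per-token detail instead of averaging hidden states.
        evidence |= set(top)
    return evidence

# Toy usage: 4 response tokens attending over 10 source tokens.
attn = torch.rand(4, 10).softmax(dim=-1)
print(attribute_span(attn, range(1, 3), top_k=3))
```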
Key Insights 💡:
→ Token-wise evidence aggregation preserves granular representation details better than averaging
→ Dependency parsing significantly improves attribution accuracy by capturing complete semantic units (see the parsing sketch below)
→ Attention weights provide faster computation compared to gradient-based approaches
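To illustrate how dependency parsing can make a target span semantically complete, here is a hedged sketch using spaCy; the parser and the exact expansion rule (adding each token's head and children) are assumptions for illustration, and may differ from the paper's implementation.

```python
# Sketch: expand a target span to include syntactically linked tokens,
# so attribution operates on a semantically complete unit.
# Requires: pip install spacy && python -m spacy download en_core_web_sm
import spacy

nlp = spacy.load("en_core_web_sm")

def expand_span(text: str, start: int, end: int) -> list[str]:
    """Expand token indices [start, end) with each token's head and children."""
    doc = nlp(text)
    keep = set(range(start, end))
    for tok in doc[start:end]:
        keep.add(tok.head.i)                    # governing word
        keep.update(c.i for c in tok.children)  # direct dependents
    return [doc[i].text for i in sorted(keep)]

# Toy usage: expand the single token "traces" to its syntactic neighborhood.
print(expand_span("The model traces answers back to source documents.", 2, 3))
```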
Results 📊:
→ Outperforms all baseline approaches in fine-grained attribution tasks
→ Achieves 93.3% accuracy on the QuoteSum dataset
→ Shows 84.6% accuracy on the VERI-GRAN dataset
→ Demonstrates significantly faster computation times compared to previous methods