"Multi-Document Financial Question Answering using LLMs"

The podcast on this paper is generated with Google's Illuminate.

This paper introduces two methods, RAG_SEM and KG_RAG, for answering complex financial questions across multiple documents. The methods combine semantic tagging and knowledge graphs with retrieval-augmented generation (RAG) to improve accuracy and enable multi-hop reasoning across 18 10-K reports from major tech companies.

-----

https://arxiv.org/abs/2411.07264

🤔 Original Problem:

Traditional RAG struggles with complex financial questions spanning multiple documents and years. It cannot effectively handle multi-hop reasoning or extract precise information from large document collections.

-----

🔧 Solution in this Paper:

→ RAG_SEM enhances retrieval by adding semantic tags such as entities, dates, and industries to both questions and documents for better context matching.

→ KG_RAG builds knowledge graphs from documents using a distilled small model, enabling multi-hop reasoning across document boundaries.

→ The system processes 18 10-K reports (6 tech companies, 3 years each) to answer 111 complex financial questions.

→ Knowledge graphs provide a structured representation of financial facts as subject-predicate-object triples.
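
To make the triple representation and multi-hop idea concrete, here is a minimal Python sketch. It is not the paper's implementation: the company and person names, the predicate names, the figures, and the helpers `build_graph` and `follow` are all hypothetical, chosen only to show how facts stored as subject-predicate-object triples can be chained across hops.

```python
from collections import defaultdict

# Hypothetical triples for illustration only; the names and figures are
# placeholders and do not come from any real filing or from the paper.
TRIPLES = [
    ("CompanyA", "reported_revenue_2022", "100"),
    ("CompanyA", "reported_revenue_2023", "120"),
    ("CompanyA", "has_ceo", "PersonX"),
    ("PersonX", "joined_in", "2019"),
]


def build_graph(triples):
    """Index triples by subject so each node maps to its (predicate, object) edges."""
    graph = defaultdict(list)
    for subject, predicate, obj in triples:
        graph[subject].append((predicate, obj))
    return graph


def follow(graph, start, predicates):
    """Follow a chain of predicates from `start`; return the final node, or None."""
    node = start
    for predicate in predicates:
        matches = [o for p, o in graph.get(node, []) if p == predicate]
        if not matches:
            return None
        node = matches[0]  # Assumption: take the first match for simplicity.
    return node


graph = build_graph(TRIPLES)

# Two-hop question: "When did CompanyA's CEO join?"
ceo_join_year = follow(graph, "CompanyA", ["has_ceo", "joined_in"])

# Cross-year question: combine two single-hop facts into a growth rate.
rev_2022 = float(follow(graph, "CompanyA", ["reported_revenue_2022"]))
rev_2023 = float(follow(graph, "CompanyA", ["reported_revenue_2023"]))
growth = rev_2023 / rev_2022 - 1
```

In a full system, the triples would be extracted from the filings by a model and the traversal would feed retrieved facts to an LLM; this sketch only shows the data structure and the hop-by-hop lookup.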

-----

💡 Key Insights:

→ Semantic tagging significantly improves context retrieval accuracy

→ Knowledge graphs enable answering complex multi-hop questions

→ Small distilled models can efficiently construct knowledge graphs

→ The method scales horizontally across domains and industries
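
The semantic tagging idea behind RAG_SEM can be sketched in a few lines of Python. This is an illustrative assumption, not the paper's method: the `Chunk` class, the tag sets, and the `tag_overlap` and `retrieve` helpers are hypothetical, and the ranking uses tag overlap alone, whereas a real pipeline would presumably combine tags with embedding similarity.

```python
from dataclasses import dataclass, field


@dataclass
class Chunk:
    """A document chunk with hypothetical semantic tags (entity, date, topic)."""
    text: str
    tags: set = field(default_factory=set)


def tag_overlap(question_tags, chunk):
    """Count how many tags the question and the chunk share."""
    return len(question_tags & chunk.tags)


def retrieve(question_tags, chunks, k=2):
    """Return the k chunks with the highest tag overlap (stable for ties)."""
    ranked = sorted(chunks, key=lambda c: tag_overlap(question_tags, c), reverse=True)
    return ranked[:k]


# Placeholder chunks; the text and tags are invented for illustration.
chunks = [
    Chunk("CompanyA 2023 revenue discussion", {"CompanyA", "2023", "revenue"}),
    Chunk("CompanyB 2023 risk factors", {"CompanyB", "2023", "risk"}),
    Chunk("CompanyA 2021 revenue discussion", {"CompanyA", "2021", "revenue"}),
]

# The question is tagged the same way as the documents.
question_tags = {"CompanyA", "2023", "revenue"}
top = retrieve(question_tags, chunks, k=2)
```

Tagging both the question and the chunks with the same vocabulary lets the retriever prefer the chunk about the right company and year over one that merely shares the year.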

-----

📊 Results:

→ Both methods outperform vanilla RAG on all 9 metrics

→ KG_RAG achieves 85% relevance, 83% correctness, 83% faithfulness

→ KG_RAG outperforms RAG_SEM in 4 out of 9 metrics
