This paper introduces two methods, RAG_SEM and KG_RAG, for answering complex financial questions across multiple documents. They combine semantic tagging and knowledge graphs with retrieval-augmented generation (RAG) to improve accuracy and enable multi-hop reasoning across 18 10-K reports from major tech companies.
-----
https://arxiv.org/abs/2411.07264
🤔 Original Problem:
Traditional RAG struggles with complex financial questions spanning multiple documents and years. It cannot effectively handle multi-hop reasoning or extract precise information from large document collections.
-----
🔧 Solution in this Paper:
→ RAG_SEM enhances retrieval by adding semantic tags (such as entities, dates, and industries) to both questions and documents for better context matching.
→ KG_RAG builds knowledge graphs from documents using a distilled small model, enabling multi-hop reasoning across document boundaries.
→ The system processes 18 10-K reports (6 tech companies over 3 years) to answer 111 complex financial questions.
→ Knowledge graphs provide a structured representation of financial facts as subject-predicate-object triples.
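
To make the triple representation concrete, here is a minimal Python sketch of how subject-predicate-object triples can support a two-hop lookup. It is an illustration under stated assumptions, not the paper's implementation: the company name, predicate names, and figures are hypothetical placeholders, and the `build_index` and `multi_hop` helpers are invented for this example.

```python
from collections import defaultdict

# Hypothetical triples; names and figures are placeholders, not data from any filing.
triples = [
    ("CompanyA", "reported_revenue_2022", "100"),
    ("CompanyA", "reported_revenue_2023", "120"),
    ("CompanyA", "has_segment", "Cloud"),
    ("Cloud", "segment_revenue_2023", "45"),
]

def build_index(triples):
    """Map each subject to its outgoing (predicate, object) edges."""
    index = defaultdict(list)
    for subject, predicate, obj in triples:
        index[subject].append((predicate, obj))
    return index

def multi_hop(index, start, predicates):
    """Follow a chain of predicates from `start` and return the nodes reached."""
    frontier = [start]
    for wanted in predicates:
        frontier = [
            obj
            for node in frontier
            for predicate, obj in index.get(node, [])
            if predicate == wanted
        ]
    return frontier

index = build_index(triples)
# Two hops: company -> segment -> segment revenue.
print(multi_hop(index, "CompanyA", ["has_segment", "segment_revenue_2023"]))  # ['45']
```

Each hop only follows edges that already exist in the graph, which is why facts extracted from different documents can be chained once they share a node.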
-----
💡 Key Insights:
→ Semantic tagging significantly improves context retrieval accuracy
→ Knowledge graphs enable answering complex multi-hop questions
→ Small distilled models can efficiently construct knowledge graphs
→ According to the authors, the method scales horizontally across domains and industries
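
The semantic-tagging idea can be sketched in a few lines of Python. This is a simplified illustration, not the paper's pipeline: it ranks chunks purely by tag overlap (Jaccard similarity), whereas a real system would likely combine tags with embedding-based retrieval. The chunk ids, tags, and the `tag_overlap` and `rank_chunks` helpers are all hypothetical.

```python
def tag_overlap(question_tags, chunk_tags):
    """Jaccard similarity between two sets of semantic tags."""
    q, c = set(question_tags), set(chunk_tags)
    if not q or not c:
        return 0.0
    return len(q & c) / len(q | c)

def rank_chunks(question_tags, chunks, top_k=2):
    """Return the top_k chunks whose tags best match the question's tags."""
    ranked = sorted(
        chunks,
        key=lambda chunk: tag_overlap(question_tags, chunk["tags"]),
        reverse=True,
    )
    return ranked[:top_k]

# Hypothetical tagged chunks (entity, year, topic).
chunks = [
    {"id": "c1", "tags": {"CompanyA", "2023", "revenue"}},
    {"id": "c2", "tags": {"CompanyA", "2022", "revenue"}},
    {"id": "c3", "tags": {"CompanyB", "2023", "risk"}},
]

question_tags = {"CompanyA", "2023", "revenue"}
top = rank_chunks(question_tags, chunks)
print([chunk["id"] for chunk in top])  # ['c1', 'c2']
```

Tagging both the question and the chunks with the same vocabulary lets retrieval prefer the right company and year even when the surrounding text is nearly identical across filings.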
-----
📊 Results:
→ Both methods outperform vanilla RAG on all 9 metrics
→ KG_RAG achieves 85% relevance, 83% correctness, 83% faithfulness
→ KG_RAG outperforms RAG_SEM in 4 out of 9 metrics