Knowledge-Aware Query Expansion with Large Language Models for Textual and Relational Retrieval

LLMs now understand both text and relationships in documents by using knowledge graphs as a bridge.

Nov 14, 2024

LLMs now understand both text and relationships in documents by using knowledge graphs as a bridge.

Knowledge-Aware Retrieval (KAR) framework, proposed in this paper, teaches LLMs to see connections between documents, not just words.

Original Problem 🎯:

Current query expansion methods focus mainly on textual similarities, overlooking important document relations in semi-structured search queries. This leads to suboptimal search results when users have both textual and relational requirements.

Solution in this Paper 🛠️:

• Introduces Knowledge-Aware Retrieval (KAR) framework that augments LLMs with structured document relations from knowledge graphs

• Uses document texts as rich knowledge graph node representations

• Implements document-based relation filtering to extract query-focused relations

• Employs two-step LLM inference: first for entity parsing, then for knowledge-aware expansion

• Constructs document triples combining textual and relational information

Key Insights from this Paper 💡:

• Document texts provide richer node representations than entity names alone

• Structured document relations improve query expansion quality

• Two LLM inferences per query maintain reasonable latency

• Knowledge graphs effectively ground LLM generations to domain-specific facts

Results 📊:

• Outperforms state-of-the-art baselines on AMAZON, MAG, and PRIME datasets

• Achieves better or comparable performance to supervised methods

• Shows consistent improvement across different retrievers (BM25, dense embeddings)

• Maintains practical latency compared to other LLM-based approaches

• Demonstrates effectiveness with both GPT-4 and LLaMA-3.1-8B

💡 The knowledge-aware query expansion framework called KAR works by:

Using LLMs to parse entities from queries
Retrieving relevant document texts
Propagating through knowledge graph relations
Filtering relations using document-based scoring
Constructing document triples
Generating knowledge-aware query expansions

Rohan's Bytes

Discussion about this post