
"Human Evaluation of Procedural Knowledge Graph Extraction from Text with Large Language Models"

The podcast for this paper was generated with Google's Illuminate.

Two-stage Chain-of-thought prompting transforms instruction manuals into queryable knowledge graphs

This paper proposes using LLMs to convert unstructured procedural knowledge from instruction manuals into structured Knowledge Graphs through a two-stage prompting approach, and evaluates how humans perceive LLM-extracted knowledge compared to human annotations.

-----

https://arxiv.org/abs/2412.03589

Original Problem 🎯:

Procedural knowledge in manuals exists as unstructured text, making access and execution difficult. Converting this into structured Knowledge Graphs can enable better digital tools for users.

-----

Solution in this Paper ⚙️:

→ Implements two-stage chain-of-thought prompting with LLMs

→ The first prompt, framed with an expert role, extracts steps, actions, objects, equipment, and temporal information

→ The second prompt converts the extracted data into an RDF graph that follows a predefined ontology

→ One-shot examples guide the LLM's extraction process

→ Steps may be flexibly rephrased rather than extracted verbatim (see the sketch below)

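The two-stage pipeline can be sketched roughly as follows. This is a minimal illustration, not the paper's actual implementation: the prompt wording, the `call_llm()` helper, and the `ex:` ontology terms are assumptions standing in for whatever model API and ontology the authors used.

```python
# Minimal sketch of the two-stage prompting pipeline described above.
# call_llm(), the prompt texts, and the ex: ontology terms are illustrative
# placeholders, not the paper's exact artifacts.

def call_llm(prompt: str) -> str:
    """Placeholder for any chat-completion API (remote or local model)."""
    raise NotImplementedError

STAGE1_PROMPT = """You are an expert technical writer.
From the manual below, list each step with its action, objects,
equipment, and any temporal constraints. You may rephrase steps
for clarity rather than copying them verbatim.

Example:
{one_shot_example}

Manual:
{manual_text}
"""

STAGE2_PROMPT = """Convert the extracted steps below into RDF triples
in Turtle syntax, using this ontology: each step is an ex:Step, linked
by ex:hasAction, ex:hasObject, ex:requiresEquipment, and ex:precedes
for temporal order.

Extracted steps:
{structured_steps}
"""

def manual_to_kg(manual_text: str, one_shot_example: str) -> str:
    """Run both prompting stages and return an RDF (Turtle) string."""
    # Stage 1: extract procedural elements as intermediate structured text.
    structured_steps = call_llm(
        STAGE1_PROMPT.format(one_shot_example=one_shot_example,
                             manual_text=manual_text)
    )
    # Stage 2: serialize the intermediate output as an RDF graph.
    return call_llm(STAGE2_PROMPT.format(structured_steps=structured_steps))
```

Splitting extraction from RDF serialization keeps each prompt short and, presumably, lets the intermediate step list be inspected before it is committed to the graph.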
-----

Key Insights 💡:

→ No single ground truth exists for procedural knowledge annotation

→ LLMs perform comparably to human annotators in knowledge extraction

→ Human annotators tend to deviate from the annotation instructions, while LLMs follow them consistently

→ Both LLMs and humans benefit from rephrasing flexibility

-----

Results 📊:

→ Human evaluators gave LLM extractions a median quality score of 4/5

→ Usefulness ratings averaged 3/5 across evaluations

→ Quality was consistent across different procedures

→ Evaluators showed a slight bias against LLM annotations compared to human ones
