
"Evaluating and Improving Graph to Text Generation with Large Language Models"

The podcast below was generated with Google's Illuminate.

The paper addresses LLMs' difficulty with graph-to-text generation.

It introduces PlanGTG, a new dataset with reordering and attribution sub-tasks, to enhance LLMs' planning and faithfulness in this task. Fine-tuning LLMs on PlanGTG improves text generation quality compared to existing methods.

-----

Paper - https://arxiv.org/abs/2501.14497

Original Problem:

→ Current LLMs show limited ability to interpret graph structures for text generation.

→ LLMs struggle to plan the output when graphs contain many triplets or have small diameters; see the linearization sketch after this list.

→ Existing LLM-based graph-to-text methods offer only incremental improvements and remain limited on complex graphs.
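
To make the planning difficulty concrete, here is a minimal sketch of how a knowledge graph is typically linearized into an LLM prompt. The triplets and prompt wording are illustrative, not the paper's exact format; the point is that the model must decide on its own in what order to verbalize the facts.

```python
# Minimal sketch: linearizing knowledge-graph triplets into a flat LLM prompt.
# The example triplets and prompt wording are illustrative only.

triplets = [
    ("Alan_Turing", "field", "Computer_Science"),
    ("Alan_Turing", "birthPlace", "London"),
    ("London", "country", "United_Kingdom"),
]

def linearize(triplets):
    """Render each (subject, relation, object) triplet as one '(s | r | o)' line."""
    return "\n".join(f"({s} | {r} | {o})" for s, r, o in triplets)

prompt = (
    "Convert the following knowledge-graph triplets into fluent text:\n"
    + linearize(triplets)
    + "\nText:"
)
print(prompt)
```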

-----

Solution in this Paper:

→ This paper introduces PlanGTG, a new graph-to-text dataset.

→ PlanGTG includes two sub-tasks: reordering knowledge graph triplets and attribution of triplets in the generated text.

→ The dataset creation process involves seed data preparation, sequential graph-text pair generation, and parallel attribution annotation using GPT-3.5-turbo.

→ PlanGTG aims to improve LLMs' ability to plan over graph sequences and to ground the generated text in the input triplets.

→ LLMs are fine-tuned on PlanGTG with instructions to perform the reordering and attribution sub-tasks alongside text generation, as sketched below.
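
A hedged sketch of what such a training instance might look like. The field names, instruction wording, and attribution markup here are assumptions for illustration; the paper defines the actual format.

```python
# Hypothetical PlanGTG-style training instance combining the reordering and
# attribution sub-tasks with generation. Field names, instruction wording,
# and the [index] attribution markup are assumptions, not the paper's format.

shuffled_triplets = [
    ("London", "country", "United_Kingdom"),
    ("Alan_Turing", "birthPlace", "London"),
]

instance = {
    "instruction": (
        "Reorder the triplets into a natural narrative order, then write a "
        "faithful text, tagging each fact with the index of its triplet."
    ),
    "input": [f"({s} | {r} | {o})" for s, r, o in shuffled_triplets],
    # Target output: the reordered triplet sequence plus attributed text.
    "output": (
        "Reordered: (Alan_Turing | birthPlace | London) "
        "(London | country | United_Kingdom)\n"
        "Text: Alan Turing was born in London [1], a city located in the "
        "United Kingdom [2]."
    ),
}
print(instance["output"])
```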

-----

Key Insights from this Paper:

→ Detailed prompts are necessary to unlock LLMs' graph-to-text generation capabilities.

→ Selecting moderately difficult and diverse examples for few-shot learning yields better performance, though improvements are marginal.

→ Increasing the number of few-shot examples does not consistently improve performance.

→ Fine-tuning LLMs on PlanGTG, especially with reordering and attribution instructions, significantly enhances graph-to-text generation.

→ Curriculum learning, training from simple to complex graphs, further improves model performance; a sketch follows this list.
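
A minimal sketch of that curriculum idea. Using triplet count as the complexity proxy is an assumption made for illustration.

```python
# Minimal sketch of curriculum learning for graph-to-text fine-tuning:
# order training examples from simple to complex graphs. Triplet count as
# the complexity measure is an assumption for this illustration.

def curriculum_order(dataset):
    """Sort graph-text pairs so the model sees small graphs first."""
    return sorted(dataset, key=lambda ex: len(ex["triplets"]))

dataset = [
    {"triplets": [("a", "r", "b")] * 7, "text": "..."},
    {"triplets": [("a", "r", "b")] * 2, "text": "..."},
    {"triplets": [("a", "r", "b")] * 4, "text": "..."},
]

for example in curriculum_order(dataset):
    print(len(example["triplets"]), "triplets")  # prints 2, 4, 7
```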

-----

Results:

→ Models fine-tuned on PlanGTG achieve up to a 5.99-point BLEU increase and a 0.3-point BARTScore increase over zero-shot LLMs.

→ PlanGTG fine-tuned models outperform models fine-tuned on EventNarrative, GraphNarrative, and TEKGEN datasets in zero-shot generalization.

→ With full-shot fine-tuning on WebNLG17, models first fine-tuned on PlanGTG outperform direct fine-tuning by 0.57 BLEU on seen data and 1.64 BLEU on unseen data.

→ In human evaluation, PlanGTG-trained models produce fewer hallucinated entities (0.36) and relations (0.42) than models trained on the other datasets.
