The paper addresses LLMs' struggles with graph-to-text generation.
It introduces PlanGTG, a new dataset with reordering and attribution sub-tasks, to enhance LLMs' planning and faithfulness on this task. Fine-tuning LLMs on PlanGTG improves generation quality over existing methods.
-----
Paper - https://arxiv.org/abs/2501.14497
Original Problem:
→ Current LLMs show limited ability in interpreting graph structures for text generation.
→ LLMs struggle with planning when graphs have numerous triplets and small diameters.
→ Existing methods for graph-to-text generation with LLMs offer only incremental improvements and remain limited in handling complex graphs.
-----
Solution in this Paper:
→ This paper introduces PlanGTG, a new graph-to-text dataset.
→ PlanGTG includes two sub-tasks: reordering knowledge graph triplets and attribution of triplets in the generated text.
→ The dataset creation process involves seed data preparation, sequential graph-text pair generation, and parallel attribution annotation using GPT-3.5-turbo.
→ PlanGTG aims to improve LLMs' ability to plan over graph sequences and to ground the generated text in the source triplets.
→ LLMs are fine-tuned on PlanGTG with instructions to perform the reordering and attribution sub-tasks alongside text generation, as sketched below.
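To make the combined task concrete, here is a minimal sketch of what one PlanGTG-style training example could look like; the field names, prompt wording, and triplets are illustrative assumptions, not the paper's exact format.

```python
# Hypothetical PlanGTG-style instruction-tuning example combining the
# reordering and attribution sub-tasks with text generation.
# Field names, prompt wording, and triplets are illustrative assumptions.
example = {
    "instruction": (
        "Reorder the triplets into a natural narrative order, then write a "
        "faithful description. After each sentence, cite the indices of the "
        "triplets it is grounded in, e.g. [1][3]."
    ),
    "input": [
        ("Alan_Turing", "birthPlace", "London"),        # triplet 1
        ("Alan_Turing", "field", "Computer_Science"),   # triplet 2
        ("London", "country", "United_Kingdom"),        # triplet 3
    ],
    "output": (
        "Reordered: 1, 3, 2. "
        "Alan Turing was born in London [1], a city in the United Kingdom [3]. "
        "He worked in the field of computer science [2]."
    ),
}
```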
-----
Key Insights from this Paper:
→ Detailed prompts are necessary to unlock LLMs' graph-to-text generation capabilities.
→ Selecting moderately difficult and diverse examples for few-shot learning yields better performance, though improvements are marginal.
→ Increasing the number of few-shot examples does not consistently improve performance.
→ Fine-tuning LLMs on PlanGTG, especially with reordering and attribution instructions, significantly enhances graph-to-text generation.
→ Curriculum learning, training from simple to complex graphs, further improves model performance (see the sketch below).
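A minimal sketch of such a curriculum schedule, assuming graph complexity is proxied by triplet count; the data format and difficulty proxy are my assumptions, not necessarily the paper's.

```python
# Curriculum-learning sketch: fine-tune on simple graphs (few triplets)
# before complex ones. The "triplets" field and the triplet-count
# difficulty proxy are illustrative assumptions.
def curriculum_order(examples):
    """Order graph-to-text examples from simplest to most complex graph."""
    return sorted(examples, key=lambda ex: len(ex["triplets"]))

train_set = [
    {"triplets": [("A", "r1", "B"), ("B", "r2", "C")], "text": "..."},
    {"triplets": [("A", "r1", "B")], "text": "..."},
]
for ex in curriculum_order(train_set):
    ...  # feed `ex` into the fine-tuning loop, easiest graphs first
```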
-----
Results:
→ PlanGTG fine-tuned models achieve up to a 5.99-point BLEU increase and a 0.3-point BARTScore increase over zero-shot LLMs.
→ PlanGTG fine-tuned models outperform models fine-tuned on EventNarrative, GraphNarrative, and TEKGEN datasets in zero-shot generalization.
→ In full-shot fine-tuning on WebNLG17, PlanGTG fine-tuned models outperform direct fine-tuning by 0.57 BLEU on seen data and 1.64 BLEU on unseen data.
→ Human evaluation shows PlanGTG-generated text has fewer hallucinated entities (0.36) and relations (0.42) than text from models trained on other datasets.