LLMs Can Plan Only If We Tell Them

The podcast below on this paper was generated with Google's Illuminate.

AoT+, an extension of Algorithm of Thoughts (AoT), lets LLMs plan autonomously by improving state tracking and simplifying prompt creation. The result is better long-horizon plans.

-----

https://arxiv.org/abs/2501.13545

Original Problem 😟:

→ LLMs struggle with autonomous, long-horizon planning, often requiring external feedback mechanisms.

→ Existing methods like Chain-of-Thought (CoT) are ineffective for non-ergodic planning, where mistakes can be irreversible.

→ Algorithm of Thoughts (AoT) improves on CoT, but still suffers from state hallucinations due to cognitive overload.

-----

Solution in this Paper 😎:

→ AoT+ builds upon AoT by introducing two key innovations.

→ Periodic Structured State Generation: The problem state is periodically regenerated and restated, reducing the LLM's cognitive load and improving state tracking.

→ Random Trajectory Augmentation: Random search trajectories augmented with correct solution steps simplify prompt creation, eliminating the need for human-designed heuristics.
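To make Periodic Structured State Generation concrete, here is a minimal sketch of what a Blocksworld in-context example with periodic state restatements might look like. The state encoding, the `every` interval, and the helper names (`apply_move`, `render_state`, `render_trace`) are all assumptions made for illustration; the paper's actual prompt format is not reproduced in this summary. In AoT+ the LLM itself writes the restated state, whereas this code only builds the kind of trace it would be shown.

```python
from typing import List, Tuple

# Hypothetical encoding: each stack is a list of block names, bottom block first.
State = List[List[str]]
Move = Tuple[str, str]  # (block, destination), destination is a block or "table"


def apply_move(state: State, move: Move) -> State:
    """Return a new state with `block` moved onto `dest`; raise on illegal moves."""
    block, dest = move
    stacks = [list(s) for s in state]
    # A block can only be moved if it is on top of some stack.
    src = next((s for s in stacks if s and s[-1] == block), None)
    if src is None:
        raise ValueError(f"{block} is not clear")
    if dest == "table":
        src.pop()
        stacks.append([block])
    else:
        # The destination block must also be on top of a different stack.
        dst = next((s for s in stacks if s and s[-1] == dest), None)
        if dst is None or dst is src:
            raise ValueError(f"{dest} is not clear")
        src.pop()
        dst.append(block)
    return [s for s in stacks if s]


def render_state(state: State) -> str:
    """Render the full state in a fixed, structured textual form."""
    return "; ".join(
        "[" + " on ".join(reversed(s)) + " on table]" for s in sorted(state)
    )


def render_trace(state: State, moves: List[Move], every: int = 2) -> str:
    """Write out the plan, restating the complete state every `every` steps."""
    lines = [f"State: {render_state(state)}"]
    for i, move in enumerate(moves, start=1):
        state = apply_move(state, move)
        lines.append(f"Step {i}: move {move[0]} onto {move[1]}")
        if i % every == 0:
            lines.append(f"State: {render_state(state)}")
    return "\n".join(lines)


if __name__ == "__main__":
    start = [["A", "B"], ["C"]]  # B on A, C alone on the table
    plan = [("B", "table"), ("A", "C"), ("B", "A")]
    print(render_trace(start, plan, every=2))
```

The point of the restated `State:` lines is that the model never has to reconstruct the current configuration from the whole history of moves; it can read it off the most recent restatement.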

-----

Key Insights from this Paper 🤔:

→ LLMs possess latent planning capabilities that can be activated through appropriate prompting.

→ State tracking and management are crucial for effective LLM planning.

→ Simplifying prompt creation and reducing reliance on human-designed heuristics can improve LLM performance and generalization.
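One way to picture how Random Trajectory Augmentation removes the need for hand-written heuristics: given a known correct solution, an example search trace can be produced by mixing randomly chosen wrong branches with the correct steps. The sketch below is an interpretation under stated assumptions, not the paper's procedure. The toy `graph`, the function name `augment_trajectory`, the `n_detours` parameter, and the "backtrack"/"continue" wording are all hypothetical.

```python
import random
from typing import Dict, List


def augment_trajectory(
    graph: Dict[str, List[str]],
    solution: List[str],
    n_detours: int = 1,
    seed: int = 0,
) -> List[str]:
    """Build an example search trace from random detours plus the correct steps.

    graph maps each state to its successor states (hypothetical toy domain).
    solution is a known correct path through the graph, start to goal.
    """
    rng = random.Random(seed)  # seeded so the generated prompt is reproducible
    trajectory: List[str] = []
    for current, correct_next in zip(solution, solution[1:]):
        # Detours are sampled uniformly at random: no human-designed heuristic
        # decides which wrong branches the example explores.
        wrong = [n for n in graph[current] if n != correct_next]
        for detour in rng.sample(wrong, k=min(n_detours, len(wrong))):
            trajectory.append(f"Try {current} -> {detour}: backtrack")
        # The correct step is always included, so the trace reaches the goal.
        trajectory.append(f"Try {current} -> {correct_next}: continue")
    return trajectory


if __name__ == "__main__":
    toy_graph = {"S": ["A", "B"], "A": ["G", "C"], "B": [], "C": [], "G": []}
    for line in augment_trajectory(toy_graph, ["S", "A", "G"]):
        print(line)
```

Because the only inputs are the successor relation and one correct solution, the same function can generate examples for a new domain without anyone designing a domain-specific search strategy.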

-----

Results 💪:

→ AoT+ outperforms or matches existing state-of-the-art methods, including those using external verifiers, across multiple benchmarks (Blocksworld, Logistics, List Functions, and ACRE).

→ AoT+ surpasses human performance (78%) in Blocksworld using GPT-4 and Claude, reaching 82% accuracy.

→ AoT+ uses far fewer tokens than LLM-Modulo, which requires more than three times as many tokens in total.
