"Evolutionary Pre-Prompt Optimization for Mathematical Reasoning"

The podcast on this paper is generated with Google's Illuminate.

Smart prompt evolution beats brute-force optimization for teaching LLMs complex math.

This paper introduces Evolutionary Pre-Prompt Optimization (EPPO), a method that enhances mathematical reasoning in LLMs by optimizing Chain-of-Thought pre-prompts with evolutionary algorithms. EPPO achieves 10% better exact-match scores on the GSM8k and MathQA benchmarks while providing theoretical guarantees against overfitting.

-----

https://arxiv.org/abs/2412.04291

🤔 Original Problem:

LLMs struggle with complex mathematical reasoning tasks despite their size. Current prompt optimization methods lack theoretical guarantees and often overfit on small training datasets.

-----

🔧 Solution in this Paper:

→ EPPO uses evolutionary algorithms to select optimal Chain-of-Thought examples as pre-prompts for mathematical reasoning tasks

→ The method requires only binary comparisons between pre-prompts, enabling information-theoretic generalization bounds

→ EPPO optimizes a small set of 2-16 examples that remain fixed for the entire downstream task

→ The algorithm employs comparison-based optimization to minimize overfitting risks

→ Integration with self-consistency voting further amplifies performance gains
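The comparison-driven evolutionary loop described above can be sketched as follows. This is a minimal (1+1)-style illustration, not the paper's actual implementation: the candidate pool, the `compare` oracle, and all names are hypothetical stand-ins. In EPPO, comparing two pre-prompts would mean running the LLM with each and checking which yields more exact matches on a small batch; here a hidden per-example quality score plays that role.

```python
import random

random.seed(0)

# Toy stand-in: a pool of candidate CoT examples, each with a hidden
# "usefulness" score the optimizer cannot observe directly.
POOL_SIZE = 32
K = 4  # pre-prompt size (the paper finds small k, e.g. 4-shot, works best)
hidden_quality = [random.random() for _ in range(POOL_SIZE)]

def compare(a, b):
    """Binary comparison oracle: True if pre-prompt `a` beats `b`.
    Stands in for running the LLM with each pre-prompt on a held-out
    mini-batch and comparing exact-match counts."""
    return sum(hidden_quality[i] for i in a) > sum(hidden_quality[i] for i in b)

def mutate(subset):
    """Swap one chosen example for one currently unused example."""
    child = list(subset)
    out_idx = random.randrange(K)
    unused = [i for i in range(POOL_SIZE) if i not in child]
    child[out_idx] = random.choice(unused)
    return child

# (1+1)-style evolutionary loop driven only by binary comparisons;
# no scalar fitness is ever exposed to the optimizer.
parent = random.sample(range(POOL_SIZE), K)
for _ in range(500):
    child = mutate(parent)
    if compare(child, parent):
        parent = child  # keep the winner

optimum = sum(sorted(hidden_quality, reverse=True)[:K])
achieved = sum(hidden_quality[i] for i in parent)
print(f"achieved quality: {achieved:.3f}, optimum: {optimum:.3f}")
```

Because the loop only ever consumes one bit of feedback per comparison, the information leaked from the training data is limited, which is what makes the information-theoretic generalization bounds possible.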

-----

💡 Key Insights:

→ 4-shot prompts perform better than 8-shot prompts due to reduced overfitting

→ Evolutionary optimization outperforms random search for prompt selection

→ Pre-prompts transfer well across different models and mathematical tasks

→ Limited data feedback helps prevent overfitting compared to gradient-based methods
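The self-consistency voting mentioned in the solution section combines with any fixed pre-prompt: sample several reasoning chains, parse out each final answer, and take the majority. A minimal sketch, where the `sample_answer` callable is a hypothetical stand-in for one stochastic LLM call:

```python
from collections import Counter

def self_consistency(sample_answer, n_samples=5):
    """Majority vote over n_samples sampled reasoning paths.

    `sample_answer` stands in for one stochastic LLM call that returns
    the final answer parsed from a generated chain of thought."""
    votes = Counter(sample_answer() for _ in range(n_samples))
    answer, _ = votes.most_common(1)[0]
    return answer

# Deterministic toy demo: five "sampled" answers; 42 wins 3 of 5 votes.
samples = iter([42, 17, 42, 42, 99])
result = self_consistency(lambda: next(samples), n_samples=5)
print(result)  # → 42
```

Majority voting suppresses occasional faulty reasoning chains, which is why it amplifies the gains from an already-optimized pre-prompt.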

-----

📊 Results:

→ 10% improvement in exact-match scores on GSM8k and MathQA

→ Successful transfer of optimized pre-prompts from 7B- to 70B-parameter models