"Does Few-Shot Learning Help LLM Performance in Code Synthesis?"

The podcast on this paper is generated with Google's Illuminate.

Code better with LLMs by picking smarter examples, not bigger models

This paper introduces methods to optimize code generation by selecting effective few-shot examples in prompts. The research demonstrates that carefully chosen examples significantly improve LLMs' coding abilities without modifying the model architecture or training process.

-----

https://arxiv.org/abs/2412.02906

🤔 Original Problem:

LLMs show impressive code generation capabilities, but prompt-level optimizations remain largely underexplored: current techniques rely on predefined prompt templates with only minimal modifications.

-----

🔧 Solution in this Paper:

→ The paper proposes two methods for selecting optimal few-shot examples: CODEEXEMPLAR-FREE and CODEEXEMPLAR-BASE.

→ CODEEXEMPLAR-FREE picks examples using perplexity-based metrics and requires no training data (see the sketch after this list).

→ CODEEXEMPLAR-BASE uses a neural network trained on bootstrapped data to select examples.

→ Both methods support arbitrary token cost constraints and work without accessing model weights.
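
To make the training-free selection idea concrete, here is a minimal Python sketch of greedy, perplexity-guided example selection under a token budget. This is an illustration of the general approach rather than the paper's exact algorithm; `score_perplexity` and `count_tokens` are hypothetical helpers standing in for whatever LLM log-probability and tokenizer calls are actually used.

```python
# Hedged sketch (not the paper's exact algorithm): greedily add the few-shot
# example that most lowers a perplexity-style score of the target problem,
# stopping when the token budget is exhausted.
from typing import Callable, List

def select_examples(
    candidates: List[str],      # candidate few-shot examples (problem + solution text)
    target_problem: str,        # the problem the LLM should solve
    token_budget: int,          # max tokens the chosen examples may occupy
    score_perplexity: Callable[[str], float],  # hypothetical: perplexity of a prompt under the LLM
    count_tokens: Callable[[str], int],        # hypothetical: tokenizer length of a string
) -> List[str]:
    chosen: List[str] = []
    used_tokens = 0
    remaining = list(candidates)
    while remaining:
        best, best_score = None, float("inf")
        for ex in remaining:
            if used_tokens + count_tokens(ex) > token_budget:
                continue  # this example would exceed the budget
            prompt = "\n\n".join(chosen + [ex, target_problem])
            score = score_perplexity(prompt)
            if score < best_score:
                best, best_score = ex, score
        if best is None:  # nothing else fits within the budget
            break
        chosen.append(best)
        used_tokens += count_tokens(best)
        remaining.remove(best)
    return chosen
```

A trained selector in the spirit of CODEEXEMPLAR-BASE would swap `score_perplexity` for a learned scoring model fit on bootstrapped selection data, while keeping the same budget-constrained loop.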

-----

💡 Key Insights:

→ Choice of few-shot examples significantly impacts coding performance across different LLMs

→ Complex input examples tend to be more informative than simple edge cases

→ Performance saturates after roughly 6 examples, with diminishing returns beyond that point

-----

📊 Results:

→ Both methods improved CODELLAMA's Pass@1 performance by ~5.7% on the HumanEval+ benchmark (the Pass@1 metric is illustrated after this list)

→ CODEEXEMPLAR-BASE showed better generalization across different prompts

→ Achieved these gains while staying within fixed token budgets
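
For context on the metric, the snippet below computes the standard unbiased pass@k estimator used in HumanEval-style evaluations (n samples generated, c of them correct). It is included only to make Pass@1 concrete and is not code from the paper.

```python
# Standard unbiased pass@k estimator from the HumanEval evaluation methodology.
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """n = total generated samples, c = samples passing all tests, k = draw budget.
    Returns the probability that at least one of k randomly drawn samples passes."""
    if n - c < k:
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# Example: 20 generations, 3 correct -> pass@1 = 0.15
print(pass_at_k(n=20, c=3, k=1))
```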
