MAPS guides different LLMs to generate more effective test cases.
MAPS automatically generates tailored prompts for different LLMs, improving test case generation quality and achieving higher code coverage through diversity-guided optimization and failure-driven learning.
https://arxiv.org/abs/2501.01329
Original Problem 🤔:
→ Current LLM-based test generation relies on basic, generic prompts, leading to suboptimal tests
→ Different LLMs perform best with different prompts, but manually designing a prompt for each LLM is time-consuming
→ Existing prompt optimization methods produce ineffective prompts because their candidates lack diversity and software-testing domain knowledge
-----
Solution in this Paper 🔧:
→ MAPS uses three key modules to generate LLM-tailored prompts
→ The Diversity-guided Prompt Generation module creates varied prompts by exploring multiple modification paths during optimization (see the first sketch after this list)
→ The Failure-driven Rule Induction module identifies common errors in generated tests and distills them into rules that prevent the same mistakes from recurring (a sketch follows the Key Insights section)
→ The Domain Contextual Knowledge Extraction module supplies both in-file and cross-file context so the LLM understands inheritance and invocation relationships in the code under test (see the second sketch after this list)
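
A minimal sketch of the diversity-guided idea, assuming hypothetical `llm` (prompt → generated tests) and `coverage` (tests → score) callables; the mutation list and beam width are illustrative, not the paper's exact operators:

```python
# Illustrative only: `llm` and `coverage` are assumed callables, and
# these mutations are invented examples of prompt modifications.
MUTATIONS = [
    lambda p: p + "\nCover boundary values and null inputs.",
    lambda p: p + "\nAssert on return values and raised exceptions.",
    lambda p: "Act as a senior test engineer.\n" + p,
]

def diversity_guided_search(seed_prompt, llm, coverage, rounds=5, beam=3):
    """Keep several modification paths alive each round instead of
    greedily refining a single prompt, so the search stays diverse."""
    frontier = [seed_prompt]
    for _ in range(rounds):
        candidates = {m(p) for p in frontier for m in MUTATIONS}
        # Score each candidate prompt by the coverage its tests achieve.
        frontier = sorted(candidates,
                          key=lambda p: coverage(llm(p)),
                          reverse=True)[:beam]
    return frontier[0]
```

And a rough stand-in for the context-extraction step. MAPS gathers in-file and cross-file relationships around the code under test; here Python's `ast` module plays that role for a single file, with the function name and returned fields being assumptions for illustration:

```python
import ast

def extract_in_file_context(source: str, target_method: str) -> dict:
    """Collect the enclosing class, its bases, and sibling methods for
    `target_method`, so the prompt can expose inheritance and
    invocation relationships around the code under test."""
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ClassDef):
            methods = [n.name for n in node.body
                       if isinstance(n, ast.FunctionDef)]
            if target_method in methods:
                return {"class": node.name,
                        "inherits": [ast.unparse(b) for b in node.bases],
                        "sibling_methods": methods}
    return {}
```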
-----
Key Insights 💡:
→ Different LLMs require different prompts for optimal performance
→ Adding domain context significantly improves test generation quality
→ Preventing recurring errors through induced rules is more effective than iterative refinement, as sketched below
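
A hedged sketch of the rule-induction idea, assuming failed tests arrive tagged with a `category` string; the categories and rule texts below are invented for illustration, not MAPS's actual rules:

```python
from collections import Counter

# Hypothetical mapping from failure categories to prompt rules; the
# real rules in MAPS are induced from the errors actually observed.
RULE_TEMPLATES = {
    "missing_import": "Import every class and helper the test references.",
    "bad_assertion": "Assert on concrete expected values, not placeholders.",
    "uncompilable": "Emit tests that compile without manual edits.",
}

def induce_rules(failures, min_count=3):
    """Turn error categories that recur at least `min_count` times
    into rules, so the next prompt prevents them up front."""
    counts = Counter(f["category"] for f in failures)
    return [RULE_TEMPLATES[c] for c, n in counts.items()
            if n >= min_count and c in RULE_TEMPLATES]

def apply_rules(prompt, rules):
    # Prepend the induced rules once, rather than repeatedly fixing
    # the same mistakes through extra refinement iterations.
    if not rules:
        return prompt
    return "Follow these rules:\n- " + "\n- ".join(rules) + "\n\n" + prompt
```

Because the rules are baked into the prompt, each generation round starts ahead of known errors instead of spending extra LLM calls repairing them afterward.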
-----
Results 📊:
→ Outperforms baseline prompt-optimization methods with 6.19% higher line coverage
→ Achieves 5.03% higher branch coverage across different LLMs
→ Generates tailored prompts that outperform manually designed ones