CoqPilot, a plugin for LLM-based generation of proofs

VSCode plugin that makes Coq proof generation accessible.

Nov 08, 2024

VSCode plugin that makes Coq proof generation accessible.

Automates writing Coq proofs. It identifies proof holes marked with 'admit' tactic in Coq files and uses Large Language Models along with non-machine learning methods to generate proof candidates. The plugin automatically checks if generated proofs solve the given subgoals and replaces valid ones.

What is Coq proofs ?

In Coq, you can:

State mathematical theorems and definitions Write detailed, step-by-step proofs Have the computer verify that your proofs are logically correct

Coq proofs are particularly valuable in computer science and mathematics because they can be mechanically checked for correctness, eliminating human error in proof verification.

🎯 Original Problem:

Writing formal proofs in Coq is time-consuming and needs deep expertise. Current tools for automated proof generation either need complex setup or lack integration into development workflows.

🛠️ Solution in this Paper:

• Uses premise selection to pick relevant theorems from same file as context

• Implements few-shot prompting using selected theorems

• Multi-turn error fixing: Uses Coq's error messages to guide proof fixes

• Integrates multiple tools: Coq Language Server, LLMs, Tactician, CoqHammer

• Zero-setup experience with modular architecture for easy service addition

💡 Key Insights:

• Combining multiple models performs better than individual ones

• Context from same file significantly improves proof generation

• Error messages from Coq can guide proof fixing

• Modular design enables easy integration of new services

📊 Results:

• GPT-4 with CoqPilot proves 34% theorems vs 0% with plain GPT-4

• Combined effort of 4 models achieves 39% success rate

• Total success reaches 51% when combining with CoqHammer and Tactician

• Multi-round mechanism with depth 2 fixes additional proofs

Rohan's Bytes

Discussion about this post

Ready for more?