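Teaching ChatGPT to read code context makes it expand abbreviations like a pro. This paper shows how surrounding code, explicit marking of missed abbreviations, and post-condition checking close the gap with specialized tools.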
ChatGPT + smart context = state-of-the-art code abbreviation expansion without parsing headaches
https://arxiv.org/abs/2410.23866
🎯 Original Problem:
Source code abbreviations reduce readability and hinder maintenance. While several approaches exist to expand these abbreviations, none leverage LLMs, which have shown remarkable success in various software engineering tasks.
-----
🛠️ Methods discussed in this Paper:
→ Used ChatGPT to expand source code abbreviations through few-shot prompting
→ Enhanced context awareness by incorporating surrounding code (3 lines before/after)
→ Implemented iterative marking to identify missed abbreviations
→ Added post-condition checking to filter incorrect expansions
→ Built a system that matches state-of-the-art accuracy without requiring expensive code parsing
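To make the context step concrete, here is a minimal sketch of how a prompt could be assembled from a target line plus 3 lines before and after it. The function names, the prompt wording, and the example pairs are assumptions for illustration; the post does not give the paper's actual prompt.

```python
def surrounding_context(lines, target_index, radius=3):
    """Return the target line plus up to `radius` lines on each side."""
    start = max(0, target_index - radius)
    end = min(len(lines), target_index + radius + 1)
    return lines[start:end]


def build_prompt(lines, target_index, examples, radius=3):
    """Assemble a few-shot prompt (illustrative wording, not the paper's)."""
    # Each example is a hypothetical (abbreviation, expansion) pair.
    shots = "\n".join(f"{abbr} -> {full}" for abbr, full in examples)
    context = "\n".join(surrounding_context(lines, target_index, radius))
    return (
        "Expand the abbreviations in the identifiers of the target line.\n"
        f"Examples:\n{shots}\n"
        f"Code context:\n{context}\n"
        f"Target line:\n{lines[target_index]}\n"
    )


if __name__ == "__main__":
    code = [
        "int total = 0;",
        "for (int idx = 0; idx < msgs.size(); idx++) {",
        "    int len = msgs.get(idx).length();",
        "    total += len;",
        "}",
    ]
    print(build_prompt(code, 2, [("btn", "button"), ("cnt", "count")]))
```

Because the window is cut from raw text lines, nothing here needs the file to parse or compile, which is the property the post highlights.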
-----
💡 Key Insights:
→ ChatGPT initially performs poorly compared to specialized tools, with 28.2% lower precision
→ Surrounding code context outperforms both knowledge graphs and enclosing files
→ 53% of missed abbreviations get correctly expanded after explicit marking
→ Simple post-condition checking improves precision by 2 percentage points
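The marking and post-condition ideas can be sketched in a few lines. Both the `<<...>>` marker syntax and the subsequence rule below are assumptions chosen for illustration; the post does not say which markers or which post-condition the paper uses.

```python
import re


def mark_missed(line, missed, left="<<", right=">>"):
    """Wrap each missed identifier in markers before re-prompting."""
    if not missed:
        return line
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, missed)) + r")\b")
    return pattern.sub(lambda m: f"{left}{m.group(0)}{right}", line)


def is_subsequence(abbr, expansion):
    """Assumed post-condition: the abbreviation's characters appear,
    in order, inside the proposed expansion (case-insensitive)."""
    remaining = iter(expansion.lower())
    # `ch in remaining` consumes the iterator, so order is enforced.
    return all(ch in remaining for ch in abbr.lower())


def filter_expansions(candidates):
    """Keep only the expansions that satisfy the assumed post-condition."""
    return {a: e for a, e in candidates.items() if is_subsequence(a, e)}


if __name__ == "__main__":
    print(mark_missed("int cnt = idx + 1;", ["cnt"]))
    print(filter_expansions({"btn": "button", "msg": "button"}))
```

A rule this simple will reject some legitimate expansions (for example, numeric shorthand such as "2" for "to"), so it is best read as a cheap filter rather than a correctness guarantee.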
-----
📊 Results:
→ Base ChatGPT: 64% precision, 61% recall
→ With surrounding code: 89% precision, 87% recall
→ Final system matches tfExpander's 92% precision, 89% recall
→ Works even with compilation errors, unlike traditional approaches
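For readers who want to see how such figures are typically computed, here is a small sketch using the usual definitions: precision is correct expansions over expansions produced, and recall is correct expansions over abbreviations in the reference set. The toy data is made up, and the paper's exact counting convention is not stated in this post.

```python
def precision_recall(predicted, gold):
    """Compute (precision, recall) for abbreviation -> expansion maps."""
    correct = sum(1 for abbr, exp in predicted.items() if gold.get(abbr) == exp)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


if __name__ == "__main__":
    # Made-up reference expansions and tool output, for illustration only.
    gold = {"btn": "button", "cnt": "count", "idx": "index", "msg": "message"}
    predicted = {"btn": "button", "cnt": "content", "idx": "index"}
    p, r = precision_recall(predicted, gold)
    print(f"precision={p:.2f} recall={r:.2f}")
```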