LLMs can unmask anonymous authors with surprising accuracy through writing pattern analysis.
AIDBench evaluates LLMs' ability to identify the authors of anonymous texts, revealing privacy risks in systems like anonymous peer review.
-----
https://arxiv.org/abs/2411.13226
🔍 Original Problem:
Anonymous systems such as peer review depend on identity protection, but LLMs may undermine it by identifying authors from their writing patterns.
-----
🛠️ Solution in this Paper:
→ AIDBench introduces multiple datasets including research papers, emails, blogs, reviews, and articles to test LLM authorship identification
→ Implements two evaluation methods: one-to-one identification determines if two texts share an author, while one-to-many finds matching authors from multiple candidates
→ Develops a RAG-based pipeline to handle large-scale authorship identification when input exceeds model context windows
→ Uses topic-ignored prompts to focus on writing style rather than content
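The two evaluation modes and the topic-ignored prompting can be sketched as simple prompt builders. This is an illustrative reconstruction, not the paper's exact prompts; the function names and wording are assumptions.

```python
def one_to_one_prompt(text_a: str, text_b: str) -> str:
    """Topic-ignored prompt: do two texts share an author?"""
    return (
        "Ignore topic and subject matter; judge only writing style "
        "(syntax, punctuation, vocabulary, sentence rhythm).\n\n"
        f"Text A:\n{text_a}\n\n"
        f"Text B:\n{text_b}\n\n"
        "Were Text A and Text B written by the same author? Answer yes or no."
    )

def one_to_many_prompt(query_text: str, candidates: list[str]) -> str:
    """Topic-ignored prompt: which candidate matches the query's author?"""
    numbered = "\n".join(f"[{i}] {c}" for i, c in enumerate(candidates, 1))
    return (
        "Ignore topic; compare writing style only.\n\n"
        f"Query text:\n{query_text}\n\n"
        f"Candidate texts:\n{numbered}\n\n"
        "Which candidate was written by the query text's author? "
        "Reply with its number."
    )
```

The explicit "ignore topic" instruction is what steers the model toward stylistic signals rather than shared subject matter, which matters most in the cross-topic scenarios the benchmark tests.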
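The RAG-based pipeline's core idea is to pre-filter a large candidate pool down to a top-k shortlist that fits in the model's context window before prompting. A minimal sketch, assuming a simple bag-of-words cosine similarity as the retriever (the paper's actual retrieval components are not specified here):

```python
from collections import Counter
import math

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve_top_k(query: str, candidates: list[str], k: int = 5) -> list[int]:
    """Return indices of the k candidates most similar to the query,
    so only that shortlist is passed to the LLM for identification."""
    q = Counter(query.lower().split())
    scores = [(cosine(q, Counter(c.lower().split())), i)
              for i, c in enumerate(candidates)]
    return [i for _, i in sorted(scores, reverse=True)[:k]]
```

In the full pipeline the shortlist would then be fed into a one-to-many identification prompt; only the retrieval step is sketched here.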
-----
💡 Key Insights:
→ LLMs can identify authors significantly above random chance
→ GPT-4 shows superior authorship detection compared to other models
→ Cross-topic scenarios pose greater challenges for identification
→ First-author papers provide more reliable authorship signals
-----
📊 Results:
→ GPT-4 achieves 83.3% precision in two-author scenarios
→ RAG-based method improves identification accuracy by 15-20%
→ Performance drops in cross-topic scenarios, but remains above random baseline