"AIDBench: A benchmark for evaluating the authorship identification capability of large language models"

The podcast on this paper is generated with Google's Illuminate.

LLMs can unmask anonymous authors with surprising accuracy by analyzing their writing patterns.

AIDBench evaluates LLMs' ability to identify the authors of anonymous texts, revealing privacy risks in systems such as anonymous peer review.

-----

https://arxiv.org/abs/2411.13226

🔍 Original Problem:

Anonymous systems such as peer review rely on identity protection, but LLMs may compromise it by identifying authors through their writing patterns.

-----

🛠️ Solution in this Paper:

→ AIDBench introduces multiple datasets including research papers, emails, blogs, reviews, and articles to test LLM authorship identification

→ Implements two evaluation methods: one-to-one identification determines if two texts share an author, while one-to-many finds matching authors from multiple candidates

→ Develops a RAG-based pipeline to handle large-scale authorship identification when input exceeds model context windows

→ Uses topic-ignored prompts to focus on writing style rather than content
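The retrieve-then-judge idea behind the one-to-many pipeline can be sketched as follows. This is a minimal illustration, not the paper's implementation: the character-trigram vector is a stand-in for a real embedding model, the candidate names are invented, and the final LLM judgment step is only indicated in a comment.

```python
from collections import Counter
from math import sqrt


def char_ngrams(text, n=3):
    """Character-trigram counts as a crude stylistic embedding.
    (Illustrative stand-in for a real embedding model.)"""
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(cnt * b[k] for k, cnt in a.items() if k in b)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def retrieve_candidates(query_text, candidates, k=2):
    """One-to-many retrieval step: rank candidate texts by stylistic
    similarity to the query and keep the top-k. In a full RAG pipeline,
    only these top-k texts would be passed to the LLM for the final
    authorship judgment, keeping the input within the context window."""
    q = char_ngrams(query_text)
    scored = sorted(candidates.items(),
                    key=lambda kv: cosine(q, char_ngrams(kv[1])),
                    reverse=True)
    return [author for author, _ in scored[:k]]


# Hypothetical usage with invented candidate authors:
pool = {
    "alice": "deep networks generalize well despite overparameterization",
    "bob": "medieval trade routes shaped the economy of coastal towns",
}
shortlist = retrieve_candidates(
    "overparameterized deep networks still generalize surprisingly well",
    pool, k=1)
# shortlist now holds the stylistically closest candidate author(s);
# an LLM prompt comparing the query against them would make the final call.
```

The retrieval stage is what lets the method scale past the model's context window: similarity search prunes the candidate pool cheaply, and the LLM only ever sees a short, ranked shortlist.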

-----

💡 Key Insights:

→ LLMs can identify authors significantly above random chance

→ GPT-4 shows superior authorship detection compared to other models

→ Cross-topic scenarios pose greater challenges for identification

→ First-author papers provide more reliable authorship signals

-----

📊 Results:

→ GPT-4 achieves 83.3% precision in two-author scenarios

→ RAG-based method improves identification accuracy by 15-20%

→ Performance drops in cross-topic scenarios, but remains above random baseline
