A new dataset and automated pipeline simplify large-scale meme analysis.
This paper introduces ClassicMemes-50-templates (CM50), a dataset of more than 33,000 memes, and an automated LLM-based annotation pipeline for meme analysis and retrieval.
-----
Paper - https://arxiv.org/abs/2501.13851
Original Problem 🤔:
→ Existing meme research lacks deeper comprehension and retrieval methods.
→ Current datasets are limited in scope or size, hindering large-scale analysis.
-----
Solution in this Paper 💡:
→ This study introduces CM50, a dataset of 33,172 memes based on 50 popular templates.
→ It proposes an automated annotation pipeline leveraging LLMs, specifically GPT-4o.
→ This pipeline generates image captions, meme captions, and literary device labels, simplifying data annotation and analysis.
→ A meme-text retrieval CLIP model (mtrCLIP) is proposed to enhance meme analysis through cross-modal embedding.
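To make the cross-modal embedding idea concrete, here is a minimal sketch of text-to-meme retrieval by cosine similarity. It is not the paper's mtrCLIP code: the three-dimensional vectors, the meme names, and the `cosine` and `retrieve` helpers are all made up for illustration, standing in for embeddings that a CLIP-style image encoder and text encoder would produce in a shared space.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def retrieve(query_vec, candidates):
    """Return candidate keys ranked by similarity to the query, best first."""
    return sorted(
        candidates,
        key=lambda name: cosine(query_vec, candidates[name]),
        reverse=True,
    )

# Hypothetical meme (image) embeddings; real ones would come from an image encoder.
meme_embeddings = {
    "drake_meme": [0.9, 0.1, 0.0],
    "distracted_boyfriend": [0.1, 0.8, 0.3],
    "doge": [0.0, 0.2, 0.9],
}

# Hypothetical embedding of a text query; a text encoder would produce this.
text_query = [0.2, 0.9, 0.2]

ranking = retrieve(text_query, meme_embeddings)
print(ranking)
```

Because both modalities live in the same vector space, the same `retrieve` call works in the other direction (meme-to-text) by swapping which side supplies the query and which supplies the candidates.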
-----
Key Insights from this Paper 💎:
→ LLMs like GPT-4o can automate meme annotation with near-human accuracy for captions and embedded text.
→ Template context improves LLM performance in labeling literary devices.
→ Fine-tuning CLIP for meme-text retrieval improves retrieval accuracy by up to 11.2%.
-----
Results 📊:
→ mtrCLIP achieves Recall@1 of 0.760 for meme-to-text and 0.780 for text-to-meme on MemeCap.
→ The automated pipeline with GPT-4o achieves a BLEURT score of 0.525 on MemeCap for meme captions, comparable to human-level performance.
→ Literary device labeling on FigMemes reaches a macro F1-score of 0.39, leaving clear room for improvement.
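For readers unfamiliar with the metrics above, the sketch below shows how Recall@K and macro F1 are typically computed. It is an illustration under stated assumptions, not the paper's evaluation code: the similarity matrix and the label lists are toy values, the correct match for query `i` is assumed to be candidate `i`, and the macro F1 shown is the single-label version (the paper's labeling setup may be multi-label).

```python
def recall_at_k(similarity, k=1):
    """Fraction of queries whose correct candidate appears in the top k.

    similarity[i][j] is the score of query i against candidate j.
    Assumption: the correct candidate for query i has index i.
    """
    hits = 0
    for i, row in enumerate(similarity):
        top_k = sorted(range(len(row)), key=lambda j: row[j], reverse=True)[:k]
        if i in top_k:
            hits += 1
    return hits / len(similarity)

def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (single-label case)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)

# Toy similarity matrix: 3 queries x 3 candidates (made-up scores).
sim = [
    [0.9, 0.1, 0.2],  # query 0 ranks candidate 0 first: hit at k=1
    [0.3, 0.4, 0.8],  # query 1 ranks candidate 2 first: miss at k=1
    [0.2, 0.1, 0.7],  # query 2 ranks candidate 2 first: hit at k=1
]
r_at_1 = recall_at_k(sim, k=1)
r_at_2 = recall_at_k(sim, k=2)

# Toy literary-device labels (illustrative names only).
y_true = ["irony", "irony", "sarcasm", "contrast"]
y_pred = ["irony", "sarcasm", "sarcasm", "irony"]
f1 = macro_f1(y_true, y_pred)

print(r_at_1, r_at_2, round(f1, 4))
```

Macro averaging weights every class equally, so a rare literary device that the model never predicts correctly pulls the score down as much as a common one would. That is one reason macro F1 on a many-class labeling task can sit well below plain accuracy.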