BanFakeNews-2.0, an expanded Bangla fake news dataset, improves detection using LLMs and transformer models, addressing data scarcity in low-resource languages.
https://arxiv.org/abs/2501.09604
Solution in this Paper 💡:
→ The paper introduces BanFakeNews-2.0, a significantly larger dataset with 60,000 news articles (47,000 authentic, 13,000 fake).
→ It covers 13 categories and includes a manually curated, independent test set of 1,000 articles.
→ The study uses transformer-based models, including fine-tuned BERT (Bidirectional Encoder Representations from Transformers) variants, and LLMs fine-tuned with Quantized Low-Rank Adaptation (QLoRA).
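The QLoRA setup mentioned above can be sketched roughly as follows, assuming the Hugging Face `transformers` and `peft` libraries; the checkpoint name and hyperparameters here are illustrative placeholders, not the paper's exact configuration:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# 4-bit NF4 quantization of the base model weights (the "Quantized" in QLoRA)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# Low-rank adapter matrices (the "Low-Rank Adaptation" in QLoRA);
# rank and alpha are common defaults, not the paper's reported values
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)

# Hypothetical base checkpoint; the paper fine-tunes LLMs for Bangla
model = AutoModelForCausalLM.from_pretrained(
    "some-multilingual-llm",  # placeholder name, swap in a real checkpoint
    quantization_config=bnb_config,
)
model = get_peft_model(model, lora_config)  # only the adapters stay trainable
```

The design point is memory: the frozen base model is held in 4-bit precision while gradients flow only through the small adapter matrices, which is what makes LLM fine-tuning feasible in low-resource settings.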
-----
Key Insights from this Paper 🔑:
→ LLMs and transformer models outperform traditional methods in Bangla fake news detection when trained on diverse datasets.
→ Character-level features are effective in capturing nuanced patterns in Bangla fake news.
→ Data diversity and balance are crucial for robust fake news detection.
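The character-level insight can be illustrated with a small, dependency-free sketch: extracting overlapping character n-grams, the kind of features that (after e.g. TF-IDF weighting) are fed to an SVM classifier. The function name and toy English input are illustrative; a real pipeline would operate on Bangla text:

```python
from collections import Counter

def char_ngrams(text, n=3):
    """Count overlapping character n-grams (here trigrams) in text.

    Character-level features capture orthographic and morphological cues
    that word-level tokenizers miss, which helps for a morphologically
    rich language like Bangla.
    """
    text = text.strip()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))

# Toy example; counts like these would be vectorized and passed to an SVM
feats = char_ngrams("breaking news", n=3)
print(feats["new"])  # → 1 (from "news")
```
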
-----
Results 💯:
→ Fine-tuned BERT variants achieved an 87% F1-score.
→ LLMs fine-tuned with Quantized Low-Rank Adaptation (QLoRA) achieved an 89% F1-score.
→ Traditional methods using linguistic features with a Support Vector Machine achieved up to an 86% macro F1-score on BanFakeNews-2.0.