
"From Scarcity to Capability: Empowering Fake News Detection in Low-Resource Languages with LLMs"

Podcast on this paper generated with Google's Illuminate.

BanFakeNews-2.0, an expanded Bangla fake news dataset, improves detection using LLMs and transformer models, addressing data scarcity in low-resource languages.

https://arxiv.org/abs/2501.09604

Solution in this Paper 💡:

→ The paper introduces BanFakeNews-2.0, a significantly larger dataset with 60,000 news articles (47,000 authentic, 13,000 fake).

→ It covers 13 categories and includes a manually curated, independent test set of 1,000 articles.

→ The study uses transformer-based models, including fine-tuned BERT (Bidirectional Encoder Representations from Transformers) variants, and LLMs fine-tuned with Quantized Low-Rank Adaptation (QLoRA).
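
A minimal sketch of what QLoRA-style fine-tuning for binary fake news classification could look like with Hugging Face transformers and peft. The base model name, adapter hyperparameters, and target modules here are illustrative assumptions, not the paper's exact configuration.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

MODEL = "meta-llama/Llama-2-7b-hf"  # assumed base LLM; the paper may use a different one

# 4-bit NF4 quantization of the frozen base weights (the "Q" in QLoRA)
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(MODEL)
tokenizer.pad_token = tokenizer.eos_token  # Llama-style models ship without a pad token

model = AutoModelForSequenceClassification.from_pretrained(
    MODEL,
    num_labels=2,  # fake vs. authentic
    quantization_config=bnb_config,
)
model.config.pad_token_id = tokenizer.pad_token_id
model = prepare_model_for_kbit_training(model)

# Small trainable low-rank adapters attached to the attention projections
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # assumed; depends on the base architecture
    task_type="SEQ_CLS",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the adapter matrices update during training
```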

-----

Key Insights from this Paper 🔑:

→ LLMs and transformer models outperform traditional methods in Bangla fake news detection when trained on diverse datasets.

→ Character-level features are effective at capturing nuanced patterns in Bangla fake news (see the sketch after this list).

→ Data diversity and balance are crucial for robust fake news detection.
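
A hedged sketch of the character-level idea using scikit-learn TF-IDF character n-grams with a linear SVM; the n-gram range and regularization strength are illustrative assumptions, not the paper's settings.

```python
from sklearn.pipeline import Pipeline
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC

clf = Pipeline([
    # Character n-grams (here 2-5, word-boundary aware) capture sub-word cues
    # such as spelling variants and suffix patterns that word features miss.
    ("tfidf", TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 5))),
    ("svm", LinearSVC(C=1.0)),
])

# texts: list of Bangla article strings; labels: 1 = fake, 0 = authentic
# clf.fit(texts, labels)
# preds = clf.predict(test_texts)
```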

-----

Results 💯:

→ Fine-tuned BERT variants achieved an 87% F1-score.

→ LLMs fine-tuned with QLoRA achieved an 89% F1-score.

→ Traditional methods using linguistic features with a Support Vector Machine (SVM) achieved up to an 86% macro F1-score on BanFakeNews-2.0.
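
For reference, macro F1 is the unweighted mean of the per-class F1 scores, so the minority fake class (13,000 of 60,000 articles) counts as much as the majority authentic class. A toy computation with scikit-learn (the labels below are made up):

```python
from sklearn.metrics import f1_score

y_true = [1, 0, 0, 1, 0, 0]  # toy gold labels: 1 = fake, 0 = authentic
y_pred = [1, 0, 1, 1, 0, 0]  # toy predictions
print(f1_score(y_true, y_pred, average="macro"))  # mean of the two per-class F1s
```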
