RoBERTa and DistilBERT demonstrate strong alignment with brain activity during language processing.
Removing punctuation can improve model accuracy and model-brain alignment.
This research investigates how well advanced transformer models align with human brain activity during semantic processing, focusing on the role of punctuation.
-----
https://arxiv.org/abs/2501.06278
Original Problem 🤔:
→ Understanding how LLMs represent language internally and how punctuation influences semantic processing in the brain.
-----
Solution in this Paper 💡:
→ The researchers compared four advanced transformer models (RoBERTa, DistilBERT, ALBERT, and ELECTRA) with fMRI data recorded while subjects read text.
→ They trained a ridge regression model to predict brain activity from model features, evaluating predictions with searchlight classification (see the sketch after this list).
→ They also investigated the impact of removing different punctuation symbols on model-brain alignment.
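A minimal sketch of this encoding-model setup: contextual features from a transformer are regressed onto fMRI voxel responses with ridge regression. The model name (`roberta-base`), layer index, mean pooling, alpha grid, and placeholder data shapes are illustrative assumptions, not the paper's exact configuration.

```python
# Sketch: map transformer features to fMRI voxel activity with ridge regression.
import numpy as np
import torch
from transformers import AutoTokenizer, AutoModel
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import train_test_split

def sentence_features(sentences, model_name="roberta-base", layer=8):
    """Mean-pool the hidden states of one transformer layer for each sentence."""
    tok = AutoTokenizer.from_pretrained(model_name)
    model = AutoModel.from_pretrained(model_name, output_hidden_states=True).eval()
    feats = []
    with torch.no_grad():
        for s in sentences:
            ids = tok(s, return_tensors="pt", truncation=True)
            hidden = model(**ids).hidden_states[layer]       # (1, seq_len, dim)
            feats.append(hidden.mean(dim=1).squeeze(0).numpy())
    return np.stack(feats)                                    # (n_sentences, dim)

# Placeholder arrays keep the sketch self-contained; in practice
# X = sentence_features(stimulus_sentences) and Y holds the measured voxel data.
rng = np.random.default_rng(0)
n_stimuli, n_voxels, dim = 200, 500, 768
X = rng.standard_normal((n_stimuli, dim))        # model features per stimulus
Y = rng.standard_normal((n_stimuli, n_voxels))   # fMRI responses (placeholder)

X_tr, X_te, Y_tr, Y_te = train_test_split(X, Y, test_size=0.2, random_state=0)
encoder = RidgeCV(alphas=np.logspace(-2, 4, 7)).fit(X_tr, Y_tr)
pred = encoder.predict(X_te)                     # predicted voxel activity
```

The predicted voxel activity would then be scored against the held-out fMRI data (the paper uses searchlight classification for this evaluation step).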
-----
Key Insights from this Paper 🔑:
→ RoBERTa shows the closest alignment with neural activity, slightly outperforming BERT.
→ DistilBERT, despite being smaller, performs comparably to BERT.
→ Removing punctuation, particularly by replacing it with padding tokens, improves BERT's accuracy and its alignment with brain activity in later layers (see the sketch after this list).
→ This suggests the brain may make limited use of punctuation for semantic understanding, especially with longer text.
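A minimal sketch of the punctuation manipulation, assuming it means swapping punctuation tokens for `[PAD]` tokens before extracting features. The model name (`bert-base-uncased`), the punctuation set, and the `pad_out_punctuation` helper are hypothetical details for illustration; the paper's exact procedure may differ.

```python
# Sketch: replace punctuation tokens in a BERT input with the pad token.
import string
import torch
from transformers import AutoTokenizer, AutoModel

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased").eval()

def pad_out_punctuation(text, punctuation=frozenset(string.punctuation)):
    """Return encoded inputs with punctuation token ids replaced by [PAD]."""
    enc = tok(text, return_tensors="pt")
    ids = enc["input_ids"].clone()
    for i, token_id in enumerate(ids[0].tolist()):
        if tok.convert_ids_to_tokens(token_id) in punctuation:
            ids[0, i] = tok.pad_token_id
    return {"input_ids": ids, "attention_mask": enc["attention_mask"]}

text = "Well, the results, surprisingly, held up."
with torch.no_grad():
    original = model(**tok(text, return_tensors="pt")).last_hidden_state
    padded = model(**pad_out_punctuation(text)).last_hidden_state
# Features from `padded` would feed the same encoding model as the originals,
# allowing a layer-by-layer comparison of model-brain alignment.
```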
-----
Results 💯:
→ RoBERTa achieves slightly better accuracy than BERT at its peak performance.
→ DistilBERT retains comparable performance despite being 40% smaller and 60% faster.
→ Replacing punctuation with padding tokens yields a 1.5% accuracy improvement in some layers of BERT.