Contrastive learning with SimCSE enhances BERT's semantic understanding and generalization.
This paper enhances BERT sentence embeddings with contrastive learning (SimCSE), evaluating on sentiment analysis, semantic textual similarity, and paraphrase detection.
-----
Paper - https://arxiv.org/abs/2501.13758
Original Problem 😟:
→ Effective sentence embeddings are crucial for many NLP tasks, but producing embeddings that capture semantic nuance and generalize across tasks is difficult.
→ Many current methods rely on labeled data, which limits their scalability and applicability.
-----
Solution in this Paper 🤔:
→ The paper fine-tunes a smaller BERT model (minBERT) with SimCSE, a contrastive objective that treats two dropout-noised encodings of the same sentence as a positive pair and the other sentences in the batch as negatives (see the sketch after this list).
→ It explores three dropout schemes (standard, curriculum, and adaptive) to mitigate overfitting during fine-tuning.
→ A novel 2-Tier SimCSE model is proposed, combining unsupervised and supervised SimCSE training for the Semantic Textual Similarity task.
→ Transfer learning from STS to the Paraphrase detection and SST (sentiment) tasks is also investigated.
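The unsupervised SimCSE objective at the heart of this setup is simple: encode the same batch twice so that dropout noise produces two slightly different views of each sentence, treat the matching views as positives, and use every other sentence in the batch as a negative. Below is a minimal PyTorch sketch of that loss; the `encoder` callable returning a pooled sentence embedding is a hypothetical helper (not from the paper), and the 0.05 temperature follows the original SimCSE default rather than a value reported here.

```python
# Minimal sketch of the unsupervised SimCSE loss (InfoNCE with dropout-based positives).
# `encoder` is assumed to return one pooled sentence embedding per input, shape (batch, hidden).
import torch
import torch.nn.functional as F

def unsup_simcse_loss(encoder, input_ids, attention_mask, temperature=0.05):
    # Two forward passes over the same batch; because dropout is active
    # (encoder.train()), each pass applies a different dropout mask, giving
    # two "views" of every sentence.
    z1 = encoder(input_ids, attention_mask)  # view 1, shape (batch, hidden)
    z2 = encoder(input_ids, attention_mask)  # view 2, different dropout mask

    # Pairwise cosine similarities between all view-1 / view-2 combinations,
    # scaled by a temperature hyperparameter.
    sim = F.cosine_similarity(z1.unsqueeze(1), z2.unsqueeze(0), dim=-1) / temperature

    # For sentence i, the positive is its own second view (the diagonal entry);
    # the remaining sentences in the batch serve as in-batch negatives.
    labels = torch.arange(sim.size(0), device=sim.device)
    return F.cross_entropy(sim, labels)
```

The supervised variant keeps the same loss shape but builds positives from labeled sentence pairs instead of dropout noise.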
-----
Key Insights from this Paper 💡:
→ SimCSE effectively enhances BERT sentence embeddings, especially for Semantic Textual Similarity tasks.
→ The 2-Tier SimCSE model outperforms single-task models on STS, demonstrating the benefit of combining unsupervised and supervised contrastive learning (a two-stage training sketch follows this list).
→ Transfer learning from STS to other tasks shows limited effectiveness.
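For readers curious how the two tiers could fit together, here is a rough training-loop sketch under stated assumptions: the loader names, batch keys, and helpers (`sup_contrastive_loss`, `train_two_tier`) are illustrative rather than the paper's exact recipe, treating labeled similar sentence pairs as positives in the supervised tier is an assumption about the setup, and `unsup_simcse_loss` refers to the sketch shown earlier in this post.

```python
# Rough two-stage ("2-Tier") sketch: tier 1 runs unsupervised SimCSE on unlabeled
# sentences, tier 2 fine-tunes with a supervised contrastive loss on labeled pairs.
# Names and data layout are illustrative assumptions, not the paper's exact recipe.
import torch
import torch.nn.functional as F

def sup_contrastive_loss(encoder, batch_a, batch_b, temperature=0.05):
    # Supervised tier: row i of batch_a and row i of batch_b form a labeled
    # positive pair; all other rows act as in-batch negatives.
    za = encoder(batch_a["input_ids"], batch_a["attention_mask"])
    zb = encoder(batch_b["input_ids"], batch_b["attention_mask"])
    sim = F.cosine_similarity(za.unsqueeze(1), zb.unsqueeze(0), dim=-1) / temperature
    labels = torch.arange(sim.size(0), device=sim.device)
    return F.cross_entropy(sim, labels)

def train_two_tier(encoder, unlabeled_loader, paired_loader, optimizer, epochs=1):
    encoder.train()  # keep dropout active; it provides the noise for tier 1
    # Tier 1: unsupervised SimCSE on unlabeled sentences.
    for _ in range(epochs):
        for batch in unlabeled_loader:
            loss = unsup_simcse_loss(encoder, batch["input_ids"], batch["attention_mask"])
            optimizer.zero_grad(); loss.backward(); optimizer.step()
    # Tier 2: supervised contrastive fine-tuning on labeled positive pairs (e.g. STS).
    for _ in range(epochs):
        for batch_a, batch_b in paired_loader:
            loss = sup_contrastive_loss(encoder, batch_a, batch_b)
            optimizer.zero_grad(); loss.backward(); optimizer.step()
```

The design point is that tier 1 needs only raw sentences, so the encoder can absorb plenty of cheap contrastive signal before the smaller pool of labeled pairs is used in tier 2.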
-----
Results 👍:
→ 2-Tier SimCSE achieves an average test score of 0.742 across all three downstream tasks.
→ Single-task unsupervised SimCSE with standard dropout achieved 0.716 Pearson Correlation on STS.
→ Single-task supervised SimCSE with standard dropout achieved 0.806 Pearson Correlation on STS.