
"Linguistics Theory Meets LLM: Code-Switched Text Generation via Equivalence Constrained Large Language Models"

The podcast on this paper is generated with Google's Illuminate.

This paper proposes a framework that teaches LLMs the grammar of language mixing.

EZSWITCH, proposed in this paper, combines linguistic rules with LLMs to generate natural code-switched text.

📚 https://arxiv.org/abs/2410.22660

🎯 Original Problem:

Code-switching (mixing multiple languages within a conversation) poses challenges for LLMs. Current approaches rely either on complex syntactic rules or on pure neural generation, and fail to combine linguistic theory with LLMs effectively.

-----

🔧 Solution in this Paper:

→ EZSWITCH framework combines Equivalence Constraint Theory (ECT) with LLMs

→ Computes word-level alignments between the language pair with the GIZA++ tool

→ Identifies valid switching points based on ECT constraints (see the sketch after this list)

→ Feeds these constraints as prompts to LLMs along with bilingual context and examples

→ Tested with three models: Aya23, Llama3 8B, and Llama3.1 8B
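
Under ECT, a switch is allowed only where the word orders of the two languages are equivalent around the boundary. Below is a minimal sketch of how such switch points could be derived from GIZA++-style word alignments; the function name and the simplified "no crossing alignment links" check are illustrative assumptions, not the paper's exact procedure.

```python
# Minimal sketch: valid code-switch points under a simplified Equivalence
# Constraint, given word-level alignments (e.g. from GIZA++).
# Assumption: a boundary is valid if no alignment link crosses it.

def valid_switch_points(alignment, src_len):
    """alignment: list of (src_idx, tgt_idx) pairs, 0-based.
    Returns source boundary positions i where switching between
    tokens i and i+1 preserves word-order equivalence."""
    # Collect the target positions each source token aligns to.
    tgt_of = {i: [] for i in range(src_len)}
    for s, t in alignment:
        tgt_of[s].append(t)

    points = []
    for i in range(src_len - 1):
        left = [t for s in range(i + 1) for t in tgt_of[s]]
        right = [t for s in range(i + 1, src_len) for t in tgt_of[s]]
        # Everything left of the boundary must map before everything
        # to its right, i.e. no alignment link crosses the switch point.
        if left and right and max(left) < min(right):
            points.append(i)
    return points

# Example: English "she eats rice" vs. Hindi "वह चावल खाती है"
# links: she-वह(0), eats-खाती(2), rice-चावल(1)
print(valid_switch_points([(0, 0), (1, 2), (2, 1)], 3))  # -> [0]
```

In this toy example, only the boundary after "she" is a valid switch point, because "eats" and "rice" are reordered in Hindi; such boundary lists are what the framework feeds to the LLM as prompt constraints.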

-----

💡 Key Insights:

→ Small open-source models (8B parameters) can produce high-quality code-switched text when guided by linguistic constraints

→ Traditional metrics (BLEU, BERTScore) show weak correlation (0.2) with human evaluation

→ GPT-4o-mini achieves better correlation (0.5) when used as a structured evaluator (a correlation check is sketched after this list)

→ Translations from Indic languages into English showed higher fluency than translations from English into Indic languages
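
One way to quantify agreement between automatic metrics and human judgments, as in the reported 0.2 and 0.5 figures, is a rank correlation over paired per-sentence scores. The sketch below uses hypothetical scores and Spearman correlation via SciPy; the paper's exact correlation measure and data are not reproduced here.

```python
# Sketch of metric-vs-human agreement with hypothetical paired scores.
from scipy.stats import spearmanr

human_scores = [4, 2, 5, 3, 1, 4, 5, 2]    # human fluency ratings (hypothetical)
bleu_scores  = [0.31, 0.40, 0.35, 0.28, 0.22, 0.30, 0.33, 0.41]  # hypothetical
judge_scores = [4, 2, 4, 3, 2, 4, 5, 2]    # GPT-4o-mini structured ratings (hypothetical)

rho_bleu, _ = spearmanr(human_scores, bleu_scores)
rho_judge, _ = spearmanr(human_scores, judge_scores)
print(f"BLEU vs human:  {rho_bleu:.2f}")
print(f"Judge vs human: {rho_judge:.2f}")
```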

-----

📊 Results:

→ Llama3.1 8B consistently achieved highest accuracy and fluency scores

→ EZSWITCH outperformed the Human ECT baseline on English-input translations

→ Both underperformed the baseline on Indic-input translations

→ The authors released the CSPREF dataset, with human preference annotations, for future research
