
"Human-like Affective Cognition in Foundation Models"

Generated this podcast with Google's Illuminate.

Foundation models demonstrate human-like affective cognition across diverse emotional reasoning tasks.

LLMs show a sophisticated grasp of emotional dynamics in social situations.

📚 https://arxiv.org/pdf/2409.11733

Original Problem 👀:

Evaluating affective cognition (the understanding of emotions) in foundation models against human judgments is challenging. Existing evaluations lack systematic benchmarking across different types of affective inferences.

-----

Solution in this Paper 🔬:

• Introduces evaluation framework based on psychological theory of emotions

• Generates 1,280 diverse scenarios exploring relationships between appraisals, emotions, expressions, and outcomes

• Uses a causal template to systematically vary stimuli and test different inferences (a minimal sketch of the idea follows this list)

• Compares model performance (GPT-4, Claude-3, Gemini-1.5-Pro) to human judgments across carefully selected conditions
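
A minimal sketch of what such a causal template could look like, assuming hypothetical appraisal dimensions, values, and scenario wording (the paper's actual template and dimensions differ, and it crosses more values to reach 1,280 scenarios):

```python
import itertools
import random

# Hypothetical appraisal dimensions and values, for illustration only.
APPRAISALS = {
    "goal": ["win the competition", "avoid embarrassment"],
    "outcome": ["succeeds", "fails"],
    "safety": ["safe", "threatened"],
}

# Fill-in-the-blank scenario text driven by the latent appraisal variables.
TEMPLATE = "{name} wants to {goal}. In the end, {name} {outcome} and feels {safety}."

def generate_scenarios(names=("Alex", "Jordan")):
    """Cross all appraisal values to produce systematically varied stimuli."""
    scenarios = []
    for name in names:
        for goal, outcome, safety in itertools.product(*APPRAISALS.values()):
            scenarios.append({
                "text": TEMPLATE.format(name=name, goal=goal,
                                        outcome=outcome, safety=safety),
                # Ground-truth latents let each inference type be tested:
                # e.g. infer the goal from the emotion, or the emotion from
                # the outcome.
                "latent": {"goal": goal, "outcome": outcome, "safety": safety},
            })
    return scenarios

print(random.choice(generate_scenarios())["text"])
```

Storing the latent variables alongside each generated text is what makes the different inference directions (appraisal → emotion, emotion → appraisal, etc.) testable against a known ground truth.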

-----

Key Insights from this Paper 💡:

• Foundation models match or exceed human-level performance on many affective reasoning tasks

• Models benefit from chain-of-thought prompting, which improves their affective judgments (see the prompt sketch after this list)

• Some appraisal dimensions (e.g., goals) are more salient than others for both humans and models

• Models can integrate information from outcomes, appraisals, emotions, and facial expressions
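
To make the chain-of-thought point concrete, here is a hedged sketch contrasting a direct query with a CoT query for emotion inference. The wording is hypothetical, not the paper's actual prompts:

```python
# Hypothetical scenario text; the paper's stimuli come from its causal template.
SCENARIO = "Alex wanted to win the competition, but lost in the final round."

# Direct prompt: ask for the emotion label immediately.
direct_prompt = (
    f"{SCENARIO}\n"
    "What emotion does Alex most likely feel? Answer with one word."
)

# Chain-of-thought prompt: elicit reasoning about appraisals before the label.
cot_prompt = (
    f"{SCENARIO}\n"
    "First, reason step by step about Alex's goals, the outcome, and how the "
    "outcome relates to those goals. Then answer: what emotion does Alex most "
    "likely feel? End with one word."
)

print(direct_prompt)
print()
print(cot_prompt)
```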

-----

Results 📊:

• Model-participant agreement matches or exceeds interparticipant agreement on many tasks (an agreement-metric sketch follows this list)

• "Superhuman" performance on some tasks, e.g. Claude-3 with CoT: 78.82% vs human 69.38% agreement on emotion inference

• Chain-of-thought improves performance, e.g. GPT-4 goal inference from 71.14% to 88.61%

• Models struggle more with safety appraisal inference (61.07% agreement) vs goal inference (88.61%)
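
A small sketch of how agreement figures like these could be computed, assuming simple categorical match rates (the paper's exact metric may differ); model-participant agreement is the model's mean match rate with each human, and interparticipant agreement is the mean match rate across human pairs:

```python
from itertools import combinations

def pairwise_agreement(answers_a, answers_b):
    """Fraction of items on which two raters give the same categorical answer."""
    return sum(a == b for a, b in zip(answers_a, answers_b)) / len(answers_a)

def model_participant_agreement(model_answers, participants):
    """Mean agreement between the model and each human participant."""
    return sum(pairwise_agreement(model_answers, p) for p in participants) / len(participants)

def interparticipant_agreement(participants):
    """Mean agreement across all pairs of human participants."""
    pairs = list(combinations(participants, 2))
    return sum(pairwise_agreement(a, b) for a, b in pairs) / len(pairs)

# Toy data: three participants and one model labeling four scenarios.
humans = [["joy", "fear", "anger", "joy"],
          ["joy", "fear", "sadness", "joy"],
          ["joy", "surprise", "anger", "joy"]]
model = ["joy", "fear", "anger", "joy"]

print(model_participant_agreement(model, humans))  # ~0.83
print(interparticipant_agreement(humans))          # ~0.67
```

In this toy example the model agrees with the humans more than the humans agree with each other, which is the "superhuman" pattern the paper reports for some tasks.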
