
"OpenCharacter: Training Customizable Role-Playing LLMs with Large-Scale Synthetic Personas"

The podcast below is generated with Google's Illuminate.

Synthetic personas and targeted training enable LLMs to master diverse character-based dialogues.

The paper proposes a new approach that enables LLMs to perform customizable role-playing. It uses large-scale synthetic data derived from personas to train LLMs for character generalization in dialogue.

-----

Paper - https://arxiv.org/abs/2501.15427

Original Problem 😞:

→ Current LLM-based role-playing dialogue agents often struggle with out-of-domain characters.

→ Creating agents that can embody arbitrary, user-defined characters is challenging.

→ Existing methods rely on limited human-annotated data, hindering generalization to diverse characters.

-----

Solution in this Paper 😎:

→ This paper introduces OpenCharacter, a method using large-scale synthetic personas to train LLMs for character generalization.

→ The approach synthesizes character profiles from Persona Hub's personas, adding detailed character information.

→ Two data synthesis strategies are explored: response rewriting and response generation (see the sketch after this list).

→ Response rewriting adapts existing dialogue responses to align with a character.

→ Response generation creates new character-aligned responses directly.

→ Supervised Fine-tuning (SFT) is performed on LLaMA-3 8B using the synthetic dialogue data.
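A minimal sketch of the data synthesis step, assuming a generic `call_llm` helper that stands in for whichever strong LLM does the synthesis; the function names and prompt wording are illustrative, not the paper's exact prompts:

```python
# Illustrative sketch of the two synthesis strategies (not the paper's code).
# `call_llm` is a hypothetical callable: prompt string in, completion string out.

def synthesize_profile(persona: str, call_llm) -> str:
    """Expand a short Persona Hub persona into a detailed character profile."""
    prompt = (
        "Expand the following persona into a detailed character profile "
        "(name, background, personality, speaking style):\n"
        f"Persona: {persona}"
    )
    return call_llm(prompt)

def rewrite_response(profile: str, question: str, original_answer: str, call_llm) -> str:
    """Response rewriting: adapt an existing answer to the character's voice."""
    prompt = (
        f"Character profile:\n{profile}\n\n"
        f"Question: {question}\nOriginal answer: {original_answer}\n\n"
        "Rewrite the answer so it stays helpful but sounds like this character."
    )
    return call_llm(prompt)

def generate_response(profile: str, question: str, call_llm) -> str:
    """Response generation: write a character-aligned answer from scratch."""
    prompt = (
        f"Character profile:\n{profile}\n\n"
        f"Question: {question}\n\nAnswer in this character's voice."
    )
    return call_llm(prompt)

def to_sft_example(profile: str, question: str, answer: str) -> dict:
    """Package one synthetic turn as a chat-format SFT example."""
    return {
        "messages": [
            {"role": "system", "content": f"You are the following character:\n{profile}"},
            {"role": "user", "content": question},
            {"role": "assistant", "content": answer},
        ]
    }
```

For the SFT step, a minimal training sketch assuming a recent version of Hugging Face TRL and chat-format examples like those produced above; hyperparameters and the exact checkpoint are placeholders, not the paper's settings:

```python
# Minimal SFT sketch with Hugging Face TRL (an assumed library choice, not the paper's script).
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Hypothetical JSONL file of {"messages": [...]} records built by to_sft_example above.
dataset = load_dataset("json", data_files="opencharacter_synthetic.jsonl", split="train")

config = SFTConfig(
    output_dir="llama3-8b-opencharacter-sft",
    num_train_epochs=2,                 # placeholder schedule
    per_device_train_batch_size=2,
    gradient_accumulation_steps=16,
    learning_rate=2e-5,                 # placeholder learning rate
    bf16=True,
)

trainer = SFTTrainer(
    model="meta-llama/Meta-Llama-3-8B-Instruct",  # LLaMA-3 8B; exact checkpoint is an assumption
    train_dataset=dataset,              # TRL applies the chat template to the "messages" column
    args=config,
)
trainer.train()
```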

-----

Key Insights from this Paper 🤔:

→ Large-scale synthetic data with diverse characters enhances LLMs' character generalization abilities.

→ Character-driven response generation is more effective than response rewriting for this task.

→ LLMs trained with synthetic data can achieve performance comparable to or even better than GPT-4o in role-playing dialogue.

-----

Results 🚀:

→ OpenCharacter achieves a PersonaGym-Light score (PScore-L) of 4.66.

→ OpenCharacter achieves a PersonaGym score (PScore) of 4.52.

→ OpenCharacter, an 8B parameter model, performs comparably to GPT-4o models in role-playing evaluations.
