Synthetic personas and targeted training enable LLMs to master diverse character-based dialogues.
The paper proposes a new approach that enables LLMs to perform customizable role-playing. It uses large-scale synthetic data derived from personas to train LLMs for character generalization in dialogue.
-----
Paper - https://arxiv.org/abs/2501.15427
Original Problem 😞:
→ Current LLM-based role-playing dialogue agents often struggle with out-of-domain characters.
→ Creating agents that can embody arbitrary, user-defined characters is challenging.
→ Existing methods rely on limited human-annotated data, hindering generalization to diverse characters.
-----
Solution in this Paper 😎:
→ This paper introduces OpenCharacter, a method using large-scale synthetic personas to train LLMs for character generalization.
→ The approach synthesizes character profiles from Persona Hub's personas, adding detailed character information.
→ Two data synthesis strategies are explored: response rewriting and response generation (a sketch of both follows this list).
→ Response rewriting adapts existing dialogue responses to align with a character.
→ Response generation creates new character-aligned responses directly.
→ Supervised fine-tuning (SFT) is performed on LLaMA-3 8B using the synthetic dialogue data; a minimal fine-tuning sketch is also shown below.
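
Below is a minimal sketch of the two synthesis strategies. The prompt wording, the choice of GPT-4o as the synthesis model, and the function names are assumptions for illustration, not the paper's exact pipeline.

```python
# Hypothetical sketch of persona-to-dialogue synthesis (assumptions, not the paper's exact prompts).
from openai import OpenAI

client = OpenAI()

def synthesize_profile(persona: str) -> str:
    """Expand a short Persona Hub persona into a detailed character profile."""
    prompt = (
        "Expand the following persona into a detailed character profile "
        "(name, background, personality, speaking style):\n" + persona
    )
    resp = client.chat.completions.create(
        model="gpt-4o",  # assumed synthesis model
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

def rewrite_response(profile: str, question: str, original_answer: str) -> str:
    """Strategy 1 (response rewriting): adapt an existing answer to the character."""
    prompt = (
        f"Character profile:\n{profile}\n\n"
        f"Question: {question}\n"
        f"Original answer: {original_answer}\n\n"
        "Rewrite the answer so it stays faithful to the character's background and speaking style."
    )
    resp = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

def generate_response(profile: str, question: str) -> str:
    """Strategy 2 (response generation): write a character-aligned answer from scratch."""
    prompt = (
        f"Character profile:\n{profile}\n\n"
        f"Question: {question}\n\n"
        "Answer in the character's voice, consistent with the profile."
    )
    resp = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content
```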
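
And a minimal fine-tuning sketch, assuming the synthetic dialogues are stored as chat-format JSONL (a "messages" field with system/user/assistant turns). The file name, hyperparameters, exact model checkpoint, and the use of TRL are assumptions, not the paper's reported training setup.

```python
# Hypothetical SFT setup on LLaMA-3 8B with TRL (hyperparameters are assumed).
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Synthetic role-playing dialogues in chat format (assumed file name).
dataset = load_dataset("json", data_files="synthetic_roleplay_dialogues.jsonl", split="train")

trainer = SFTTrainer(
    model="meta-llama/Meta-Llama-3-8B-Instruct",  # assumed checkpoint for the LLaMA-3 8B base
    train_dataset=dataset,
    args=SFTConfig(
        output_dir="opencharacter-sft",
        max_seq_length=4096,              # assumed context length
        num_train_epochs=2,               # assumed
        per_device_train_batch_size=2,    # assumed
        learning_rate=2e-5,               # assumed
    ),
)
trainer.train()
```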
-----
Key Insights from this Paper 🤔:
→ Large-scale synthetic data with diverse characters enhances LLMs' character generalization abilities.
→ Character-driven response generation is more effective than response rewriting for this task.
→ LLMs trained with synthetic data can achieve performance comparable to or even better than GPT-4o in role-playing dialogue.
-----
Results 🚀:
→ OpenCharacter achieves a PersonaGym-Light score (PScore-L) of 4.66.
→ OpenCharacter achieves a PersonaGym score (PScore) of 4.52.
→ OpenCharacter, an 8B parameter model, performs comparably to GPT-4o models in role-playing evaluations.