Semantic-aware embeddings that understand the functional flow of conversations.
Dialog2Flow (D2F) maps dialog utterances into an action-based latent space for automated workflow extraction.
📚 https://arxiv.org/abs/2410.18481
Original Problem 🎯:
Automatically extracting structured workflows from raw dialog data remains a major challenge. This capability is crucial for dialog system design, discourse analysis, and training both AI and human agents. Current methods either require manual annotation or use ad-hoc approaches.
-----
Solution in this Paper 🔧:
• Introduces Dialog2Flow (D2F) embeddings that map utterances to a latent space where they cluster by their communicative functions
• Builds a unified dataset from 20 task-oriented dialog datasets with standardized action annotations
• Implements a novel soft contrastive loss that leverages semantic information of actions to guide representation learning
• Creates the first sentence embedding model specifically pre-trained for dialog flow extraction
• Maps dialogs as continuous trajectories in latent space with distinct action-related regions
• Clusters D2F embeddings to convert dialogs into sequences of action IDs for workflow extraction
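The last step above (cluster embeddings, then read each dialog as a sequence of action IDs) can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the 2-D vectors stand in for real D2F embeddings, and the greedy threshold clustering in `assign_action_ids` and the helper `build_flow` are hypothetical names and choices made for this example.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two vectors given as tuples."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def assign_action_ids(embeddings, threshold=0.9):
    """Greedy clustering (an assumption for this sketch): an utterance joins
    the first cluster whose representative is at least `threshold`
    cosine-similar, otherwise it starts a new cluster."""
    representatives = []
    ids = []
    for emb in embeddings:
        for cluster_id, rep in enumerate(representatives):
            if cosine(emb, rep) >= threshold:
                ids.append(cluster_id)
                break
        else:
            representatives.append(emb)
            ids.append(len(representatives) - 1)
    return ids


def build_flow(dialogs, threshold=0.9):
    """Turn dialogs (lists of utterance embeddings) into action-ID sequences
    and count transitions between consecutive action IDs."""
    flat = [emb for dialog in dialogs for emb in dialog]
    ids = assign_action_ids(flat, threshold)
    sequences, pos = [], 0
    for dialog in dialogs:
        sequences.append(ids[pos:pos + len(dialog)])
        pos += len(dialog)
    edges = Counter()
    for seq in sequences:
        edges.update(zip(seq, seq[1:]))
    return sequences, edges


# Toy data: two dialogs whose utterances fall into the same three regions.
dialogs = [
    [(1.0, 0.0), (0.0, 1.0), (0.7, 0.7)],
    [(0.99, 0.05), (0.05, 0.99), (0.72, 0.69)],
]
sequences, edges = build_flow(dialogs)
print(sequences)
print(dict(edges))
```

The transition counts in `edges` are the raw material for a workflow graph: nodes are action IDs and edge weights are how often one action follows another.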
-----
Key Insights 💡:
• Soft contrastive loss outperforms standard supervised contrastive loss by better capturing semantic relationships
• The approach works consistently across domains even with limited training data
• Embeddings successfully cluster utterances by communicative functions rather than just semantic similarity
• The resulting unified dataset is the largest with standardized per-turn action annotations
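To make the soft contrastive idea more concrete, here is a minimal sketch under stated assumptions. It is not the paper's exact formulation. It assumes the soft targets come from a softmax over similarities between action labels, compared by cross-entropy against a softmax over embedding similarities. The function name `soft_contrastive_loss`, the temperatures `tau` and `tau_label`, and the toy inputs are all illustrative.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def soft_contrastive_loss(anchors, candidates, label_sim, tau=0.1, tau_label=0.1):
    """anchors[i] and candidates[i] are unit-norm embeddings of two utterances
    that share action i. label_sim[i][j] is an assumed semantic similarity
    between action labels i and j. A hard contrastive loss would use a
    one-hot target here; the soft version spreads target mass over
    semantically related actions."""
    losses = []
    for i, z in enumerate(anchors):
        pred = softmax([dot(z, c) / tau for c in candidates])
        target = softmax([s / tau_label for s in label_sim[i]])
        losses.append(-sum(t * math.log(p) for t, p in zip(target, pred)))
    return sum(losses) / len(losses)


# Toy batch with two actions whose labels are only weakly related.
anchors = [(1.0, 0.0), (0.0, 1.0)]
label_sim = [[1.0, 0.2], [0.2, 1.0]]

aligned = soft_contrastive_loss(anchors, [(1.0, 0.0), (0.0, 1.0)], label_sim)
swapped = soft_contrastive_loss(anchors, [(0.0, 1.0), (1.0, 0.0)], label_sim)
print(aligned < swapped)
```

In this toy setup the loss is lower when each anchor sits near the candidate with the same action than when the candidates are swapped, which is the behavior a contrastive objective is meant to reward.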
-----
Results 📊:
• D2F achieves 6.86% average difference from reference graphs across domains vs 27.90% for best baseline
• Shows superior qualitative and quantitative results compared to various sentence embeddings
• Maintains consistent performance even in domains that account for only 0.11% of the training data
• Extracts graphs closest in complexity to reference graphs across all tested domains