MobA: A Two-Level Agent System for Efficient Mobile Task Automation
Mobile phones get smarter with MobA's two-agent system that breaks down complex tasks into simple steps
Original Problem 🔍:
Mobile assistants struggle with complex tasks because of API limitations and an inability to handle diverse interfaces. Existing solutions lack the comprehension and planning capabilities that real-world mobile environments require.
Solution in this Paper 🏗️:
MobA: A two-level agent system for mobile task automation
• Global Agent: Interprets commands, manages history, plans tasks
• Local Agent: Executes precise actions based on sub-tasks
• Key components:
  • Plan Module: Decomposes tasks into sub-tasks
  • Action Module: Generates and executes actions
  • Reflection Module: Verifies task completion
  • Memory Module: Provides contextual information
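The division of labor described above can be illustrated with a minimal Python sketch. It is not the paper's implementation: the class names, the `planner` and `execute` callables, and the string-based history format are all hypothetical stand-ins, and a real system would presumably back the Plan and Action modules with a language model and device control rather than plain functions.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Memory:
    """Memory Module (illustrative): keeps a log of sub-task outcomes."""
    history: List[str] = field(default_factory=list)

    def remember(self, entry: str) -> None:
        self.history.append(entry)

    def context(self, last_n: int = 3) -> List[str]:
        # Return the most recent entries as context for the next action.
        return self.history[-last_n:]


class LocalAgent:
    """Executes one sub-task at a time (Action Module)."""

    def __init__(self, execute: Callable[[str], bool]) -> None:
        # `execute` is a hypothetical stand-in for real device control.
        self.execute = execute

    def act(self, sub_task: str, context: List[str]) -> bool:
        # A real agent would use `context` to choose an action; this
        # sketch only passes the sub-task to the executor.
        return self.execute(sub_task)


class GlobalAgent:
    """Interprets the command, plans sub-tasks, and tracks history."""

    def __init__(self, planner: Callable[[str], List[str]],
                 local: LocalAgent, memory: Memory) -> None:
        self.planner = planner  # Plan Module (assumed to be a callable)
        self.local = local
        self.memory = memory

    def run(self, command: str) -> bool:
        for sub_task in self.planner(command):
            ok = self.local.act(sub_task, self.memory.context())
            # Reflection Module (simplified): record whether the
            # sub-task completed, and stop at the first failure.
            status = "done" if ok else "failed"
            self.memory.remember(f"{sub_task}: {status}")
            if not ok:
                return False
        return True


def split_on_then(command: str) -> List[str]:
    """Toy planner: splits a command on the word 'then'."""
    return [part.strip() for part in command.split(" then ") if part.strip()]


memory = Memory()
agent = GlobalAgent(split_on_then, LocalAgent(lambda sub_task: True), memory)
result = agent.run("open settings then enable wifi")
```

Here the Global Agent owns planning and history, while the Local Agent only sees one sub-task plus recent context, which mirrors the separation the summary describes. Swapping the toy planner or executor for model-backed versions would not change the control flow.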
Key Insights from this Paper 💡:
• Two-level agent structure enhances task comprehension and planning
• Task decomposition improves execution efficiency
• Memory mechanisms enable better adaptation to diverse interfaces
• Double-reflection process handles previously unseen tasks effectively
Results 📊:
• MobA achieved a 66.2% milestone score rate on the MobBench test set
• Outperformed the second-highest baseline by over 17%
• Demonstrated superior performance in handling complex tasks
• Improved execution efficiency with fewer ineffective actions