MobA: A Two-Level Agent System for Efficient Mobile Task Automation

Mobile phones get smarter with MobA's two-agent system that breaks down complex tasks into simple steps

Nov 11, 2024

Original Problem 🔍:

Mobile assistants struggle with complex tasks due to API limitations and inability to handle diverse interfaces. Existing solutions lack comprehension and planning capabilities for real-world mobile environments.

Solution in this Paper 🏗️:

MobA: A two-level agent system for mobile task automation

• Global Agent: Interprets commands, manages history, plans tasks

• Local Agent: Executes precise actions based on sub-tasks

• Key components:

  • Plan Module: Decomposes tasks into sub-tasks
  • Action Module: Generates and executes actions
  • Reflection Module: Verifies task completion
  • Memory Module: Provides contextual information
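
To make the division of labor concrete, here is a minimal Python sketch of the two-level loop described above. It is not the paper's implementation: every class and function name (`GlobalAgent`, `LocalAgent`, `Memory`, `run_task`, `demo`) is hypothetical, the planner is a hard-coded stand-in for what would be a model call, and the "actions" are plain callables rather than real taps or swipes on a device.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Memory:
    """Memory Module stand-in: a log of sub-task outcomes used as context."""
    history: List[str] = field(default_factory=list)

    def remember(self, note: str) -> None:
        self.history.append(note)


class GlobalAgent:
    """Interprets the command, plans sub-tasks, and manages history."""

    def __init__(self, planner: Callable[[str], List[str]], memory: Memory):
        self.planner = planner  # assumption: an LLM call in a real system
        self.memory = memory

    def plan(self, task: str) -> List[str]:
        # Plan Module stand-in: decompose the task into ordered sub-tasks.
        return self.planner(task)

    def reflect(self, sub_task: str, succeeded: bool) -> bool:
        # Reflection Module stand-in: record the outcome and report it.
        status = "done" if succeeded else "failed"
        self.memory.remember(f"{sub_task}: {status}")
        return succeeded


class LocalAgent:
    """Executes one concrete action per sub-task."""

    def __init__(self, actions: Dict[str, Callable[[], bool]]):
        self.actions = actions  # hypothetical mapping: sub-task -> action

    def execute(self, sub_task: str) -> bool:
        # Action Module stand-in: unknown sub-tasks count as failures.
        action = self.actions.get(sub_task)
        return action() if action is not None else False


def run_task(task: str, global_agent: GlobalAgent, local_agent: LocalAgent) -> bool:
    """Run sub-tasks in order and stop at the first one that fails."""
    for sub_task in global_agent.plan(task):
        succeeded = local_agent.execute(sub_task)
        if not global_agent.reflect(sub_task, succeeded):
            return False
    return True


def demo() -> Tuple[bool, List[str]]:
    memory = Memory()
    planner = lambda task: ["open_clock", "set_alarm"]  # fixed toy plan
    actions = {"open_clock": lambda: True, "set_alarm": lambda: True}
    ok = run_task("Set an alarm for 7 am", GlobalAgent(planner, memory), LocalAgent(actions))
    return ok, memory.history
```

Calling `demo()` runs the toy plan end to end and returns the success flag together with the memory log. In a real system the Global Agent would presumably re-plan on failure instead of stopping, which this sketch leaves out for brevity.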

Key Insights from this Paper 💡:

• Two-level agent structure enhances task comprehension and planning

• Task decomposition improves execution efficiency

• Memory mechanisms enable better adaptation to diverse interfaces

• Double-reflection process handles previously unseen tasks effectively

Results 📊:

• MobA achieved 66.2% milestone score rate on MobBench test set

• Outperformed second-highest baseline by over 17%

• Demonstrated superior performance in handling complex tasks

• Improved execution efficiency with fewer ineffective actions
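
For intuition about the headline metric, the sketch below shows one plausible way to compute a milestone score rate. It assumes the rate is the number of milestones reached divided by the total number of milestones, pooled across tasks. That definition is an assumption made here for illustration and the paper's exact scoring may differ. The function name and the sample numbers are hypothetical, not results from MobBench.

```python
from typing import List, Tuple


def milestone_score_rate(results: List[Tuple[int, int]]) -> float:
    """Each entry is (milestones_reached, milestones_total) for one task.

    Assumed definition: pooled reached milestones over pooled total milestones.
    """
    reached = sum(r for r, _ in results)
    total = sum(t for _, t in results)
    if total == 0:
        raise ValueError("no milestones to score")
    return reached / total


# Made-up example: three tasks with 4, 3, and 3 milestones respectively.
example_rate = milestone_score_rate([(2, 4), (3, 3), (1, 3)])
```

Under this assumed definition, partial progress on a hard task still earns credit, which is why such a metric can separate agents more finely than a plain pass/fail task success rate.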
