"AgentStore: Scalable Integration of Heterogeneous Agents As Specialized Generalist Computer Assistant"

The podcast on this paper is generated with Google's Illuminate.

AgentStore, proposed in this paper, integrates specialized agents into a unified platform for handling complex computer tasks.

📚 https://arxiv.org/abs/2410.18603

🎯 Original Problem:

Current computer agents struggle with both generalization and specialization. Single generalist agents lack specialized abilities for specific tasks, while specialized agents can't handle broader system-wide operations, making them ineffective for real-world computer tasks.

-----

🔧 Solution in this Paper:

• AgentStore: A scalable platform to dynamically integrate heterogeneous agents with diverse capabilities

• Three core components:

- AgentPool: Collection of 20+ feature-specific agents

- AgentEnroll: Protocol for integrating new agents

- MetaAgent: Core component using AgentToken strategy
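To make the division of labor between the three components concrete, here is a minimal sketch. It is not the paper's implementation: the class layout, the `AgentDoc` record, the agent names, and the keyword-overlap routing are all assumptions made for illustration. In particular, the keyword match is only a placeholder for the AgentToken strategy.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class AgentDoc:
    """Hypothetical enrollment record describing one agent."""
    name: str
    capabilities: List[str]
    run: Callable[[str], str]  # stand-in for the agent's real entry point


@dataclass
class AgentPool:
    """Holds every enrolled agent, keyed by name."""
    agents: Dict[str, AgentDoc] = field(default_factory=dict)

    def enroll(self, doc: AgentDoc) -> None:
        # AgentEnroll step (assumed shape): reject duplicates, then register.
        if doc.name in self.agents:
            raise ValueError(f"agent {doc.name!r} is already enrolled")
        self.agents[doc.name] = doc


class MetaAgent:
    """Picks an agent for a task. Keyword overlap is a placeholder router."""

    def __init__(self, pool: AgentPool) -> None:
        self.pool = pool

    def route(self, task: str) -> AgentDoc:
        # Choose the agent whose capability keywords overlap the task most.
        words = set(task.lower().split())
        return max(self.pool.agents.values(),
                   key=lambda doc: len(words & set(doc.capabilities)))

    def execute(self, task: str) -> str:
        return self.route(task).run(task)


# Hypothetical agents, invented for this example.
pool = AgentPool()
pool.enroll(AgentDoc("SheetAgent", ["spreadsheet", "cells", "formula"],
                     lambda task: f"SheetAgent handled: {task}"))
pool.enroll(AgentDoc("SlideAgent", ["slides", "presentation"],
                     lambda task: f"SlideAgent handled: {task}"))

meta = MetaAgent(pool)
print(meta.execute("sum the cells in this spreadsheet"))
```

The point of the split is that enrolling a new agent only touches the pool, so the routing component never needs to be rewritten when capabilities are added.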

• Novel AgentToken approach:

- Encodes agents as special tokens in MetaAgent's vocabulary

- Enables efficient management without lengthy contexts

- Supports both single-agent routing and multi-agent coordination

- Uses automated self-instruct for training data generation
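The token idea above can be sketched in a few lines. This is a toy illustration under stated assumptions, not the authors' code: the `AgentTokenRouter` class, the three-dimensional vectors, the agent token names, and the top-k rule for multi-agent selection are all hypothetical. In a real system the hidden state would come from the MetaAgent's language model and the token vectors would be learned from the self-instruct data.

```python
import math
from typing import Dict, List


class AgentTokenRouter:
    """Toy router: one vector (an "agent token") per enrolled agent."""

    def __init__(self) -> None:
        self.token_embeddings: Dict[str, List[float]] = {}

    def add_agent(self, name: str, embedding: List[float]) -> None:
        # Enrolling an agent adds one vector; existing vectors are untouched.
        self.token_embeddings[name] = embedding

    def probabilities(self, hidden_state: List[float]) -> Dict[str, float]:
        # Score each agent token by dot product with the task's hidden state,
        # then normalize the scores with a softmax.
        logits = {
            name: sum(h * e for h, e in zip(hidden_state, emb))
            for name, emb in self.token_embeddings.items()
        }
        peak = max(logits.values())  # subtract the max for numerical stability
        exps = {name: math.exp(value - peak) for name, value in logits.items()}
        total = sum(exps.values())
        return {name: value / total for name, value in exps.items()}

    def route_single(self, hidden_state: List[float]) -> str:
        # Single-agent routing: the most probable agent token wins.
        probs = self.probabilities(hidden_state)
        return max(probs, key=probs.get)

    def route_multi(self, hidden_state: List[float], k: int = 2) -> List[str]:
        # Multi-agent coordination (assumed rule): keep the k most probable tokens.
        probs = self.probabilities(hidden_state)
        return sorted(probs, key=probs.get, reverse=True)[:k]


# Hypothetical agent tokens and a made-up hidden state for one task.
router = AgentTokenRouter()
router.add_agent("<sheet>", [1.0, 0.0, 0.0])
router.add_agent("<slides>", [0.0, 1.0, 0.0])
router.add_agent("<terminal>", [0.0, 0.0, 1.0])

task_state = [0.9, 0.1, 0.6]
print(router.route_single(task_state))
print(router.route_multi(task_state, k=2))
```

Because each agent is a single vector rather than a long textual description, the prompt does not grow as more agents are enrolled, which is the scalability argument the post makes for AgentToken.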

-----

💡 Key Insights:

• App store-inspired architecture enables continuous capability expansion

• AgentToken strategy solves the agent management scalability problem

• Multi-token prediction enables effective agent collaboration

• Minimal training overhead through self-instruct process

• Balance between generalization and specialization is achievable

-----

📊 Results:

• More than doubled OSWorld benchmark performance from 11.21% to 23.85%

• Achieved 57.8% success rate on APPAgent mobile benchmark

• AgentToken showed 80.60% routing accuracy

• Required only 0.2 hours training time vs 2.5 hours for other methods

• Used just 86K parameters vs 38M in traditional approaches
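The ratios behind these figures are easy to check. The short script below uses only the numbers quoted above; the variable names are chosen for this example.

```python
# Figures quoted in the results list above.
osworld_before, osworld_after = 11.21, 23.85   # success rate, percent
train_hours_agenttoken, train_hours_other = 0.2, 2.5
params_agenttoken, params_other = 86_000, 38_000_000

osworld_gain = osworld_after / osworld_before
training_speedup = train_hours_other / train_hours_agenttoken
param_reduction = params_other / params_agenttoken

print(f"OSWorld improvement: {osworld_gain:.2f}x")
print(f"Training time reduction: {training_speedup:.1f}x")
print(f"Parameter reduction: {param_reduction:.0f}x")
```

This gives roughly a 2.13x improvement on OSWorld, a 12.5x reduction in training time, and about 442x fewer parameters.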
