Teaching Minecraft agents to remember and navigate like humans do.
A memory-augmented controller that actually remembers where it saw resources
Mr.SteVE enhances Minecraft agents with episodic memory to remember and revisit important locations
https://arxiv.org/abs/2411.06736
Original Problem 🎯:
STEVE-1, a widely used Minecraft controller, can only remember a few seconds of gameplay due to Transformer-XL's memory constraints. This forces agents to repeatedly explore areas they've already visited, making task completion inefficient in sparse environments.
-----
Solution in this Paper 🛠️:
→ Mr.SteVE introduces Place Event Memory (PEM), storing what-where-when information about visited locations and events.
→ The system uses a Memory Module to store novel events and a Solver Module that switches between exploration and execution modes.
→ A hierarchical exploration strategy with Count-Based high-level goal selection and VPT-Nav low-level navigation optimizes area coverage.
→ For low-level navigation, the agent uses VPT-Nav, a goal-conditioned navigator fine-tuned from VPT with PPO, to reach target locations precisely in complex terrain.
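
To make the memory and mode-switching ideas above concrete, here is a minimal sketch, not the paper's implementation. It assumes events are plain string labels, places are square grid cells over (x, z) positions, and exploration picks the least-visited neighboring cell. The names `Event`, `PlaceEventMemory`, `choose_mode`, and the `cell_size` parameter are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Event:
    what: str                    # assumed: a string label such as "cow"
    where: tuple[float, float]   # (x, z) position where it was seen
    when: int                    # timestep of the observation


class PlaceEventMemory:
    """Toy what-where-when store keyed by grid cell (an assumption)."""

    def __init__(self, cell_size: float = 16.0):
        self.cell_size = cell_size
        self.places = defaultdict(list)   # cell -> list of Event
        self.visits = defaultdict(int)    # cell -> visit count

    def cell(self, pos):
        return (int(pos[0] // self.cell_size), int(pos[1] // self.cell_size))

    def observe(self, pos, t, labels):
        """Count the visit and store only events that are novel for this place."""
        c = self.cell(pos)
        self.visits[c] += 1
        known = {e.what for e in self.places[c]}
        for label in labels:
            if label not in known:
                self.places[c].append(Event(label, pos, t))
                known.add(label)

    def recall(self, what):
        """Return the most recent stored event with this label, or None."""
        hits = [e for events in self.places.values() for e in events if e.what == what]
        return max(hits, key=lambda e: e.when, default=None)

    def least_visited_neighbor(self, pos):
        """Count-based goal selection: pick the neighboring cell seen least often."""
        cx, cz = self.cell(pos)
        neighbors = [(cx + dx, cz + dz)
                     for dx in (-1, 0, 1) for dz in (-1, 0, 1)
                     if (dx, dz) != (0, 0)]
        return min(neighbors, key=lambda c: self.visits.get(c, 0))


def choose_mode(memory, task_target, pos):
    """Execute if the target is remembered; otherwise explore a rarely visited cell."""
    hit = memory.recall(task_target)
    if hit is not None:
        return ("execute", hit.where)
    cx, cz = memory.least_visited_neighbor(pos)
    center = ((cx + 0.5) * memory.cell_size, (cz + 0.5) * memory.cell_size)
    return ("explore", center)


mem = PlaceEventMemory(cell_size=16.0)
mem.observe((5.0, 5.0), 0, ["tree"])
mem.observe((20.0, 5.0), 1, ["cow"])
mem.observe((6.0, 6.0), 2, ["tree"])   # same place, same event: not stored again
print(choose_mode(mem, "cow", (6.0, 6.0)))
print(choose_mode(mem, "water", (6.0, 6.0)))
```

In this sketch the returned location would be handed to a navigator as its goal; in the system described above that role is played by VPT-Nav.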
-----
Key Insights 🧠:
→ Episodic memory is needed in low-level controllers, not only in high-level planners
→ Hierarchical memory organization is more efficient than FIFO for long-horizon tasks
→ Goal-conditioned navigation significantly improves exploration efficiency
-----
Results 📊:
→ Map Coverage reaches 84.42%, versus 50.77% for STEVE-1
→ Revisit Count drops to 0.38, versus 2.68 for STEVE-1
→ Demonstrated superior navigation in complex terrains with rivers and mountains