
"Mr.Steve: Instruction-Following Agents in Minecraft with What-Where-When Memory"

The podcast on this paper is generated with Google's Illuminate.

Teaching Minecraft agents to remember and navigate like humans do.

A memory-augmented controller that actually remembers where it saw resources

Mr.Steve augments Minecraft agents with episodic memory so they can remember and revisit important locations.

https://arxiv.org/abs/2411.06736

Original Problem 🎯:

STEVE-1, a widely used Minecraft controller, can only remember a few seconds of gameplay because of Transformer-XL's limited memory. Agents therefore keep re-exploring areas they have already visited, which makes task completion inefficient in environments where resources are sparse.

-----

Solution in this Paper 🛠️:

→ Mr.Steve introduces Place Event Memory (PEM), which stores what-where-when information about visited locations and the events observed there.

→ The system uses a Memory Module to store novel events and a Solver Module that switches between exploration and execution modes.

→ A hierarchical exploration strategy pairs count-based high-level goal selection with VPT-Nav low-level navigation to improve area coverage.

→ The agent uses a goal-conditioned VPT Navigator (VPT-Nav), fine-tuned from VPT using PPO, for precise navigation in complex terrain.
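
The mode switching and count-based goal selection described above can be illustrated with a small Python sketch. It is a toy under stated assumptions, not the paper's implementation: the grid cell size, the 8-neighbour candidate set, the dictionary standing in for memory, and the names `to_cell`, `pick_exploration_goal`, and `solver_step` are all hypothetical. The low-level navigator (VPT-Nav in the paper) is left out; the sketch only decides where to go next.

```python
from collections import Counter
from typing import Dict, Tuple

Cell = Tuple[int, int]


def to_cell(x: float, z: float, cell_size: float = 16.0) -> Cell:
    """Map a world position to a coarse grid cell (cell_size is an assumed value)."""
    return (int(x // cell_size), int(z // cell_size))


def pick_exploration_goal(visits: Counter, current: Cell) -> Cell:
    """Count-based selection: choose the least-visited neighbouring cell.

    Assumption: candidates are the 8 neighbours of the current cell.
    Ties go to the first candidate in iteration order.
    """
    cx, cz = current
    neighbours = [
        (cx + dx, cz + dz)
        for dx in (-1, 0, 1)
        for dz in (-1, 0, 1)
        if (dx, dz) != (0, 0)
    ]
    return min(neighbours, key=lambda cell: visits[cell])


def solver_step(
    task_item: str,
    memory: Dict[str, Cell],
    visits: Counter,
    position: Tuple[float, float],
) -> Tuple[str, Cell]:
    """Return (mode, goal_cell) for one high-level decision.

    If the task-relevant item is already in memory, switch to execute mode
    and head for the remembered cell; otherwise keep exploring.
    """
    here = to_cell(*position)
    visits[here] += 1  # record that this cell has been visited
    if task_item in memory:
        return ("execute", memory[task_item])
    return ("explore", pick_exploration_goal(visits, here))
```

In this sketch a remembered item short-circuits exploration entirely, which is the behaviour the summary attributes to the Solver Module; how the real system scores or retrieves memories is not covered here.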

-----

Key Insights 🧠:

→ Episodic memory is needed in low-level controllers too, not only in high-level planners

→ Hierarchical memory organization is more efficient than FIFO for long-horizon tasks

→ Goal-conditioned navigation significantly improves exploration efficiency
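
To make the FIFO comparison in the insights above concrete, here is a minimal Python sketch contrasting a fixed-size FIFO buffer with a memory grouped by place. It is a simplified stand-in, not the paper's Place Event Memory: the classes `FifoMemory` and `PlaceMemory`, the per-place cap, and the choice to treat events with the same label as duplicates are all assumptions made for illustration.

```python
from collections import defaultdict, deque
from dataclasses import dataclass
from typing import Deque, Dict, List, Tuple

Cell = Tuple[int, int]


@dataclass(frozen=True)
class Event:
    what: str   # event label, e.g. "cow"
    where: Cell  # coarse location
    when: int   # timestep


class FifoMemory:
    """Keeps only the most recent `capacity` events, regardless of content."""

    def __init__(self, capacity: int) -> None:
        self.buf: Deque[Event] = deque(maxlen=capacity)

    def write(self, event: Event) -> None:
        self.buf.append(event)  # oldest entry is dropped when full

    def recall(self, what: str) -> List[Event]:
        return [e for e in self.buf if e.what == what]


class PlaceMemory:
    """Groups events by place and keeps one entry per distinct event label.

    Assumption: a repeated label at the same place is not novel, so it only
    refreshes the stored entry instead of consuming a new slot.
    """

    def __init__(self, per_place: int) -> None:
        self.per_place = per_place
        self.places: Dict[Cell, Dict[str, Event]] = defaultdict(dict)

    def write(self, event: Event) -> None:
        slot = self.places[event.where]
        if event.what in slot or len(slot) < self.per_place:
            slot[event.what] = event  # keep the latest sighting

    def recall(self, what: str) -> List[Event]:
        return [slot[what] for slot in self.places.values() if what in slot]


def replay(memory, events: List[Event]) -> None:
    """Feed a stream of events into either memory type."""
    for event in events:
        memory.write(event)
```

Feeding both memories one early "cow" sighting followed by a long run of identical "grass" events at another place shows the difference: the FIFO buffer eventually evicts the cow, while the place-grouped memory still holds it because the repeated grass events occupy a single slot.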

-----

Results 📊:

→ Map coverage rose to 84.42%, versus 50.77% for STEVE-1

→ Revisit count fell to 0.38, versus 2.68 for STEVE-1

→ Demonstrated superior navigation in complex terrains with rivers and mountains
