A robot memory system that updates itself when objects move or disappear.
Real-time environment tracking.
DynaMem, proposed in this paper, enables robots to handle moving objects by maintaining a dynamic 3D memory of their environment.
https://arxiv.org/abs/2411.04999
🎯 Original Problem:
Most current open-vocabulary mobile manipulation systems assume a static environment, which severely limits their applicability in real-world settings where scenes constantly change due to human intervention or the robot's own actions.
-----
🔧 Solution in this Paper:
→ DynaMem introduces a dynamic spatio-semantic memory that adapts to changing environments in real time
→ It maintains a voxelized point cloud storing, for each voxel, its 3D location, observation count, source image ID, semantic feature, and last-seen timestamp (sketched in code after this list)
→ Uses a hybrid approach combining Vision-Language Model features and multimodal LLM verification for object localization (second sketch below)
→ Implements ray-casting to identify and remove outdated voxels when objects move or disappear (third sketch below)
→ Features a value-based exploration system prioritizing least-recently-seen areas and semantic similarity (final sketch below)
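To make the memory concrete, here is a minimal Python sketch of what such a voxel store could look like. The class name, field names, 512-dim feature size, and the `add_points` helper are illustrative assumptions, not the paper's actual implementation; merging duplicate voxels is omitted for brevity.

```python
from dataclasses import dataclass, field
import numpy as np

@dataclass
class VoxelMemory:
    voxel_size: float = 0.05  # voxel edge length in meters (assumed value)
    xyz: np.ndarray = field(default_factory=lambda: np.empty((0, 3)))             # voxel centers
    counts: np.ndarray = field(default_factory=lambda: np.empty(0, dtype=int))    # observation counts
    image_ids: np.ndarray = field(default_factory=lambda: np.empty(0, dtype=int)) # source image per voxel
    features: np.ndarray = field(default_factory=lambda: np.empty((0, 512)))      # semantic features (512-dim assumed)
    timestamps: np.ndarray = field(default_factory=lambda: np.empty(0))           # last-seen time per voxel

    def add_points(self, points, feats, image_id, t):
        """Snap newly observed 3D points to the voxel grid and append them."""
        centers = np.round(points / self.voxel_size) * self.voxel_size
        n = len(centers)
        self.xyz = np.vstack([self.xyz, centers])
        self.counts = np.concatenate([self.counts, np.ones(n, dtype=int)])
        self.image_ids = np.concatenate([self.image_ids, np.full(n, image_id)])
        self.features = np.vstack([self.features, feats])
        self.timestamps = np.concatenate([self.timestamps, np.full(n, float(t))])
```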
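Querying this memory could then work roughly as follows: score every voxel's feature against the text embedding of the query, and fall back to a multimodal LLM over the stored source images when no match is confident. The `sim_threshold` value and the `mllm_fallback` hook are placeholders, not the paper's parameters.

```python
import numpy as np

def localize(memory, text_feature, sim_threshold=0.28, mllm_fallback=None):
    """Hybrid localization sketch over a VoxelMemory as defined above."""
    if len(memory.xyz) == 0:
        return None
    # Cosine similarity between the text query and every voxel's feature.
    feats = memory.features / np.linalg.norm(memory.features, axis=1, keepdims=True)
    query = text_feature / np.linalg.norm(text_feature)
    sims = feats @ query
    best = int(np.argmax(sims))
    if sims[best] >= sim_threshold:
        return memory.xyz[best]  # confident VLM-feature match
    if mllm_fallback is not None:
        # Low confidence: hand the stored source images to a multimodal LLM
        # and let it verify or pick the object instead.
        return mllm_fallback(memory)
    return None
```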
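The ray-casting cleanup can be approximated by projecting stored voxels into each new depth frame: a voxel that lies in view but in front of the measured surface should have been observed, so it is treated as moved or removed. This simplified sketch deletes such voxels immediately rather than tracking counts; the `margin` tolerance is an assumed value.

```python
import numpy as np

def cull_stale_voxels(memory, depth, K, cam_pose, margin=0.1):
    """Remove voxels contradicted by a new depth image (sketch)."""
    # Transform voxel centers from the world frame into the camera frame.
    world_to_cam = np.linalg.inv(cam_pose)
    pts = (world_to_cam[:3, :3] @ memory.xyz.T + world_to_cam[:3, 3:4]).T
    z = pts[:, 2]
    z_safe = np.where(z > 1e-6, z, 1.0)  # avoid dividing by zero
    uv = (K @ pts.T).T                   # pinhole projection
    u = np.round(uv[:, 0] / z_safe).astype(int)
    v = np.round(uv[:, 1] / z_safe).astype(int)
    h, w = depth.shape
    in_view = (z > 1e-6) & (u >= 0) & (u < w) & (v >= 0) & (v < h)
    # A voxel is stale if the camera measured depth *beyond* it: the ray
    # passed through the voxel's location without hitting anything there.
    stale = np.zeros(len(z), dtype=bool)
    stale[in_view] = depth[v[in_view], u[in_view]] > z[in_view] + margin
    keep = ~stale
    memory.xyz, memory.counts = memory.xyz[keep], memory.counts[keep]
    memory.image_ids = memory.image_ids[keep]
    memory.features, memory.timestamps = memory.features[keep], memory.timestamps[keep]
```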
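Finally, a rough version of the value-based exploration objective, combining staleness with query similarity so the robot heads toward least-recently-seen yet query-relevant regions. The `alpha`/`beta` weights are illustrative, not the paper's tuned values.

```python
import numpy as np

def pick_exploration_target(memory, text_feature, now, alpha=1.0, beta=10.0):
    """Score voxels by staleness + semantic relevance; return the best location."""
    staleness = now - memory.timestamps  # time since each voxel was last seen
    feats = memory.features / np.linalg.norm(memory.features, axis=1, keepdims=True)
    query = text_feature / np.linalg.norm(text_feature)
    sims = feats @ query
    # High value = long unseen AND semantically relevant to the query.
    value = alpha * staleness + beta * sims
    return memory.xyz[int(np.argmax(value))]
```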
-----
💡 Key Insights:
→ Static environment assumptions severely limit real-world robot deployment
→ Combining VLM features with mLLM verification provides robust object detection
→ Dynamic memory updating is crucial for maintaining accurate environmental representation
→ Exploration strategies need to balance temporal recency against semantic relevance
-----
📊 Results:
→ 70% average pick-and-drop success rate on non-stationary objects
→ More than 2x improvement over static baseline systems (30% success rate)
→ Only 6.7% navigation failures for dynamic objects vs 53.3% for baseline
→ Successfully deployed in both lab and home environments