The paper proposes OGBench, a standardized benchmark for assessing the performance of offline goal-conditioned reinforcement learning (GCRL) algorithms.
📚 https://arxiv.org/abs/2410.20092
Original Problem 🎯:
Offline goal-conditioned reinforcement learning (GCRL) lacks a standardized benchmark for evaluating how well algorithms handle challenges such as stitching behaviors, long-horizon reasoning, and stochastic environments.
-----
Solution in this Paper 🛠️:
• Introduces OGBench: A comprehensive benchmark with:
- 8 environment types
- 85 datasets
- 6 reference GCRL algorithm implementations
- Tasks designed to test stitching, long-horizon reasoning, and stochasticity
• Key Components:
- Locomotion tasks: PointMaze, AntMaze, HumanoidMaze, AntSoccer
- Manipulation tasks: Cube, Scene, Puzzle
- Drawing tasks: Powderworld
- Support for both state and pixel-based observations
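As a quick sanity check on the component list above, the environment families can be written down as a small lookup table. This is only an illustrative sketch: the environment names come from the list above, but the dictionary layout and the helper names `count_environment_types` and `category_of` are assumptions made here, not part of OGBench's API.

```python
# Environment names are taken from the summary above.
# The grouping structure and helper names are hypothetical, for illustration only.
ENVIRONMENTS = {
    "locomotion": ["PointMaze", "AntMaze", "HumanoidMaze", "AntSoccer"],
    "manipulation": ["Cube", "Scene", "Puzzle"],
    "drawing": ["Powderworld"],
}


def count_environment_types(taxonomy):
    """Total number of environment types across all categories."""
    return sum(len(names) for names in taxonomy.values())


def category_of(env_name, taxonomy=ENVIRONMENTS):
    """Return the category an environment belongs to."""
    for category, names in taxonomy.items():
        if env_name in names:
            return category
    raise KeyError(f"unknown environment: {env_name}")


if __name__ == "__main__":
    print(count_environment_types(ENVIRONMENTS))
    print(category_of("AntSoccer"))
```

The four locomotion, three manipulation, and one drawing environment add up to the 8 environment types quoted above.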
-----
Key Insights 💡:
• No single method dominates across all categories
• HIQL performs strongly in locomotion and visual manipulation
• CRL excels at locomotion tasks
• GCIQL shows strength in manipulation tasks
• Different methods show distinct capabilities in handling stochasticity and stitching
-----
Results 📊:
• HIQL achieves up to 96% success on AntMaze navigation
• GCIQL reaches 95% success on puzzle manipulation
• CRL reaches 94% success on visual locomotion
• Methods show clear performance differences across tasks, providing effective research signals
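To make the percentages above concrete, here is a minimal sketch of how a success rate for a goal-conditioned task could be computed: the fraction of evaluation episodes that reach the commanded goal, averaged over goals. This is not OGBench's evaluation code. The function names, the per-goal averaging, and the example outcomes are all assumptions chosen for illustration.

```python
from statistics import mean


def success_rate(outcomes):
    """Percentage of episodes that reached the goal.

    `outcomes` is a list of booleans, one per evaluation episode.
    """
    if not outcomes:
        raise ValueError("need at least one episode")
    return 100.0 * sum(bool(o) for o in outcomes) / len(outcomes)


def task_success(per_goal_outcomes):
    """Average the per-goal success rates for one task (an assumed protocol)."""
    return mean(success_rate(o) for o in per_goal_outcomes.values())


if __name__ == "__main__":
    # Hypothetical outcomes for two evaluation goals, four episodes each.
    example = {
        "goal_1": [True, True, False, True],
        "goal_2": [True, False, False, True],
    }
    print(task_success(example))  # 62.5
```

Here `goal_1` succeeds in 3 of 4 episodes (75%) and `goal_2` in 2 of 4 (50%), so the task-level figure is 62.5%.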