EarthView combines 15 trillion pixels of satellite data to create better Earth monitoring models. 🌍
This paper introduces EarthView, a 15 tera-pixel dataset that combines satellite imagery from multiple sources and is designed specifically for self-supervised learning in remote sensing. It also introduces EarthMAE, a modified Masked Autoencoder architecture optimized for Earth monitoring tasks.
→ Covers 437,682 km² with 2,967,663 high-resolution patches
https://arxiv.org/abs/2501.08111
This Paper 🛠️:
→ EarthView combines data from three major sources: Satellogic (1m resolution), NEON (hyperspectral), and Sentinel (multi-spectral).
→ The architecture incorporates distinct tokenizers for different data sources and special encodings for temporal and source information.
→ Dataset spans 5 years (2017-2022) with multiple revisits per location, enabling temporal analysis.
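To make the "distinct tokenizers plus temporal and source encodings" idea concrete, here is a minimal NumPy sketch. It is not the EarthMAE implementation: the band counts, patch size, embedding width, random (rather than trained) weights, and the names `tokenize`, `SOURCE_BANDS`, and `MAX_REVISITS` are all assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
EMBED_DIM = 32      # assumed token width
PATCH = 4           # assumed patch side length in pixels
MAX_REVISITS = 8    # assumed maximum number of revisits per location

# Hypothetical band counts per source, for illustration only.
SOURCE_BANDS = {"satellogic": 4, "sentinel": 12, "neon": 64}
SOURCE_IDS = {name: i for i, name in enumerate(SOURCE_BANDS)}

# One linear patch projection per source, since band counts differ.
tokenizers = {
    name: rng.normal(0, 0.02, size=(bands * PATCH * PATCH, EMBED_DIM))
    for name, bands in SOURCE_BANDS.items()
}
# Embedding tables that mark where a token came from and when.
source_embed = rng.normal(0, 0.02, size=(len(SOURCE_BANDS), EMBED_DIM))
time_embed = rng.normal(0, 0.02, size=(MAX_REVISITS, EMBED_DIM))


def tokenize(image, source, time_index):
    """Turn a (bands, H, W) image into (num_patches, EMBED_DIM) tokens."""
    bands, h, w = image.shape
    if bands != SOURCE_BANDS[source] or h % PATCH or w % PATCH:
        raise ValueError("unexpected band count or image size")
    # Cut the image into non-overlapping PATCH x PATCH blocks and flatten each.
    blocks = image.reshape(bands, h // PATCH, PATCH, w // PATCH, PATCH)
    blocks = blocks.transpose(1, 3, 0, 2, 4).reshape(-1, bands * PATCH * PATCH)
    # Source-specific projection, then add source and temporal encodings.
    tokens = blocks @ tokenizers[source]
    return tokens + source_embed[SOURCE_IDS[source]] + time_embed[time_index]


# Tokens from different sources share one width, so they can be concatenated.
sat_tokens = tokenize(rng.normal(size=(4, 16, 16)), "satellogic", time_index=0)
sen_tokens = tokenize(rng.normal(size=(12, 8, 8)), "sentinel", time_index=1)
sequence = np.concatenate([sat_tokens, sen_tokens], axis=0)
```

Because every source is projected to the same token width, a single encoder can attend over the mixed sequence while the added encodings tell it which sensor and which revisit each token belongs to.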
-----
Key Insights 💡:
→ Tube masking consistently outperforms random masking across tasks
→ Including temporal information significantly improves model performance
→ Combined Sentinel-Satellogic data yields better results than individual sources
→ Achieves consistent performance improvement over models pre-trained on Sentinel data alone
→ Shows linear performance scaling with dataset size, indicating room for further improvements
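The difference between tube masking and random masking can be sketched in a few lines of NumPy. This is an illustrative sketch rather than the paper's code: the function names, the flat patch indexing, and the 0.75 mask ratio are assumptions. Random masking hides a different set of patches at every revisit, while tube masking hides the same spatial patches across all revisits, so the model cannot simply copy a masked patch from another time step.

```python
import numpy as np


def random_mask(num_frames, num_patches, mask_ratio, rng):
    """Mask an independently drawn set of patches in every frame."""
    num_masked = int(round(mask_ratio * num_patches))
    mask = np.zeros((num_frames, num_patches), dtype=bool)
    for t in range(num_frames):
        idx = rng.choice(num_patches, size=num_masked, replace=False)
        mask[t, idx] = True
    return mask


def tube_mask(num_frames, num_patches, mask_ratio, rng):
    """Mask the same spatial patches in every frame (a 'tube' through time)."""
    num_masked = int(round(mask_ratio * num_patches))
    idx = rng.choice(num_patches, size=num_masked, replace=False)
    frame_mask = np.zeros(num_patches, dtype=bool)
    frame_mask[idx] = True
    # Repeat the single-frame mask along the time axis.
    return np.tile(frame_mask, (num_frames, 1))


rng = np.random.default_rng(0)
# True marks a masked patch; rows are revisits, columns are spatial patches.
tube = tube_mask(num_frames=4, num_patches=16, mask_ratio=0.75, rng=rng)
rand = random_mask(num_frames=4, num_patches=16, mask_ratio=0.75, rng=rng)
```

Both masks hide the same fraction of patches per frame; only the alignment across time differs.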