Student-teacher framework that extends aerial detection beyond a fixed set of object categories
CastDet, proposed in this paper, enables aerial object detection without predefined categories using student-teacher learning
https://arxiv.org/abs/2411.02057
🎯 Original Problem:
Current aerial object detection systems can only detect predefined categories and need extensive labeled data. They struggle with novel objects and cope poorly with the diverse object orientations found in aerial imagery.
-----
🔍 Solution in this Paper:
→ CastDet: A student-teacher framework combining three key components:
- Student model for both horizontal and oriented object detection
- Localization teacher generating high-quality object proposals
- RemoteCLIP as external teacher providing classification knowledge
→ Uses a dynamic label queue to maintain and update pseudo-labels during training
→ Implements specialized box selection strategies considering scale and orientation
→ Extends framework to handle oriented object detection with tailored algorithms
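To make the dynamic label queue idea concrete, here is a minimal sketch of a bounded, per-category store for pseudo-labels. It is an illustration under assumptions, not the paper's implementation: the class name `DynamicLabelQueue`, the `max_per_category` and `min_score` parameters, the confidence filter, and the `(cx, cy, w, h, angle)` box layout are all hypothetical.

```python
from collections import defaultdict, deque
import random


class DynamicLabelQueue:
    """Toy per-category FIFO store for pseudo-labels (hypothetical design)."""

    def __init__(self, max_per_category=4, min_score=0.8):
        self.max_per_category = max_per_category
        self.min_score = min_score
        # deque(maxlen=...) evicts the oldest entry once the queue is full,
        # so each category keeps only its most recent pseudo-labels.
        self.queues = defaultdict(lambda: deque(maxlen=max_per_category))

    def push(self, category, box, score):
        """Store a pseudo-label only if its confidence passes the threshold."""
        if score < self.min_score:
            return False
        self.queues[category].append((box, score))
        return True

    def sample(self, category, k, rng=random):
        """Draw up to k stored pseudo-labels for one category."""
        items = list(self.queues.get(category, ()))
        return rng.sample(items, min(k, len(items)))

    def __len__(self):
        return sum(len(q) for q in self.queues.values())


# Assumed oriented-box layout: (cx, cy, w, h, angle_in_radians).
queue = DynamicLabelQueue(max_per_category=2, min_score=0.8)
queue.push("helipad", (10.0, 20.0, 5.0, 5.0, 0.3), 0.95)   # kept
queue.push("helipad", (11.0, 21.0, 5.0, 5.0, 0.1), 0.50)   # rejected: low score
queue.push("helipad", (30.0, 40.0, 6.0, 6.0, 1.2), 0.90)   # kept
queue.push("helipad", (50.0, 60.0, 4.0, 4.0, 0.0), 0.85)   # kept, evicts oldest
batch = queue.sample("helipad", k=2)
```

The bounded FIFO is one simple way to keep pseudo-labels fresh as the teacher improves; the paper may use a different update rule.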
-----
💡 Key Insights:
→ First work to tackle open-vocabulary aerial object detection with orientation handling
→ Novel box selection strategies improve pseudo-label quality
→ Dynamic label queue mechanism enhances training stability
→ Integration of RemoteCLIP provides aerial-specific knowledge
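As a rough illustration of how a CLIP-style external teacher can label a region without a fixed category list, the sketch below scores a region embedding against text embeddings of candidate class names using cosine similarity and a softmax. Everything here is an assumption for illustration: the 3-dimensional vectors are made-up toy values, `classify_region` and `temperature` are hypothetical names, and no actual RemoteCLIP API is called.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def classify_region(region_emb, text_embs, temperature=0.01):
    """Return (best_label, probabilities) via softmax over scaled similarities."""
    labels = list(text_embs)
    logits = [cosine(region_emb, text_embs[name]) / temperature for name in labels]
    top = max(logits)  # subtract the max logit for numerical stability
    exps = [math.exp(x - top) for x in logits]
    total = sum(exps)
    probs = {name: e / total for name, e in zip(labels, exps)}
    return max(probs, key=probs.get), probs


# Toy embeddings standing in for encoded class-name prompts (made-up values).
text_embs = {
    "airplane": [0.9, 0.1, 0.0],
    "ship": [0.1, 0.9, 0.1],
    "helipad": [0.0, 0.2, 0.9],
}
# Toy embedding standing in for an encoded image crop (made-up values).
region_emb = [0.1, 0.3, 0.8]

label, probs = classify_region(region_emb, text_embs)
print(label, round(probs[label], 3))
```

Adding a new category in this setup only means adding another entry to `text_embs`, which is what makes the vocabulary open.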
-----
📊 Results:
→ Evaluated on multiple aerial detection datasets, where it shows significant improvements
→ First benchmark established for open-vocabulary oriented aerial detection
→ Outperforms baseline approaches in both horizontal and oriented detection tasks