A person re-identification system that generalizes better to new environments
ReMix trains jointly on multi-camera and single-camera data so that person re-identification holds up in unseen environments
📚 https://arxiv.org/abs/2410.21938
🎯 Original Problem:
Person re-identification (Re-ID) systems struggle with generalization across different environments due to limited multi-camera training data. Current methods show significant accuracy drops when environments change, making them impractical for real-world applications.
-----
🔧 Solution in this Paper:
→ ReMix: A novel joint training approach combining limited labeled multi-camera data with large unlabeled single-camera data
→ Key Components:
- Novel data sampling strategy for efficient pseudo labeling
- Momentum encoder architecture with exponential moving average weight updates
- Specialized loss functions:
* Instance Loss: Pulls an image toward its positive instances and pushes it away from negative instances
* Augmentation Loss: Maintains consistency across image variations
* Centroids Loss: Works with both labeled and pseudo-labeled data
* Camera Centroids Loss: Specific to multi-camera data
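The momentum encoder's exponential moving average update can be sketched as below. This is a minimal illustration, not the paper's code: weights are plain lists of floats, and the function name `ema_update` and the decay value are assumptions chosen for the example.

```python
def ema_update(momentum_weights, encoder_weights, decay=0.999):
    """Return new momentum-encoder weights as an exponential moving average.

    Each momentum weight keeps `decay` of its old value and takes
    `1 - decay` from the matching weight of the trained encoder.
    The default decay of 0.999 is an illustrative assumption.
    """
    return [
        decay * m + (1.0 - decay) * e
        for m, e in zip(momentum_weights, encoder_weights)
    ]


# Hypothetical usage: the momentum copy drifts slowly toward the encoder.
momentum = [1.0, 0.0]
encoder = [0.0, 1.0]
for _ in range(3):
    momentum = ema_update(momentum, encoder, decay=0.9)
```

Because the momentum weights change slowly, the features they produce vary less from step to step than those of the encoder being trained by gradient descent.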
-----
💡 Key Insights:
→ Single-camera data, though simpler, provides valuable diversity for training
→ Joint training outperforms pure self-supervised pre-training
→ Multi-camera and single-camera data differ in difficulty, so each gets its own temperature parameter in the losses
→ Momentum encoder stabilizes training with unlabeled data
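To see why the temperature matters, here is a generic InfoNCE-style contrastive loss in plain Python. It is a sketch under assumptions, not the paper's exact loss: the function name `contrastive_loss`, the similarity values, and the temperatures are all hypothetical.

```python
import math


def contrastive_loss(sim_pos, sim_negs, temperature):
    """Generic InfoNCE-style loss for one anchor (illustrative only).

    sim_pos:     similarity between the anchor and its positive
    sim_negs:    similarities between the anchor and its negatives
    temperature: scaling factor; smaller values sharpen the softmax
    """
    logits = [sim_pos / temperature] + [s / temperature for s in sim_negs]
    # Log-sum-exp with the max subtracted for numerical stability.
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(l - m) for l in logits))
    # Negative log-probability assigned to the positive.
    return log_sum - logits[0]


# Hypothetical similarities; the same inputs scored at two temperatures.
sharp = contrastive_loss(0.9, [0.1, 0.2], temperature=0.1)
soft = contrastive_loss(0.9, [0.1, 0.2], temperature=1.0)
```

A lower temperature amplifies the gap between similarities, so how strongly hard examples dominate the loss depends on it. That is one reason to tune it separately for data sources of different difficulty.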
-----
📊 Results:
→ Outperforms prior state-of-the-art methods in cross-dataset evaluation (tested on a dataset not used for training):
- Market-1501: 84.0% Rank-1, 61.0% mAP
- DukeMTMC-reID: 77.6% Rank-1, 61.6% mAP
→ Achieves better generalization without complex architectures
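For readers unfamiliar with the two metrics reported above, the sketch below computes Rank-1 and mAP from ranked gallery lists. It is a simplified illustration: the function names and toy identities are assumptions, and it omits the filtering of same-camera gallery images that standard Re-ID evaluation protocols apply.

```python
def rank1_and_ap(ranked_ids, query_id):
    """Score one query.

    ranked_ids: gallery identities sorted from most to least similar.
    Returns (rank1, average_precision).
    """
    rank1 = 1.0 if ranked_ids and ranked_ids[0] == query_id else 0.0
    hits = 0
    precision_sum = 0.0
    for position, gallery_id in enumerate(ranked_ids, start=1):
        if gallery_id == query_id:
            hits += 1
            # Precision measured at each position holding a correct match.
            precision_sum += hits / position
    ap = precision_sum / hits if hits else 0.0
    return rank1, ap


def evaluate(rankings):
    """Average Rank-1 and AP over (query_id, ranked_ids) pairs."""
    scores = [rank1_and_ap(ranked, query) for query, ranked in rankings]
    n = len(scores)
    return sum(s[0] for s in scores) / n, sum(s[1] for s in scores) / n


# Toy example with made-up identities.
rank1, mean_ap = evaluate([
    ("a", ["a", "b", "a"]),
    ("a", ["b", "a"]),
])
```

Rank-1 only asks whether the top result is correct, while mAP rewards placing every correct match high in the list. This is why the two numbers can differ widely on the same dataset.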