
"ReMix: Training Generalized Person Re-identification on a Mixture of Data"

The podcast on this paper is generated with Google's Illuminate.

A person re-identification system that doesn't fail in new environments.

ReMix combines multi-camera and single-camera data to make person re-identification work anywhere

📚 https://arxiv.org/abs/2410.21938

🎯 Original Problem:

Person re-identification (Re-ID) systems struggle with generalization across different environments due to limited multi-camera training data. Current methods show significant accuracy drops when environments change, making them impractical for real-world applications.

-----

🔧 Solution in this Paper:

→ ReMix: A novel joint training approach combining limited labeled multi-camera data with large unlabeled single-camera data

→ Key Components:

- Novel data sampling strategy for efficient pseudo labeling

- Momentum encoder architecture with exponential moving average weight updates

- Specialized loss functions:

* Instance Loss: Handles positive/negative instance relationships

* Augmentation Loss: Maintains consistency across image variations

* Centroids Loss: Works with both labeled and pseudo-labeled data

* Camera Centroids Loss: Specific to multi-camera data
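
To make the momentum encoder component concrete, here is a minimal sketch of an exponential moving average (EMA) weight update. The function name `ema_update`, the flat list-of-floats weight representation, and the momentum value `m=0.999` are assumptions for illustration, not details taken from the paper.

```python
def ema_update(momentum_weights, encoder_weights, m=0.999):
    """Return the momentum-encoder weights after one EMA step.

    Each momentum weight keeps a fraction m of its old value and moves
    a fraction (1 - m) toward the corresponding online-encoder weight.
    The default m is a hypothetical value, not one reported in the paper.
    """
    return [
        m * w_momentum + (1.0 - m) * w_encoder
        for w_momentum, w_encoder in zip(momentum_weights, encoder_weights)
    ]


# Toy usage: with m = 0.9 the momentum weights move 10% of the way
# toward the online encoder's weights.
updated = ema_update([1.0, 0.0], [0.0, 1.0], m=0.9)
```

Because the momentum weights change slowly, the features they produce drift less between training steps than those of the online encoder, which is the usual motivation for this kind of update.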

-----

💡 Key Insights:

→ Single-camera data, though simpler, provides valuable diversity for training

→ Joint training outperforms pure self-supervised pre-training

→ Different complexities of multi/single-camera data require specialized temperature parameters

→ Momentum encoder stabilizes training with unlabeled data
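
The role of a temperature parameter can be illustrated with a generic InfoNCE-style loss against identity centroids. This is a stand-in sketch: the summary above does not give the exact form of the ReMix losses, and the function name `contrastive_loss`, the toy vectors, and the two temperature values are all hypothetical.

```python
import math


def contrastive_loss(query, centroids, positive_index, temperature):
    """InfoNCE-style loss: -log softmax(similarity / temperature)[positive_index].

    Similarity is a plain dot product between the query embedding and
    each centroid; this is a generic formulation, not the paper's exact loss.
    """
    logits = [
        sum(q * c for q, c in zip(query, centroid)) / temperature
        for centroid in centroids
    ]
    max_logit = max(logits)  # subtract the max for numerical stability
    log_sum_exp = max_logit + math.log(
        sum(math.exp(logit - max_logit) for logit in logits)
    )
    return log_sum_exp - logits[positive_index]


# Toy example: the query is closer to centroid 0 (its assumed identity).
query = [1.0, 0.0]
centroids = [[0.8, 0.6], [0.0, 1.0]]

loss_soft = contrastive_loss(query, centroids, positive_index=0, temperature=1.0)
loss_sharp = contrastive_loss(query, centroids, positive_index=0, temperature=0.1)
```

A lower temperature sharpens the softmax, so the same similarity gap yields a smaller loss for a correctly ranked sample and a larger penalty for a misranked one. Giving each data source its own temperature therefore sets how strongly each source's samples are pushed apart.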

-----

📊 Results:

→ Outperforms state-of-the-art methods in cross-dataset scenarios:

- Market-1501: 84.0% Rank-1, 61.0% mAP

- DukeMTMC-reID: 77.6% Rank-1, 61.6% mAP

→ Achieves better generalization without complex architectures
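
For readers unfamiliar with the two metrics above, the following sketch shows how Rank-1 accuracy and mean average precision (mAP) are commonly computed from ranked retrieval results. The function name `rank1_and_map` and the input layout are assumptions for illustration. Standard Re-ID protocols also exclude gallery images of the same identity taken by the query's own camera, which this sketch omits.

```python
def rank1_and_map(ranked_matches):
    """Compute Rank-1 accuracy and mAP.

    ranked_matches holds one list per query. Each list covers the gallery
    sorted by descending similarity to that query, with True marking a
    gallery image that shows the same identity as the query.
    """
    rank1_hits = 0
    average_precisions = []
    for matches in ranked_matches:
        rank1_hits += int(matches[0])  # is the top-ranked result correct?
        hits = 0
        precisions = []
        for rank, is_match in enumerate(matches, start=1):
            if is_match:
                hits += 1
                precisions.append(hits / rank)  # precision at each correct hit
        average_precisions.append(
            sum(precisions) / len(precisions) if precisions else 0.0
        )
    num_queries = len(ranked_matches)
    return rank1_hits / num_queries, sum(average_precisions) / num_queries


# Toy example with two queries and a three-image gallery.
rank1, mean_ap = rank1_and_map([
    [True, False, True],
    [False, True, False],
])
```

Rank-1 only checks the top result, while mAP rewards placing every correct match high in the ranking, which is why the two numbers can differ substantially on the same dataset.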
