Citation: | Cheng S Y, Chen Y. Camera-aware unsupervised person re-identification method guided by pseudo-label refinement[J]. Opto-Electron Eng, 2023, 50(12): 230239. doi: 10.12086/oee.2023.230239 |
[1] | Zhang X Y, Zhang B H, Lv X Q, et al. The joint discriminative and generative learning for person re-identification of deep dual attention[J]. Opto-Electron Eng, 2021, 48(5): 200388. doi: 10.12086/oee.2021.200388 |
[2] | Zhong Z, Zheng L, Luo Z M, et al. Invariance matters: exemplar memory for domain adaptive person re-identification[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019: 598–607. https://doi.org/10.1109/CVPR.2019.00069. |
[3] | Ge Y X, Zhu F, Chen D P, et al. Self-paced contrastive learning with hybrid memory for domain adaptive object re-ID[C]//Proceedings of the 34th International Conference on Neural Information Processing Systems, 2020: 949. https://doi.org/10.5555/3495724.3496673. |
[4] | Dai Z Z, Wang G Y, Yuan W H, et al. Cluster contrast for unsupervised person re-identification[C]//Proceedings of the 16th Asian Conference on Computer Vision, 2023: 319–337. https://doi.org/10.1007/978-3-031-26351-4_20. |
[5] | Tian J J, Tang Q H, Li R, et al. A camera identity-guided distribution consistency method for unsupervised multi-target domain person re-identification[J]. ACM Trans Intell Syst Technol, 2021, 12(4): 38. doi: 10.1145/3454130 |
[6] | Choi Y, Choi M, Kim M, et al. StarGAN: Unified generative adversarial networks for multi-domain image-to-image translation[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2018: 8789–8797. https://doi.org/10.1109/CVPR.2018.00916. |
[7] | Yang F X, Zhong Z, Luo Z M, et al. Joint noise-tolerant learning and meta camera shift adaptation for unsupervised person re-identification[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021: 4853–4862. https://doi.org/10.1109/CVPR46437.2021.00482. |
[8] | Xuan S Y, Zhang S L. Intra-inter camera similarity for unsupervised person re-identification[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021: 11921–11930. https://doi.org/10.1109/CVPR46437.2021.01175. |
[9] | Wang M L, Lai B S, Huang J Q, et al. Camera-aware proxies for unsupervised person re-identification[C]//Proceedings of the AAAI Conference on Artificial Intelligence, 2021: 2764–2772. https://doi.org/10.1609/aaai.v35i4.16381. |
[10] | Li X, Liang T F, Jin Y, et al. Camera-aware style separation and contrastive learning for unsupervised person re-identification[C]//2022 IEEE International Conference on Multimedia and Expo, 2022: 1–6. https://doi.org/10.1109/ICME52920.2022.9859842. |
[11] | Lee G, Lee S, Kim D, et al. Camera-driven representation learning for unsupervised domain adaptive person re-identification[Z]. arXiv: 2308.11901, 2023. https://doi.org/10.48550/arXiv.2308.11901. |
[12] | Ge Y X, Chen D P, Li H S. Mutual mean-teaching: pseudo label refinery for unsupervised domain adaptation on person re-identification[C]//8th International Conference on Learning Representations, 2020. |
[13] | Zhai Y P, Ye Q X, Lu S J, et al. Multiple expert brainstorming for domain adaptive person re-identification[C]//16th European Conference on Computer Vision, 2020: 594–611. https://doi.org/10.1007/978-3-030-58571-6_35. |
[14] | Zhang X, Ge Y X, Qiao Y, et al. Refining pseudo labels with clustering consensus over generations for unsupervised object re-identification[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021: 3435–3444. https://doi.org/10.1109/CVPR46437.2021.00344. |
[15] | Cho Y, Kim W J, Hong S, et al. Part-based pseudo label refinement for unsupervised person re-identification[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2022: 7298–7308. https://doi.org/10.1109/CVPR52688.2022.00716. |
[16] | Wu Y H, Huang T T, Yao H T, et al. Multi-centroid representation network for domain adaptive person re-ID[C]//Proceedings of the AAAI Conference on Artificial Intelligence, 2022: 2750–2758. https://doi.org/10.1609/aaai.v36i3.20178. |
[17] | Chen H, Lagadec B, Bremond F. ICE: inter-instance contrastive encoding for unsupervised person re-identification[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision, 2021: 14940–14949. https://doi.org/10.1109/ICCV48922.2021.01469. |
[18] | Lan L, Teng X, Zhang J, et al. Learning to purification for unsupervised person re-identification[J]. IEEE Trans Image Process, 2023, 33: 3338−3353. doi: 10.1109/TIP.2023.3278860 |
[19] | Chen Z Q, Cui Z C, Zhang C, et al. Dual clustering co-teaching with consistent sample mining for unsupervised person re-identification[J]. IEEE Trans Circuits Syst Video Technol, 2023, 33(10): 5908−5920. doi: 10.1109/TCSVT.2023.3261898 |
[20] | Pang Z Q, Zhao L L, Liu Q Y, et al. Camera invariant feature learning for unsupervised person re-identification[J]. IEEE Trans Multimed, 2023, 25: 6171−6182. doi: 10.1109/TMM.2022.3206662 |
[21] | Wang H J, Yang M, Liu J L, et al. Pseudo-label noise prevention, suppression and softening for unsupervised person re-identification[J]. IEEE Trans Inf Forensics Secur, 2023, 18: 3222−3237. doi: 10.1109/TIFS.2023.3277694 |
[22] | Li P N, Wu K Y, Zhou S P, et al. Pseudo labels refinement with intra-camera similarity for unsupervised person re-identification[C]//2023 IEEE International Conference on Image Processing, 2023: 366–370. https://doi.org/10.1109/ICIP49359.2023.10222317. |
[23] | Zheng L, Shen L Y, Tian L, et al. Scalable person re-identification: a benchmark[C]//Proceedings of the 2015 IEEE International Conference on Computer Vision, 2015: 1116–1124. https://doi.org/10.1109/ICCV.2015.133. |
[24] | Wei L H, Zhang S L, Gao W, et al. Person transfer GAN to bridge domain gap for person re-identification[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018: 79–88. https://doi.org/10.1109/CVPR.2018.00016. |
[25] | Sun X X, Zheng L. Dissecting person re-identification from the viewpoint of viewpoint[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019: 608–617. https://doi.org/10.1109/CVPR.2019.00070. |
[26] | Ester M, Kriegel H P, Sander J, et al. A density-based algorithm for discovering clusters in large spatial databases with noise[C]//Proceedings of the Second International Conference on Knowledge Discovery and Data Mining, 1996: 226–231. https://doi.org/10.5555/3001460.3001507. |
[27] | Zhou D Y, Bousquet O, Lal T N, et al. Learning with local and global consistency[C]//Proceedings of the 16th International Conference on Neural Information Processing Systems, 2003: 321–328. https://doi.org/10.5555/2981345.2981386. |
[28] | Deng J, Dong W, Socher R, et al. ImageNet: a large-scale hierarchical image database[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2009: 248–255. https://doi.org/10.1109/CVPR.2009.5206848. |
[29] | He K M, Zhang X Y, Ren S Q, et al. Deep residual learning for image recognition[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016: 770–778. https://doi.org/10.1109/CVPR.2016.90. |
[30] | Zhong Z, Zheng L, Cao D L, et al. Re-ranking person re-identification with k-reciprocal encoding[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017: 3652–3661. https://doi.org/10.1109/CVPR.2017.389. |
[31] | Zheng K C, Liu W, He L X, et al. Group-aware label transfer for domain adaptive person re-identification[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021: 5306–5315. https://doi.org/10.1109/CVPR46437.2021.00527. |
[32] | Li H, Zhang X W, Zhao X P, et al. Multi-label cooperative learning for cross domain person re-identification[J]. J Beijing Univ Aeronaut Astronaut, 2022, 48(8): 1534−1542. doi: 10.13700/j.bh.1001-5965.2021.0600 |
[33] | Liu Y X, Ge H W, Sun L, et al. Complementary attention-driven contrastive learning with hard-sample exploring for unsupervised domain adaptive person re-ID[J]. IEEE Trans Circuits Syst Video Technol, 2023, 33(1): 326−341. doi: 10.1109/TCSVT.2022.3200671 |
[34] | Chen L W, Ye F, Huang T Q, et al. An unsupervised person re-identification method based on intra-/inter-camera merger[J]. J Comput Res Dev, 2023, 60(2): 415−425. doi: 10.7544/issn1000-1239.202110732 |
[35] | Zhang H W, Zhang G Q, Chen Y H, et al. Global relation-aware contrast learning for unsupervised person re-identification[J]. IEEE Trans Circuits Syst Video Technol, 2022, 32(12): 8599−8610. doi: 10.1109/TCSVT.2022.3194084 |
[36] | Qian Y P, Wang F S, Xiong L. Unsupervised person re-identification method based on local refinement multi-branch and global feature sharing[J]. J Electron Meas Instrum, 2023, 37(1): 106−115. doi: 10.13382/j.jemi.B2205837 |
[37] | Peng J J, Jiang G Q, Wang H B. Adaptive memorization with group labels for unsupervised person re-identification[J]. IEEE Trans Circuits Syst Video Technol, 2023, 33(10): 5802−5813. doi: 10.1109/TCSVT.2023.3258917 |
[38] | Wang F, Liu H P. Understanding the behaviour of contrastive loss[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021: 2495–2504. https://doi.org/10.1109/CVPR46437.2021.00252. |
Unsupervised person re-identification has received increasing attention due to its wide practical applications. Most clustering-based contrastive learning methods treat each cluster as a pseudo-identity class and focus on enlarging inter-class differences while ignoring the intra-class differences caused by factors such as viewpoint, illumination, and background across different cameras. This makes it difficult for clustering algorithms to accurately group samples of the same identity into the same cluster, inevitably producing noisy pseudo-labels. Some methods introduce camera-aware contrastive learning, which divides each cluster into multiple camera-specific sub-clusters and computes intra-camera and inter-camera contrastive losses separately. However, pseudo-label noise may interfere with the selection of positive and negative samples in camera-aware contrastive learning and thereby mislead the model's learning. To address this issue, this paper proposes a camera-aware unsupervised person re-identification method guided by refined pseudo-labels. By computing the similarity between training instances in feature space, a neighborhood set is determined for each instance. The model then refines the one-hot pseudo-labels by aggregating the predicted labels of samples within the neighborhood with the original clustering results through a weighted combination. The core idea is to encourage the model not only to pull samples toward their cluster centers but also to build associations with nearby samples that may carry the same identity information. This strategy effectively improves the model's robustness to noisy labels while reducing the risk of over-fitting. Building on this, the paper further proposes camera-aware contrastive learning guided by the refined pseudo-labels. Using the class probability distribution in each instance's refined pseudo-label, the model dynamically associates instances with potential class centers instead of relying on a single class center as the positive sample, and potential false positive and false negative samples are filtered out. This improves the selection mechanism of positive and negative samples in camera-aware contrastive learning and effectively mitigates the influence of noisy pseudo-labels on the contrastive learning task. The proposed method was validated on three large-scale public datasets; the results show that it improves significantly over the baseline and outperforms current state-of-the-art methods in the field. Specifically, on the Market-1501, MSMT17, and PersonX datasets, the method achieves mAP/Rank-1 of 85.2%/94.4%, 44.3%/74.1%, and 88.7%/95.9%, respectively, demonstrating its superiority.
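To make the label-refinement step more concrete, the following is a minimal PyTorch sketch of neighborhood-based pseudo-label refinement as described above. It is not the authors' released implementation; the function name refine_pseudo_labels, the neighborhood size k, and the mixing weight alpha are illustrative assumptions.

```python
# A minimal sketch (not the authors' code) of neighborhood pseudo-label refinement,
# assuming: `features` is an L2-normalized (N, D) tensor of instance features,
# `cluster_labels` is an (N,) tensor of clustering pseudo-labels in [0, C),
# `k` is the neighborhood size, and `alpha` is the mixing weight (both illustrative).
import torch
import torch.nn.functional as F


def refine_pseudo_labels(features: torch.Tensor,
                         cluster_labels: torch.Tensor,
                         num_classes: int,
                         k: int = 20,
                         alpha: float = 0.5) -> torch.Tensor:
    """Return soft labels (N, C) that mix one-hot clustering results with the
    similarity-weighted label distribution of each instance's neighborhood."""
    # One-hot pseudo-labels from the original clustering step.
    one_hot = F.one_hot(cluster_labels, num_classes).float()             # (N, C)

    # Pairwise cosine similarity in feature space (features are L2-normalized).
    sim = features @ features.t()                                         # (N, N)
    sim.fill_diagonal_(float('-inf'))                                     # exclude self

    # Neighborhood set: the k most similar training instances per sample.
    topk_sim, topk_idx = sim.topk(k, dim=1)                               # (N, k)

    # Aggregate the neighbors' labels, weighted by their similarity.
    weights = F.softmax(topk_sim, dim=1)                                  # (N, k)
    neighbor_dist = (weights.unsqueeze(-1) * one_hot[topk_idx]).sum(1)    # (N, C)

    # Weighted combination of the original label and the neighborhood vote;
    # each row already sums to one, so it can be used directly as a soft target.
    return (1.0 - alpha) * one_hot + alpha * neighbor_dist
```

In the full method, the resulting soft labels would then steer the camera-aware contrastive losses, for example by treating sub-cluster proxies with high refined-label probability as additional positives and excluding such potential false negatives from the negative set; the exact filtering rules follow the paper rather than this sketch.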
The overall framework of our method
Neighborhood pseudo-label refinement module
Schematic diagram of camera-aware contrastive learning guided by refined pseudo-labels. (a) Original intra-camera contrast; (b) Corrected intra-camera contrast; (c) Original inter-camera contrast; (d) Corrected inter-camera contrast
Comparison of Top-10 ranking lists of different methods on the Market-1501 dataset. (a) Baseline method; (b) CAP[9] method; (c) PPLR[15] method; (d) Our method
t-SNE visualization of features from different methods on the Market-1501 dataset. (a) Baseline method; (b) CAP[9] method; (c) PPLR[15] method; (d) Our method
The impact of each hyperparameter on our model on the Market-1501 dataset