Overview: As a foundation for applications such as human behavior recognition, semantic segmentation, and autonomous driving, multi-target tracking is one of the research hotspots in computer vision. To track multiple targets stably and accurately in complex scenarios, many difficulties must be addressed, such as camera motion, interaction between targets, missed detections, and false detections. In recent years, with the rapid development of deep learning, many excellent detection-based multi-target tracking algorithms have emerged; they fall mainly into online and offline methods. The detection-based tracking pipeline works as follows: targets are first located by an offline-trained detector, a similarity-matching step then associates the detections across frames, and the generated trajectories are continuously matched against new detections to produce more reliable trajectories. Online multi-target tracking methods mainly include SORT, Deep SORT, and SDMT, while offline methods mainly include the network flow model, the conditional random field model, and the generalized association graph model. Offline methods exploit information from multiple frames when associating trajectories with detections and can therefore achieve better tracking performance, but they are not suited to real-time applications. Online methods use only single-frame information to associate trajectories with new targets; because single-frame detections are often unreliable, data association for lost targets fails and the ideal tracking result cannot be obtained. To address the reliability of the detection results, an online multi-target tracking method based on the R-FCN framework is proposed. First, a candidate model that combines Kalman-filter predictions with detection results is devised, so that candidate targets no longer come solely from the detector, which improves the robustness of the algorithm. Second, a Siamese network is used to measure appearance similarity, and multiple target features are fused to complete the data association between targets, which improves the ability to discriminate targets in complex tracking scenes. In addition, because target trajectories may suffer missed and false detections in complex scenes, the RANSAC algorithm is used to optimize the existing trajectories, yielding more complete and accurate trajectory information and making the trajectories more continuous and smoother. Finally, compared with several excellent existing algorithms, the experimental results show that the proposed method performs well in tracking accuracy, the number of lost tracks, and the number of missed detections.
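The candidate-selection step described above, in which detector outputs are merged with Kalman-filter predictions of the existing tracks so that an occluded or missed target can still enter data association, can be sketched roughly as follows. This is a minimal illustration, not the paper's exact implementation: the box format, the confidence assigned to predicted boxes, and the score/NMS thresholds are assumptions made for the example.

```python
# Minimal sketch of candidate selection: candidates come from both the detector
# and the Kalman-filter predictions of existing tracks, so a single missed
# detection does not immediately break a trajectory. Thresholds are illustrative.

def iou(box_a, box_b):
    """Intersection-over-union of two boxes given as [x1, y1, x2, y2]."""
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter + 1e-9)


def select_candidates(detections, predictions, score_thresh=0.4, nms_thresh=0.7):
    """Merge detector boxes and Kalman-predicted track boxes into one candidate set.

    detections  : list of (box, confidence) from the offline-trained detector
    predictions : list of (box, confidence) predicted by each track's Kalman filter;
                  how the prediction confidence is scored is an assumption here
    Returns the candidates kept after score filtering and greedy NMS.
    """
    candidates = [(box, s, "det") for box, s in detections if s >= score_thresh]
    candidates += [(box, s, "trk") for box, s in predictions if s >= score_thresh]
    # Greedy non-maximum suppression so an overlapping detection/prediction pair
    # collapses to a single candidate for data association.
    candidates.sort(key=lambda c: c[1], reverse=True)
    kept = []
    for box, score, src in candidates:
        if all(iou(box, k[0]) < nms_thresh for k in kept):
            kept.append((box, score, src))
    return kept


if __name__ == "__main__":
    dets = [([100, 100, 150, 200], 0.9)]
    preds = [([102, 101, 152, 203], 0.6),   # same target, predicted by its track
             ([300, 120, 340, 210], 0.5)]   # target missed by the detector this frame
    for box, score, src in select_candidates(dets, preds):
        print(src, box, round(score, 2))
```

In this sketch the second predicted box survives selection even though the detector produced nothing there, which is the behavior the candidate model is designed to provide before appearance-based (Siamese) association is applied.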
The general flow of the algorithm
Candidate selection flowchart
R-FCN network architecture
Siamese network structure diagram
Missed detection in a target trajectory
Multi-target tracking results. (a) Tracking results on the MOT16-01 sequence; (b) tracking results on the MOT16-03 sequence; (c) tracking results on the MOT16-06 sequence