Citation: Zuo H R, Xu Z Y, Zhang J L, Jia G. Visual tracking based on transfer learning of deep salience information. Opto-Electron Adv 3, 190018 (2020). doi: 10.29026/oea.2020.190018
The FCNs for salience detection.
Salience sketch image from videos.
The architecture of the new multi-domain network.
Representative images selected from the Car2 sequence of VOT15.
The weights for the images, drawn from a Gaussian distribution to generate a given number of samples.
The precision plots and the success plots on the OTB50 dataset.
Comparison with the state-of-the-art methods on the OTB100 dataset.
Comparison of the proposed method with several deep-learning and traditional methods on UAV123.
Tracking examples comparing the proposed algorithm with other trackers.