Application of aircraft target tracking based on deep learning

Citation: Zhao Chunmei, Chen Zhongbi, Zhang Jianlin. Application of aircraft target tracking based on deep learning[J]. Opto-Electronic Engineering, 2019, 46(9): 180261. doi: 10.12086/oee.2019.180261

  • Fund project: Supported by the Major Special Fund (G158207)
  • Corresponding author: Chen Zhongbi (1975-), female, PhD, associate researcher, mainly engaged in moving target detection and tracking. E-mail: ioeyoyo@126.com
  • Chinese Library Classification: TB872
  • Abstract: For aircraft targets, this paper proposes a fast deep learning for aircraft tracking (FDLAT) network, an improved network based on the multi-domain network (MDNet), and uses transfer learning to compensate for the small-sample-set limitation of object tracking. The convolutional layers serve as feature-extraction layers, the fully connected layers serve as target/background classification layers, and a specific aircraft dataset is used to update the network parameters. After training, combined with a regression model, the aircraft is tracked with a simple linear update. The algorithm achieves robust tracking under complex states including aircraft rotation, similar targets, blurred targets, complex environments, scale transformation, target occlusion, and morphological transformation, reaching an average speed of 20.36 f/s and a mean success rate of 0.592 on the ILSVRC2015 aircraft detection set, basically satisfying real-time aircraft tracking.

  • Overview: Deep learning has achieved good results in image classification, semantic segmentation, and target detection and recognition, but in object tracking it is still restricted by small training sets. Object tracking is one of the most important research topics in computer vision and has a wide range of applications; its challenges lie in complex states such as target rotation, multiple targets, blurred targets, complex backgrounds, size change, target occlusion, and fast motion. In this paper, a fast deep learning for aircraft tracking (FDLAT) algorithm based on the multi-domain network (MDNet) is proposed to track aircraft targets. The algorithm uses feature-based transfer learning to make up for the small sample sets, uses a specific aircraft dataset to update the parameters of the convolutional and fully connected layers, and uses these layers to distinguish aircraft from background. After building the training model, we feed the aircraft video sets into the model and track the aircraft with a regression model and a simple linear online update, which raises the speed while preserving accuracy. The algorithm achieves robust tracking of aircraft under rotation, similar targets, fuzzy targets, complex environments, scale transformation, target occlusion, morphological transformation, and other complex states. For the aircraft-tracking application, the FDLAT network uses three convolutional layers (Conv1~Conv3 of VGGNets) to extract features of the aircraft target; Fc6 is a single layer, and the fully connected layers Fc4~Fc6 perform the binary classification of aircraft versus background, outputting the probabilities of the two classes (a minimal sketch of this network follows below). In the tracking process, the trained network serves as a feed-forward network: the candidate box with the maximum output score is regressed to obtain the target location, and online updating is done by a simple linear operation. This design performs well under scaling, occlusion, disappearance, and interference scenes, and covers the shortcomings of MDNet. A speed of 20.36 frames per second with a mean overlap of 0.592 is achieved on the ILSVRC2015 aircraft detection sets, which basically meets the real-time requirement of the aircraft target tracking application.
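
    A minimal sketch of this feed-forward structure, assuming PyTorch. Layer shapes follow Table 1 below; the class name FDLATNet and details such as the LRN size, dropout rate, and layer ordering are illustrative assumptions, not the authors' released code.

        import torch
        import torch.nn as nn

        class FDLATNet(nn.Module):
            """Hypothetical FDLAT network: Conv1~Conv3 (VGG-style) + Fc4~Fc6."""
            def __init__(self):
                super().__init__()
                self.features = nn.Sequential(
                    nn.Conv2d(3, 96, kernel_size=7, stride=2),     # Conv1: 3@107x107 -> 96@51x51
                    nn.ReLU(inplace=True),
                    nn.LocalResponseNorm(2),                       # LRN size is an assumption
                    nn.MaxPool2d(kernel_size=3, stride=2),         # -> 96@25x25
                    nn.Conv2d(96, 256, kernel_size=5, stride=2),   # Conv2 -> 256@11x11
                    nn.ReLU(inplace=True),
                    nn.LocalResponseNorm(2),
                    nn.MaxPool2d(kernel_size=3, stride=2),         # -> 256@5x5
                    nn.Conv2d(256, 512, kernel_size=3, stride=1),  # Conv3 -> 512@3x3
                    nn.ReLU(inplace=True),
                )
                self.classifier = nn.Sequential(
                    nn.Linear(512 * 3 * 3, 512),                   # Fc4
                    nn.ReLU(inplace=True),
                    nn.Dropout(0.5),
                    nn.Linear(512, 512),                           # Fc5
                    nn.ReLU(inplace=True),
                    nn.Dropout(0.5),
                    nn.Linear(512, 2),                             # Fc6: aircraft vs background
                )

            def forward(self, x):                 # x: (N, 3, 107, 107) candidate patches
                f = self.features(x).flatten(1)   # conv features, reused by the box regressor
                return self.classifier(f)         # (N, 2) scores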

  • Figure 1.  Feedforward network of FDLAT

    Figure 2.  Qualitative comparison between FDLAT and MDNet

    Figure 3.  Qualitative analysis of evaluation indices

    Table 1.  Operation and results of FDLAT network

    Operation     Input      Filter_size  Strides  Output
    Conv1         3@107×107  96@7×7       2        96@51×51
    ReLU, LRN
    Max_pooling   96@51×51   96@3×3       2        96@25×25
    Conv2         96@25×25   256@5×5      2        256@11×11
    ReLU, LRN
    Max_pooling   256@11×11  256@3×3      2        256@5×5
    Conv3         256@5×5    512@3×3      1        512@3×3
    ReLU
    Fc4, Dropout  512×3×3                          512
    ReLU
    Fc5, Dropout  512                              512
    ReLU
    Fc6           512                              2

    Algorithm 1.  Training procedure of FDLAT

    Net: pretrained Conv1~Conv3 filters {w1, w2, w3}, Fc4~Fc6 filters {w4, w5, w6}
    Data: negative samples with neg=1, positive samples with pos=1
    Mini-batch: 128 samples (96 negative, 32 positive)
    Optimization: SGD with momentum=0.9 and weight_decay=0.0005
    Update weights: {w1, w2, w3} with learning_rate=0.0001, {w4, w5, w6} with learning_rate=0.001
    Loop: loop_time=100, video_number=134, running_time=13400
    Output: neg=fn and pos=fp
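
    A hedged sketch of one training iteration of Algorithm 1, assuming PyTorch, a cross-entropy loss, and the label convention aircraft=1 / background=0; only the optimizer hyperparameters, per-layer learning rates, and batch composition come from the table, and FDLATNet is the hypothetical sketch shown after the overview.

        import torch
        import torch.nn as nn

        net = FDLATNet()                          # hypothetical sketch from above
        criterion = nn.CrossEntropyLoss()         # loss choice is an assumption
        optimizer = torch.optim.SGD(
            [{"params": net.features.parameters(),   "lr": 0.0001},  # {w1, w2, w3}
             {"params": net.classifier.parameters(), "lr": 0.001}],  # {w4, w5, w6}
            lr=0.001, momentum=0.9, weight_decay=0.0005)

        def train_step(pos_patches, neg_patches):
            """One mini-batch: 32 positive and 96 negative 107x107 patches."""
            x = torch.cat([pos_patches, neg_patches])           # (128, 3, 107, 107)
            y = torch.cat([torch.ones(32, dtype=torch.long),    # pos -> label 1
                           torch.zeros(96, dtype=torch.long)])  # neg -> label 0
            optimizer.zero_grad()
            loss = criterion(net(x), y)
            loss.backward()
            optimizer.step()
            return loss.item()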

    Algorithm 2.  Regression procedure

    Regression net: linear regression
    Training input: convolution features of FDLAT as X, 800 candidate boxes as bbox, and the ground truth as gt (from the first frame)
    Error: compute the center error and the width-height error between gt and bbox as Y
    Train: use the features X and errors Y to train the linear-regression net
    Prediction input: convolution features of FDLAT as X, candidate boxes bbox of the objects
    Prediction output: estimated ground-truth position of the objects
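
    A small sketch of the linear box regression of Algorithm 2. The table specifies only a linear regressor trained on first-frame convolution features, so the R-CNN-style center/size offset parameterization and scikit-learn's Ridge below are assumptions.

        import numpy as np
        from sklearn.linear_model import Ridge

        def box_targets(bbox, gt):
            """bbox: (N, 4) candidates as [cx, cy, w, h]; gt: (4,). Returns offsets Y."""
            dx = (gt[0] - bbox[:, 0]) / bbox[:, 2]   # center error, size-normalized
            dy = (gt[1] - bbox[:, 1]) / bbox[:, 3]
            dw = np.log(gt[2] / bbox[:, 2])          # width/height error in log space
            dh = np.log(gt[3] / bbox[:, 3])
            return np.stack([dx, dy, dw, dh], axis=1)

        def apply_offsets(bbox, Y):
            """Invert box_targets: shift candidate boxes by predicted offsets."""
            return np.stack([bbox[:, 0] + Y[:, 0] * bbox[:, 2],
                             bbox[:, 1] + Y[:, 1] * bbox[:, 3],
                             bbox[:, 2] * np.exp(Y[:, 2]),
                             bbox[:, 3] * np.exp(Y[:, 3])], axis=1)

        # Train on the first frame: X holds the FDLAT conv features of the
        # 800 candidate boxes, gt comes from the first-frame annotation.
        regressor = Ridge(alpha=1.0)
        # regressor.fit(X, box_targets(bbox, gt))
        # Predict: refined = apply_offsets(candidates, regressor.predict(X_new))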

    Algorithm 3.  Tracking procedure of FDLAT

    Net: fixed {w1, w2, w3, w4, w5, w6}
    Data: 32 candidate boxes with a Gaussian distribution for every frame
    Regression: 800 candidate boxes with a Gaussian distribution for the first frame, with 1 output box as the plane position for training; put the top-scoring candidate xmax into the regression net to get a regression_box
    Online update: gt[i] = 0.59*regression_box + 0.41*gt[i-1]
    Output: the position gt of the plane
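
    The per-frame step of Algorithm 3 as a hedged sketch: the Gaussian jitter scales and the helper names score_fn/regress_fn are assumptions, while the 32-candidate sampling and the 0.59/0.41 blend are the table's values.

        import numpy as np

        def sample_candidates(prev_box, n=32, rng=np.random.default_rng()):
            """Draw n candidate boxes [cx, cy, w, h] around the previous target box."""
            boxes = np.tile(np.asarray(prev_box, dtype=float), (n, 1))
            boxes[:, 0] += rng.normal(0.0, 0.3 * prev_box[2], n)   # jitter center x
            boxes[:, 1] += rng.normal(0.0, 0.3 * prev_box[3], n)   # jitter center y
            boxes[:, 2:] *= np.exp(rng.normal(0.0, 0.1, (n, 2)))   # jitter scale
            return boxes

        def track_frame(frame, gt_prev, score_fn, regress_fn):
            """score_fn: aircraft score from the fixed net; regress_fn: Algorithm 2."""
            candidates = sample_candidates(gt_prev)
            xmax = candidates[np.argmax(score_fn(frame, candidates))]  # best-scored box
            regression_box = regress_fn(frame, xmax)
            # Simple linear online update from the table:
            return 0.59 * np.asarray(regression_box) + 0.41 * np.asarray(gt_prev)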

    Table 2.  Quantitative analysis results of FDLAT and MDNet

    Sequences     MDNet                      FDLAT
                  Overlap  FPS   CLE        Overlap  FPS    CLE
    0/0030004     0.566    0.90  102.62     0.534    10.64  122.755
    0/0034004     0.579    2.52  33.777     0.636    21.41  20.466
    0/0034009     0.636    2.61  13.549     0.663    24.77  16.612
    0/0034014     0.650    1.87  33.85      0.629    16.85  38.619
    0/0034019     0.539    2.43  22.368     0.609    22.21  20.521
    0/0034023     0.634    2.60  31.659     0.468    22.90  59.312
    0/0117004     0.357    3.37  30.684     0.587    21.98  18.592
    0/0117019     0.666    2.72  7.505      0.623    27.96  10.787
    0/0117041     0.556    2.89  22.420     0.556    27.11  23.200
    1/0259029     0.355    1.26  277.460    0.497    14.14  215.733
    1/0321003     0.809    1.35  40.024     0.751    14.60  49.550
    2/0473003     0.693    1.44  83.954     0.701    15.75  76.477
    2/0555003     0.788    3.22  5.050      0.680    23.38  8.635
    2/0743004     0.418    2.05  44.799     0.485    23.74  92.193
    3/0939002     0.306    2.26  138.078    0.395    24.12  146.211
    3/1054001     0.698    0.66  189.850    0.599    16.24  218.183
    3/1099003     0.425    1.30  103.770    0.647    18.33  57.920
    Mean          0.569    2.08  69.495     0.592    20.36  70.339
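
    For reference, Overlap in Table 2 is the mean intersection-over-union (IoU) between predicted and ground-truth boxes and CLE is the mean center location error in pixels; a minimal sketch of both metrics, assuming [x, y, w, h] boxes with a top-left origin.

        import numpy as np

        def iou(a, b):
            """Intersection-over-union of two boxes [x, y, w, h]."""
            ix = max(0.0, min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0]))
            iy = max(0.0, min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1]))
            inter = ix * iy
            return inter / (a[2] * a[3] + b[2] * b[3] - inter)

        def cle(a, b):
            """Euclidean distance between box centers, in pixels."""
            return np.hypot((a[0] + a[2] / 2) - (b[0] + b[2] / 2),
                            (a[1] + a[3] / 2) - (b[1] + b[3] / 2))

        def evaluate(pred_boxes, gt_boxes):
            """Mean overlap and mean CLE over a sequence, as in each row of Table 2."""
            overlaps = [iou(p, g) for p, g in zip(pred_boxes, gt_boxes)]
            errors = [cle(p, g) for p, g in zip(pred_boxes, gt_boxes)]
            return float(np.mean(overlaps)), float(np.mean(errors))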
Publication history
Received:  2018-05-17
Revised:   2018-10-15
Published: 2019-09-30
