Object detection and tracking combining four inter-frame difference and optical flow methods
Opto-Electronic Engineering, 2018, Vol. 48, Issue (8): 170665      DOI: 10.12086/oee.2018.170665

Algorithm for object detection and tracking combining four inter-frame difference and optical flow methods
Liu Xin, Jin Xuanhong
School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China
Abstract: To solve the problem of detecting and tracking multiple targets in complex environments, this paper proposes an improved moving-object detection method based on the four inter-frame difference method and the optical flow algorithm. First, the four inter-frame difference method is applied to the video sequence to locate candidate motion regions. The optical flow algorithm is then applied within these regions to detect the objects in the video accurately. The improved method increases the processing speed of the optical flow method and reduces the influence of environmental illumination. Finally, the proposed algorithm is compared with the particle filter and ViBe algorithms in different scenarios with varying kinds and numbers of moving targets. The experiments show that the improved method is not only robust, but also detects and tracks targets more quickly and accurately.
Keywords: object detection and tracking    four inter-frame difference method    optical flow method    particle filter    ViBe

1 Introduction

2 Theoretical derivation of the algorithm

2.1 Four inter-frame difference algorithm

2.2 Pyramid LK optical flow algorithm

1) Brightness constancy. The appearance of a detected object is assumed constant no matter how it moves, i.e., the brightness of corresponding pixels in any two consecutive frames remains the same.

2) Temporal persistence (small motion). The motion of targets in the image changes slowly over time, so a target moves only a small distance between any two consecutive frames.

3) Spatial coherence. Neighboring pixels belonging to the same target move in the same way, and they cluster within a local region.

 $E(\mathit{\boldsymbol{d}}) = E({\mathit{\boldsymbol{d}}_x},{\mathit{\boldsymbol{d}}_y})\\ \quad = \sum\limits_{x = {\mathit{\boldsymbol{u}}_x} - w}^{{\mathit{\boldsymbol{u}}_x} + w} {\sum\limits_{y = {\mathit{\boldsymbol{u}}_y} - w}^{{\mathit{\boldsymbol{u}}_y} + w} {{{(I(x,y) - J(x + {\mathit{\boldsymbol{d}}_x},y + {\mathit{\boldsymbol{d}}_y}))}^2}} } 。$ (7)
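As a concrete reading of Eq. (7), the matching cost can be evaluated directly. Below is a minimal NumPy sketch, not the paper's implementation; the function name `ssd_cost` and the `[y, x]` indexing convention are assumptions of the example:

```python
import numpy as np

def ssd_cost(I, J, u, d, w):
    """Matching cost E(d) of Eq. (7): sum of squared brightness differences
    between frame I and displaced frame J over a (2w+1) x (2w+1) window
    centred on the feature point u = (ux, uy). Images are indexed [y, x]."""
    ux, uy = u
    dx, dy = d
    cost = 0.0
    for x in range(ux - w, ux + w + 1):
        for y in range(uy - w, uy + w + 1):
            cost += (float(I[y, x]) - float(J[y + dy, x + dx])) ** 2
    return cost
```

The optical flow d is the displacement that minimizes this cost; the derivation that follows avoids searching for it exhaustively.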

 ${\mathit{\boldsymbol{u}}^L} = \frac{\mathit{\boldsymbol{u}}}{{{2^L}}}。$ (8)

 ${E^L}({\mathit{\boldsymbol{d}}^L}) = E(\mathit{\boldsymbol{d}}_x^L,\mathit{\boldsymbol{d}}_y^L) = \sum\limits_{x = \mathit{\boldsymbol{u}}_x^L - w}^{\mathit{\boldsymbol{u}}_x^L + w} {\sum\limits_{y = \mathit{\boldsymbol{u}}_y^L - w}^{\mathit{\boldsymbol{u}}_y^L + w} {[I(x,y)} } \\ \quad - J(x + \mathit{\boldsymbol{g}}_x^L + \mathit{\boldsymbol{d}}_x^L,y + \mathit{\boldsymbol{g}}_y^L + \mathit{\boldsymbol{d}}_y^L){]^2}。$ (9)

 ${\mathit{\boldsymbol{g}}^{L - 1}} = 2({\mathit{\boldsymbol{g}}^L} + {\mathit{\boldsymbol{d}}^L})。$ (10)

 ${\mathit{\boldsymbol{g}}^{{L_{\rm{m}}}}} = {[\begin{array}{*{20}{c}} 0&0 \end{array}]^{\rm T}}。$ (11)

 $\mathit{\boldsymbol{d}} = {\mathit{\boldsymbol{g}}^0} + {\mathit{\boldsymbol{d}}^0}。$ (12)

Let $A(x,y) = {I^L}(x,y)$ and $B(x,y) = {J^L}(x + \mathit{\boldsymbol{g}}_x^L,y + \mathit{\boldsymbol{g}}_y^L)$. Eq. (7) then becomes Eq. (13):

 ${E^L}({\mathit{\boldsymbol{d}}^L}) = E(\mathit{\boldsymbol{d}}_x^L,\mathit{\boldsymbol{d}}_y^L)\\ \quad = \sum\limits_{x = \mathit{\boldsymbol{u}}_x^L - w}^{\mathit{\boldsymbol{u}}_x^L + w} {\sum\limits_{y = \mathit{\boldsymbol{u}}_y^L - w}^{\mathit{\boldsymbol{u}}_y^L + w} {{{(A(x,y) - B(x + \mathit{\boldsymbol{d}}_x^L,y + \mathit{\boldsymbol{d}}_y^L))}^2}} } 。$ (13)

Differentiating with respect to ${\mathit{\boldsymbol{d}}^L}$ and expanding $B$ to first order gives:

 $\frac{{\partial E({\mathit{\boldsymbol{d}}^L})}}{{\partial {\mathit{\boldsymbol{d}}^L}}} \approx - 2\sum\limits_{x = \mathit{\boldsymbol{u}}_x^L - w}^{\mathit{\boldsymbol{u}}_x^L + w} {\sum\limits_{y = \mathit{\boldsymbol{u}}_y^L - w}^{\mathit{\boldsymbol{u}}_y^L + w} {\left( {A(x,y) - B(x,y)} \right.} }\\ \quad \left. { - \left[ {\begin{array}{*{20}{c}} {\frac{{\partial B}}{{\partial x}}}&{\frac{{\partial B}}{{\partial y}}} \end{array}} \right]{\mathit{\boldsymbol{d}}^L}} \right) \cdot \left[ {\begin{array}{*{20}{c}} {\frac{{\partial B}}{{\partial x}}}&{\frac{{\partial B}}{{\partial y}}} \end{array}} \right]。$ (14)

Let the image gradient and the brightness difference be

 $\left[ {\begin{array}{*{20}{c}} {{\mathit{\boldsymbol{I}}_x}}&{{\mathit{\boldsymbol{I}}_y}} \end{array}} \right] = \left[ {\begin{array}{*{20}{c}} {\frac{{\partial B}}{{\partial x}}}&{\frac{{\partial B}}{{\partial y}}} \end{array}} \right],\quad \mathit{\boldsymbol{\delta I}} = A(x,y) - B(x,y),$

which yields:

 $\frac{1}{2}\frac{{\partial E({\mathit{\boldsymbol{d}}^L})}}{{\partial {\mathit{\boldsymbol{d}}^L}}} \approx \sum\limits_{x = \mathit{\boldsymbol{u}}_x^L - w}^{\mathit{\boldsymbol{u}}_x^L + w} {\sum\limits_{y = \mathit{\boldsymbol{u}}_y^L - w}^{\mathit{\boldsymbol{u}}_y^L + w} {\left( {\left[ {\begin{array}{*{20}{c}} {\mathit{\boldsymbol{I}}_x^2}&{{\mathit{\boldsymbol{I}}_x}{\mathit{\boldsymbol{I}}_y}} \\ {{\mathit{\boldsymbol{I}}_x}{\mathit{\boldsymbol{I}}_y}}&{\mathit{\boldsymbol{I}}_y^2} \end{array}} \right] \cdot {\mathit{\boldsymbol{d}}^L} - \left[ {\begin{array}{*{20}{c}} {\mathit{\boldsymbol{\delta I}} \cdot {\mathit{\boldsymbol{I}}_x}} \\ {\mathit{\boldsymbol{\delta I}} \cdot {\mathit{\boldsymbol{I}}_y}} \end{array}} \right]} \right)} } 。$ (15)

Writing

 $\mathit{\boldsymbol{G}} \approx \sum\limits_{x = \mathit{\boldsymbol{u}}_x^L - w}^{\mathit{\boldsymbol{u}}_x^L + w} {\sum\limits_{y = \mathit{\boldsymbol{u}}_y^L - w}^{\mathit{\boldsymbol{u}}_y^L + w} {\left[ {\begin{array}{*{20}{c}} {\mathit{\boldsymbol{I}}_x^2}&{{\mathit{\boldsymbol{I}}_x}{\mathit{\boldsymbol{I}}_y}} \\ {{\mathit{\boldsymbol{I}}_x}{\mathit{\boldsymbol{I}}_y}}&{\mathit{\boldsymbol{I}}_y^2} \end{array}} \right]} },\quad \mathit{\boldsymbol{b}} \approx \sum\limits_{x = \mathit{\boldsymbol{u}}_x^L - w}^{\mathit{\boldsymbol{u}}_x^L + w} {\sum\limits_{y = \mathit{\boldsymbol{u}}_y^L - w}^{\mathit{\boldsymbol{u}}_y^L + w} {\left[ {\begin{array}{*{20}{c}} {\mathit{\boldsymbol{\delta I}} \cdot {\mathit{\boldsymbol{I}}_x}} \\ {\mathit{\boldsymbol{\delta I}} \cdot {\mathit{\boldsymbol{I}}_y}} \end{array}} \right]} },$

we finally obtain:

 $\frac{1}{2}\frac{{\partial E({\mathit{\boldsymbol{d}}^L})}}{{\partial {\mathit{\boldsymbol{d}}^L}}} \approx \mathit{\boldsymbol{G}} \cdot {\mathit{\boldsymbol{d}}^L} - \mathit{\boldsymbol{b}}。$ (16)

3 Four inter-frame difference–pyramid optical flow algorithm

In summary, the pyramid optical flow method can detect fast-moving objects, but its computational load is heavy, it depends strongly on hardware, and it cannot remove the influence of illumination and other environmental changes. Combining the high processing speed of the inter-frame difference algorithm with the high accuracy of the pyramid optical flow model, the two methods are joined to extract moving targets from a dynamic background.

In the combined algorithm, a four inter-frame difference operation is applied first to obtain the approximate region of each target, and the pyramid optical flow computation is then carried out within that region. This both reduces the influence of illumination changes and detects fast-moving targets more reliably, while lowering the computational load. The flow chart is shown in Fig. 1.

Fig. 1 Flow chart of the improved algorithm

The specific steps of the algorithm are as follows:

1) Preprocess and denoise the captured video sequence;

2) Take four consecutive frames ${I_k}(x,y)$, ${I_{k + 1}}(x,y)$, ${I_{k + 2}}(x,y)$ and ${I_{k + 3}}(x,y)$ of the sequence;

3) Difference ${I_k}(x,y)$ and ${I_{k + 1}}(x,y)$ to obtain the frame-difference image ${d_{1k}}$; apply the same difference operation to ${I_{k + 2}}(x,y)$ and ${I_{k + 3}}(x,y)$ to obtain the frame-difference image ${d_{2k}}$;

4) Binarize the frame-difference images ${d_{1k}}$ and ${d_{2k}}$, then combine them with a pixel-wise logical AND to obtain the motion region $J(x,y)$;

5) Build the pyramid model ${\mathit{\boldsymbol{J}}^L}$, $L = 0,1,...,{L_m}$, of $J(x,y)$; initialize the pyramid optical flow estimate ${\mathit{\boldsymbol{g}}^{{L_{\rm{m}}}}} = {[\begin{array}{*{20}{c}} 0&0 \end{array}]^{\rm{T}}}$; the velocity of the feature point u on layer ${L_{\rm{m}}}$ is ${\mathit{\boldsymbol{u}}^L} = \mathit{\boldsymbol{u}}/{2^L}$;

6) Compute the partial derivatives $\mathit{\boldsymbol{J}}_x^L$ and $\mathit{\boldsymbol{J}}_y^L$ of the image ${\mathit{\boldsymbol{J}}^L}$ with respect to x and y, and compute:

 $\mathit{\boldsymbol{G}} = \sum\limits_{x = \mathit{\boldsymbol{u}}_x^L - w}^{\mathit{\boldsymbol{u}}_x^L + w} {\sum\limits_{y = \mathit{\boldsymbol{u}}_y^L - w}^{\mathit{\boldsymbol{u}}_y^L + w} {\left[ {\begin{array}{*{20}{c}} {\mathit{\boldsymbol{J}}_x^2}&{{\mathit{\boldsymbol{J}}_x}{\mathit{\boldsymbol{J}}_y}} \\ {{\mathit{\boldsymbol{J}}_x}{\mathit{\boldsymbol{J}}_y}}&{\mathit{\boldsymbol{J}}_y^2} \end{array}} \right]} } ;$

7) Initialize the optical flow value ${\mathit{\boldsymbol{d}}^L} = {[\begin{array}{*{20}{c}} 0&0 \end{array}]^{\rm{T}}}$;

8) Compute the brightness difference $\mathit{\boldsymbol{\delta I}}$ with respect to ${J^L}(x + \mathit{\boldsymbol{g}}_x^L,y + \mathit{\boldsymbol{g}}_y^L)$ (cf. Eq. (15)), and compute:

 $\mathit{\boldsymbol{b}} = \sum\limits_{x = \mathit{\boldsymbol{u}}_x^L - w}^{\mathit{\boldsymbol{u}}_x^L + w} {\sum\limits_{y = \mathit{\boldsymbol{u}}_y^L - w}^{\mathit{\boldsymbol{u}}_y^L + w} {\left[ {\begin{array}{*{20}{c}} {\mathit{\boldsymbol{\delta I}} \cdot {\mathit{\boldsymbol{J}}_x}} \\ {\mathit{\boldsymbol{\delta I}} \cdot {\mathit{\boldsymbol{J}}_y}} \end{array}} \right]} };$

9) Compute the optical flow value on layer ${L_{\rm{m}}}$: ${\mathit{\boldsymbol{d}}^L} = {\mathit{\boldsymbol{G}}^{ - 1}}\mathit{\boldsymbol{b}}$;

10) Compute the optical flow guess for layer $L - 1$: ${\mathit{\boldsymbol{g}}^{L - 1}} = 2({\mathit{\boldsymbol{g}}^L} + {\mathit{\boldsymbol{d}}^L})$; repeating steps 6)-9) down the pyramid gives the final optical flow value $\mathit{\boldsymbol{d}} = {\mathit{\boldsymbol{g}}^0} + {\mathit{\boldsymbol{d}}^0}$;

11) The corresponding feature point in the image $P(x,y)$ is

 $\mathit{\boldsymbol{v}} = \mathit{\boldsymbol{u}} + \mathit{\boldsymbol{d}};$

12) Obtain the moving-target image $P(x,y)$.
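The differencing stage of steps 3)-4) can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions, not the paper's implementation: the function name `motion_region`, the fixed threshold `thresh`, and the use of 8-bit grayscale frames are all choices made for the example.

```python
import numpy as np

def motion_region(f1, f2, f3, f4, thresh=20):
    """Steps 3)-4): difference the two frame pairs, binarize each
    difference image, and AND them pixel-wise to get the motion
    region J(x, y) shared by both pairs."""
    d1 = np.abs(f2.astype(np.int32) - f1.astype(np.int32))  # d_1k
    d2 = np.abs(f4.astype(np.int32) - f3.astype(np.int32))  # d_2k
    return (d1 > thresh) & (d2 > thresh)                    # binarize + AND
```

Casting to a signed type before subtracting avoids the wrap-around that unsigned 8-bit frames would otherwise produce.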
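The per-window solve of steps 6)-9) (Eqs. (14)-(16)) can likewise be illustrated directly. A minimal NumPy sketch under the usual assumptions (grayscale float images, a well-conditioned G inside the window); the name `lk_step` and the central-difference gradients are illustrative choices, not the paper's code:

```python
import numpy as np

def lk_step(A, B, ux, uy, w):
    """One Lucas-Kanade iteration (steps 6)-9), Eqs. (14)-(16)): build the
    gradient matrix G and mismatch vector b over a (2w+1) x (2w+1) window
    around the feature point (ux, uy), then solve G d = b for the flow d."""
    G = np.zeros((2, 2))
    b = np.zeros(2)
    for x in range(ux - w, ux + w + 1):
        for y in range(uy - w, uy + w + 1):
            Ix = (B[y, x + 1] - B[y, x - 1]) / 2.0  # dB/dx, central difference
            Iy = (B[y + 1, x] - B[y - 1, x]) / 2.0  # dB/dy
            dI = A[y, x] - B[y, x]                  # brightness difference
            G += [[Ix * Ix, Ix * Iy], [Ix * Iy, Iy * Iy]]
            b += [dI * Ix, dI * Iy]
    return np.linalg.solve(G, b)                    # d = G^{-1} b, Eq. (16)
```

Because the derivation linearizes B, a single step is only accurate for sub-pixel displacements; the pyramid exists precisely to keep each level's residual in that regime.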
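Finally, the coarse-to-fine loop of steps 5)-10) can be sketched end to end. To keep the sketch short it estimates a single global translation per level instead of per-feature windows, warps by the integer part of the guess with `np.roll`, and masks the wrapped borders; all function names and these simplifications are assumptions of the example, not the paper's method.

```python
import numpy as np

def downsample(img):
    """One pyramid level: halve the resolution by 2x2 mean pooling."""
    h, w = img.shape
    return img[:h // 2 * 2, :w // 2 * 2].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def lk_translation(A, B):
    """Single-level LK for one global translation: accumulate G and b
    (Eq. (16)) over the whole image and solve G d = b."""
    Ix = (np.roll(B, -1, axis=1) - np.roll(B, 1, axis=1)) / 2.0
    Iy = (np.roll(B, -1, axis=0) - np.roll(B, 1, axis=0)) / 2.0
    dI = A - B
    m = np.zeros(A.shape, bool)
    m[4:-4, 4:-4] = True                      # ignore wrap-around borders
    G = np.array([[np.sum(Ix[m] ** 2), np.sum(Ix[m] * Iy[m])],
                  [np.sum(Ix[m] * Iy[m]), np.sum(Iy[m] ** 2)]])
    b = np.array([np.sum(dI[m] * Ix[m]), np.sum(dI[m] * Iy[m])])
    return np.linalg.solve(G, b)

def pyramid_lk(A, B, levels):
    """Steps 5)-10): solve coarse-to-fine, propagating the guess with
    g^{L-1} = 2 (g^L + d^L) (Eq. (10)); the final flow is d = g^0 + d^0."""
    pyrA, pyrB = [A], [B]
    for _ in range(levels):                   # step 5): build the pyramids
        pyrA.append(downsample(pyrA[-1]))
        pyrB.append(downsample(pyrB[-1]))
    g = np.zeros(2)                           # g^{Lm} = [0 0]^T (Eq. (11))
    for L in range(levels, -1, -1):
        gi = np.rint(g).astype(int)           # integer part of the guess
        # warp B by the guess: Bw(x, y) = B(x + gx, y + gy)
        Bw = np.roll(pyrB[L], (-gi[1], -gi[0]), axis=(0, 1))
        d = lk_translation(pyrA[L], Bw)       # residual flow d^L on layer L
        g = gi + d                            # g^L + d^L
        if L > 0:
            g = 2 * g                         # pass the guess down a layer
    return g
```

The displacement halves at each coarser layer (Eq. (8)), so the top layer sees a sub-pixel residual that the linearized solve can handle, and each finer layer only refines the doubled guess.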

4 Experimental results and analysis

Fig. 2 Multi-target processing results in a static background. (a) Original image; (b) four inter-frame difference method; (c) ViBe algorithm; (d) optical flow method; (e) particle filter algorithm; (f) proposed algorithm

Method                        Average rate/(f·s⁻¹)   Total targets   Actually tracked
Four inter-frame difference   104.17                 16              9
ViBe                          780.00                 16              7
Optical flow                  23.67                  16              10
Particle filter               26.67                  16              11
Proposed algorithm            46.05                  16              10

Fig. 3 Processing results for a single high-speed target in a dynamic background. (a) Original image; (b) four inter-frame difference method; (c) ViBe algorithm; (d) optical flow method; (e) particle filter algorithm; (f) proposed algorithm

Method                        Average rate/(f·s⁻¹)   Total targets   Actually tracked
Four inter-frame difference   86.91                  1               not recognized
ViBe                          780.00                 1               1
Optical flow                  25.00                  1               1
Particle filter               29.70                  1               1
Proposed algorithm            41.67                  1               1

Fig. 4 Multi-target processing results under a shaking camera. (a) Original image; (b) four inter-frame difference method; (c) ViBe algorithm; (d) optical flow method; (e) particle filter algorithm; (f) proposed algorithm

Method                        Average rate/(f·s⁻¹)   Total targets   Actually tracked
Four inter-frame difference   84.27                  2               not recognized
ViBe                          757.58                 2               not recognized
Optical flow                  24.00                  2               2
Particle filter               29.86                  2               2
Proposed algorithm            41.03                  2               2

5 Conclusions
