Multi-occluded pedestrian real-time detection algorithm based on preprocessing R-FCN
Liu Hui, Peng Li, Wen Jiwei
Engineering Research Center of Internet of Things Technology Applications of the Ministry of Education, School of Internet of Things Engineering, Jiangnan University, Wuxi, Jiangsu 214122, China
Abstract: One of the main challenges for driver assistance systems is detecting multi-occluded pedestrians in real time in complicated scenes, so as to reduce the number of traffic accidents. To improve the accuracy and speed of the detection system, we propose a real-time multi-occluded pedestrian detection algorithm based on R-FCN. An RoI Align layer is introduced to resolve misalignments between the feature map and the RoIs of the original image. A separable convolution is optimized to reduce the dimensions of the position-sensitive score maps and thereby improve detection speed. For occluded pedestrians, a multi-scale context algorithm is proposed, which adopts a local competition mechanism for adaptive context scale selection. For body occlusion with low visibility, deformable RoI pooling layers are introduced to expand the pooled area of the body model. Finally, to reduce redundant information in the video sequence, the Seq-NMS algorithm replaces the traditional NMS algorithm. Experiments show a low detection error on the Caltech and ETH datasets; the accuracy of our algorithm is better than that of the other detection algorithms compared on these datasets, and it works particularly well with occluded pedestrians.
Keywords: multi-occluded pedestrian; separable convolution layer; multi-scale context; deformable RoI pooling layer

1 Introduction

 Fig. 1 Schematic of the network structure
2 R-FCN network

 Fig. 2 Schematic of the R-FCN structure
 ${r_c}(i, j|\mathit{\Theta} ) = \frac{1}{n}\sum\limits_{(x, y) \in bin(i, j)} {{z_{i, j, c}}} (x + {x_0}, y + {y_0}|\mathit{\Theta} ),$ (1)
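
As an illustration of Eq. (1), the following minimal Python sketch averages the position-sensitive score map assigned to each bin over that bin's pixels. The function name `ps_roi_pool`, the channel layout `(i * k + j) * num_classes + c`, the floor/ceil bin boundaries, and the convention that `i` runs along the width are assumptions made for this example and are not confirmed by the text.

```python
import math

def ps_roi_pool(score_maps, roi, k, num_classes):
    """Position-sensitive RoI pooling in the spirit of Eq. (1).

    score_maps[ch][y][x] holds k*k*num_classes maps (layout is an assumption).
    roi = (x0, y0, w, h), the RoI's top-left corner and size in feature-map pixels.
    Returns out[c][i][j] = r_c(i, j).
    """
    x0, y0, w, h = roi
    out = [[[0.0] * k for _ in range(k)] for _ in range(num_classes)]
    for c in range(num_classes):
        for i in range(k):          # bin index along the width (assumed)
            for j in range(k):      # bin index along the height (assumed)
                xs = range(math.floor(i * w / k), math.ceil((i + 1) * w / k))
                ys = range(math.floor(j * h / k), math.ceil((j + 1) * h / k))
                # Only the score map dedicated to bin (i, j) and class c is read.
                z = score_maps[(i * k + j) * num_classes + c]
                total = sum(z[y0 + y][x0 + x] for y in ys for x in xs)
                n = len(xs) * len(ys)   # number of pixels in the bin
                out[c][i][j] = total / n
    return out

if __name__ == "__main__":
    # Four constant 4x4 maps (values 1..4), one class, 2x2 bins.
    maps = [[[float(ch + 1)] * 4 for _ in range(4)] for ch in range(4)]
    print(ps_roi_pool(maps, (0, 0, 4, 4), k=2, num_classes=1))
```

Because each bin reads only its own map, the pooled response for a bin depends on what that map learned for the corresponding relative position inside the RoI.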

3.3 Deformable pooling layer

 Fig. 4 Illustration of 3×3 deformable RoI pooling
 ${{\mathit{\boldsymbol{y}}}}(i, j){\rm{ }} = \sum\nolimits_{p \in bin(i, j)} {{{\mathit{\boldsymbol{x}}}}({p_0} + p)/{n_{i, j}}} ,$ (4)

 ${{\mathit{\boldsymbol{y}}}}(i, j){\rm{ }} = \sum\nolimits_{p \in bin(i, j)} {{{\mathit{\boldsymbol{x}}}}({p_0} + p + \Delta {p_{i, j}})/{n_{i, j}}} .$ (5)

 ${{\mathit{\boldsymbol{x}}}}(p) = \sum\nolimits_q {G(q, p) \cdot {{\mathit{\boldsymbol{x}}}}(q)},$ (6)
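
Eqs. (4) to (6) can be sketched together in a few lines of Python. This is a minimal illustration under stated assumptions: the kernel G in Eq. (6) is taken to be the standard bilinear kernel, the offsets Δp_{i,j} are given directly in feature-map pixels as `(dx, dy)` pairs, and `i` runs along the width. The names `bilinear` and `deform_roi_pool` are hypothetical, and how the offsets are predicted is outside the sketch.

```python
import math

def bilinear(feat, py, px):
    """Eq. (6): x(p) = sum_q G(q, p) * x(q), with G assumed to be bilinear."""
    height, width = len(feat), len(feat[0])
    y_lo, x_lo = math.floor(py), math.floor(px)
    val = 0.0
    for qy in (y_lo, y_lo + 1):
        for qx in (x_lo, x_lo + 1):
            if 0 <= qy < height and 0 <= qx < width:
                g = max(0.0, 1 - abs(qy - py)) * max(0.0, 1 - abs(qx - px))
                val += g * feat[qy][qx]
    return val

def deform_roi_pool(feat, roi, k, offsets=None):
    """Average pooling over k x k bins of an RoI.

    roi = (x0, y0, w, h) with p0 = (x0, y0) the top-left corner.
    offsets[i][j] = (dx, dy) shifts every sample of bin (i, j), as in Eq. (5);
    offsets=None reduces to the plain RoI pooling of Eq. (4).
    """
    x0, y0, w, h = roi
    out = [[0.0] * k for _ in range(k)]
    for i in range(k):          # bin index along the width (assumed)
        for j in range(k):      # bin index along the height (assumed)
            dx, dy = offsets[i][j] if offsets else (0.0, 0.0)
            xs = range(math.floor(i * w / k), math.ceil((i + 1) * w / k))
            ys = range(math.floor(j * h / k), math.ceil((j + 1) * h / k))
            total = sum(bilinear(feat, y0 + py + dy, x0 + px + dx)
                        for py in ys for px in xs)
            out[i][j] = total / (len(xs) * len(ys))   # divide by n_ij
    return out

if __name__ == "__main__":
    # Feature map whose value equals the column index.
    feat = [[float(col) for col in range(4)] for _ in range(4)]
    shift = [[(0.5, 0.0), (0.0, 0.0)], [(0.0, 0.0), (0.0, 0.0)]]
    print(deform_roi_pool(feat, (0, 0, 4, 4), 2))
    print(deform_roi_pool(feat, (0, 0, 4, 4), 2, shift))
```

Shifting a bin by a fractional offset moves its sampling points between pixel centres, which is why the interpolation of Eq. (6) is needed before the average of Eq. (5) can be taken.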

3.5 Training and detection steps of the algorithm

4 Experimental results and analysis

 $\mathrm{IoU} = \frac{{\mathrm{area}({B_{{\rm{dt}}}} \cap {B_{{\rm{gt}}}})}}{{\mathrm{area}({B_{{\rm{dt}}}} \cup {B_{{\rm{gt}}}})}} > 0.5,$ (10)

 $FPPI = \frac{{FP}}{{TN + FP}} \times 100\% {\rm{ }},$ (11)
 $MR = \frac{{FN}}{{FN + TP}} \times 100\% {\rm{ }},$ (12)
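
The matching criterion of Eq. (10) and the miss rate of Eq. (12) can be illustrated with a short Python sketch. The corner format `(x1, y1, x2, y2)` for boxes and the function names `iou`, `is_match` and `miss_rate` are assumptions made for this example.

```python
def iou(b_dt, b_gt):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(b_dt[2], b_gt[2]) - max(b_dt[0], b_gt[0]))
    ih = max(0.0, min(b_dt[3], b_gt[3]) - max(b_dt[1], b_gt[1]))
    inter = iw * ih
    area_dt = (b_dt[2] - b_dt[0]) * (b_dt[3] - b_dt[1])
    area_gt = (b_gt[2] - b_gt[0]) * (b_gt[3] - b_gt[1])
    union = area_dt + area_gt - inter
    return inter / union if union > 0 else 0.0

def is_match(b_dt, b_gt, threshold=0.5):
    """Eq. (10): a detection matches a ground-truth box when IoU > 0.5."""
    return iou(b_dt, b_gt) > threshold

def miss_rate(fn, tp):
    """Eq. (12): MR = FN / (FN + TP) x 100%."""
    return fn / (fn + tp) * 100

if __name__ == "__main__":
    detection, truth = (0, 0, 2, 2), (1, 0, 3, 2)
    print(iou(detection, truth), is_match(detection, truth))
    print(miss_rate(fn=1, tp=3))
```

In this example the two boxes overlap on half of each box, giving an IoU of 1/3, so the detection does not count as a match under the 0.5 threshold.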

4.1 Comparison of experimental results on Caltech

 Fig. 5 Comparison of results on the Caltech benchmark. (a) Partial occlusion; (b) Heavy occlusion

 Algorithm    Test size  Base-model  Part-occlusion (MR)/%  Heavy-occlusion (MR)/%  Speed/(f/s)
 Fast D-FCN   640x480    ResNet-50   14.86                  42.36                   48.71
 SSD          512x512    ResNet-50   20.49                  57.64                   35.42
 R-FCN        640x480    ResNet-50   16.09                  55.81                   11.24

4.2 Comparison of experimental results on ETH

 Fig. 6 Results on the ETH benchmark

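Detection results on the Caltech data are shown in Fig. 7(a) and 7(b), where 7(a) shows partial occlusion and 7(b) shows heavy occlusion. Detection results on the ETH data are shown in Fig. 7(c) and 7(d), where 7(c) shows partial occlusion and 7(d) shows heavy occlusion.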

 Fig. 7 Detection results of the algorithm
5 Conclusion

