Overview: Salient object detection (SOD) aims to detect and segment the most important foreground objects in a scene, modeling the mechanism of human visual attention. The task comes in several variants, including RGB SOD, light-field SOD, RGB-D SOD, and high-resolution SOD; in video scenes there are object SOD and fixation SOD, and the task can be further divided into object-level and instance-level saliency detection. To address the multi-scale feature fusion problem that existing salient object detection algorithms face in complex scenes, this paper proposes a fusion method based on a multi-layer sub-network cascade with hybrid information flows. First, an FCN backbone and a feature pyramid structure are used to learn multi-scale features. A cascaded network framework is then built through layer-wise mining by multi-layer sub-networks, so that the context information of the features at each level is fully exploited. Because the way information is extracted and propagated determines the effectiveness of the final feature fusion, hybrid information flows are used to integrate the multi-scale features and learn more discriminative feature information. To fuse semantic information, high-level semantic features are used to guide the lower layers, yielding more effective context information. Features are fused by channel concatenation; the sampling steps are accompanied by convolution layers that smooth the fused feature maps, making the next round of fusion more effective. The resulting saliency features are then propagated as mask information, which realizes efficient transmission of the information flows and further separates the foreground from cluttered backgrounds. Finally, the multi-stage saliency maps are combined by nonlinear weighted fusion so that redundant features complement one another.
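The fusion step described above (upsample the deeper semantic feature map, concatenate it with the shallower map along the channel axis, then smooth with a convolution) can be sketched in NumPy. This is a minimal illustration of channel-concatenation fusion, not the paper's implementation: the function names, tensor shapes, and the use of nearest-neighbor upsampling and a 1x1 convolution are all assumptions for the sketch.

```python
import numpy as np

def upsample2x(x):
    # Nearest-neighbor upsampling by a factor of 2 (x has shape C, H, W)
    return x.repeat(2, axis=1).repeat(2, axis=2)

def conv1x1(x, w):
    # A 1x1 convolution is a channel-mixing matmul; w has shape (C_out, C_in)
    c, h, width = x.shape
    return (w @ x.reshape(c, -1)).reshape(w.shape[0], h, width)

def fuse(low, high, w):
    # Channel-concatenation fusion: upsample the deeper (semantic) map so its
    # resolution matches the shallow map, concatenate along channels, then
    # smooth the fused map with a 1x1 conv before the next fusion stage
    up = upsample2x(high)
    cat = np.concatenate([low, up], axis=0)
    return conv1x1(cat, w)

rng = np.random.default_rng(0)
low = rng.standard_normal((64, 32, 32))    # shallow, high-resolution features
high = rng.standard_normal((128, 16, 16))  # deep, semantic features
w = rng.standard_normal((64, 192)) * 0.01  # hypothetical 1x1 conv weights
fused = fuse(low, high, w)
print(fused.shape)  # (64, 32, 32)
```

The smoothing convolution matters here: concatenation alone leaves the two feature sources in disjoint channel blocks, and the 1x1 conv is what actually mixes them before the fused map is passed down the cascade.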
Compared with 9 existing algorithms on 6 public datasets, the proposed algorithm runs at 20.76 frames per second and achieves generally optimal results on 5 evaluation metrics, even on the challenging new SOC dataset, where it clearly outperforms classic algorithms. The results improve by 1.96%, 3.53%, 0.94%, and 0.26% on F-measure, weighted F-measure, S-measure, and E-measure, respectively, demonstrating the accuracy and robustness of the proposed model. Qualitative visual analysis, correlation analysis between the different metrics, and running-speed analysis further highlight the model's superior performance. In addition, ablation experiments verify the effectiveness of each module, further explaining the efficiency of the proposed cascade framework with hybrid information flows and attention mechanisms. The model may provide a new approach to multi-scale fusion and is conducive to further study.
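Two of the metrics cited above are simple enough to sketch directly: MAE (mean absolute error between the predicted saliency map and the binary ground truth) and the F-measure with the beta-squared = 0.3 convention that is standard in salient object detection. The threshold value and test arrays below are illustrative assumptions; benchmark implementations typically sweep the threshold rather than fix it.

```python
import numpy as np

def mae(pred, gt):
    # Mean absolute error between a [0,1] saliency map and binary ground truth
    return np.abs(pred - gt).mean()

def f_measure(pred, gt, beta2=0.3, thresh=0.5):
    # F-measure with beta^2 = 0.3, weighting precision over recall as is
    # conventional in salient object detection benchmarks
    binary = pred >= thresh
    tp = np.logical_and(binary, gt.astype(bool)).sum()
    precision = tp / max(binary.sum(), 1)
    recall = tp / max(gt.sum(), 1)
    if precision + recall == 0:
        return 0.0
    return (1 + beta2) * precision * recall / (beta2 * precision + recall)

gt = np.zeros((8, 8))
gt[2:6, 2:6] = 1          # a 4x4 salient square as toy ground truth
pred = gt.copy()           # a perfect prediction for sanity checking
print(mae(pred, gt))       # 0.0
print(f_measure(pred, gt)) # 1.0
```

S-measure and E-measure are structure- and alignment-aware and considerably more involved; they are omitted from this sketch.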
The multi-layer sub-network cascade model with hybrid information flows for saliency detection proposed in this paper
Hybrid information flows
Visual comparison before and after adding the hybrid information flow
Correlation analysis between MAE and S-measure/E-measure
Qualitative comparison between the proposed algorithm and other models
(a) Original image; (b) FCN backbone; (c) FCN + cascade structure; (d) After adding the hybrid information flow mechanism; (e) After introducing the attention mechanism; (f) After nonlinear fusion