Citation: Xie B, Liu Y Q, Li Y L. Colorectal polyp segmentation method combining polarized self-attention and Transformer[J]. Opto-Electron Eng, 2024, 51(10): 240179. doi: 10.12086/oee.2024.240179
Colorectal cancer is one of the most common malignant diseases, and its morbidity and mortality remain high. An algorithm that automatically recognizes and segments colorectal polyps in images is therefore urgently needed to help doctors diagnose patients more efficiently. Traditional colorectal polyp segmentation methods, however, require manual extraction of lesion features, and their integration strategies rely heavily on the experience of the implementer. As a result, they are prone to inaccurate target segmentation, insufficient contrast, and blurred edge details. To address these problems, this paper proposes TPSA-Net, a new colorectal polyp segmentation network that combines polarized self-attention and Transformer. First, to make better use of the semantic information of image patches at different phase levels, an improved phase-aware hybrid module (PAHM) is designed; it dynamically captures multi-scale context information at different levels of colorectal polyp images and thereby improves the accuracy of target segmentation. Second, a polarized self-attention (PSA) module is introduced to take full account of pixel-level characteristics and strengthen the self-attention of the image, improving the contrast between the lesion area and the normal tissue area. Finally, a cross-cue fusion (CCF) module enhances the ability to dynamically capture the geometric structure of the image and improves the complementarity of the single-frame and multi-frame cues, addressing the problem of blurred edge details during colorectal polyp segmentation.
Experiments were conducted on four datasets: CVC-ClinicDB, Kvasir, CVC-ColonDB and ETIS-LaribPolypDB. The Dice similarity coefficients were 0.946, 0.927, 0.805 and 0.781, respectively, improvements of 12.4%, 14.5%, 29.3% and 37.5% over U-Net, a traditional medical image segmentation network. The mean intersection over union (mIoU) values were 0.901, 0.880, 0.729 and 0.706, respectively, indicating that the method has application value in the diagnosis of colorectal polyps. Extensive experimental results show that TPSA-Net not only improves the accuracy and contrast of colorectal polyp segmentation but also overcomes the problem of blurred details in the segmented image. Developing simpler and more efficient deep-learning-based colorectal polyp segmentation methods is the focus of future work.
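The Dice and mIoU figures reported above are overlap measures between a predicted mask and a ground-truth mask. As a rough illustration only, the sketch below computes both for binary masks in plain Python. The function names `dice_and_iou` and `mean_scores`, the smoothing constant `eps`, and the choice to average per image are assumptions made for this example; the abstract does not state how the authors average their scores.

```python
def dice_and_iou(pred, target, eps=1e-8):
    """Return (Dice, IoU) for two binary masks given as nested lists of 0/1."""
    # Flatten both masks row by row so they can be compared pixel by pixel.
    p = [int(v) for row in pred for v in row]
    t = [int(v) for row in target for v in row]
    if len(p) != len(t):
        raise ValueError("masks must have the same number of pixels")
    inter = sum(a & b for a, b in zip(p, t))  # pixels that are foreground in both
    p_sum, t_sum = sum(p), sum(t)
    union = p_sum + t_sum - inter
    # eps (an assumption of this sketch) avoids division by zero on empty masks.
    dice = (2 * inter + eps) / (p_sum + t_sum + eps)
    iou = (inter + eps) / (union + eps)
    return dice, iou


def mean_scores(pairs):
    """Average Dice and IoU over an iterable of (prediction, ground truth) pairs."""
    scores = [dice_and_iou(p, t) for p, t in pairs]
    n = len(scores)
    return sum(d for d, _ in scores) / n, sum(i for _, i in scores) / n


# Toy 2x3 masks: two overlapping foreground pixels, three foreground pixels each.
pred = [[1, 1, 0],
        [0, 1, 0]]
truth = [[1, 0, 0],
         [0, 1, 1]]
dice, iou = dice_and_iou(pred, truth)
print(round(dice, 3), round(iou, 3))  # 0.667 0.5
```

For a single pair of masks the two measures are related by Dice = 2·IoU / (1 + IoU), which is why the Dice values are consistently higher than the corresponding IoU values.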
Colorectal polyp segmentation network combining polarized self-attention and Transformer
Phase-aware hybrid module
Segmentation results obtained with or without PAHM
Polarized self-attention module
Segmentation results obtained with or without PSA
Cross-cue fusion module
Segmentation results obtained with or without CCF
Visualization of segmentation results of different network models on CVC-ClinicDB and Kvasir datasets
Visualization of segmentation results of different network models on CVC-ColonDB and ETIS-LaribPolypDB datasets