Citation: Li S B, Xiao Z J, Qu H C, et al. Multi-granularity feature and shape-position similarity metric method for ship detection in SAR images[J]. Opto-Electron Eng, 2025, 52(2): 240254. doi: 10.12086/oee.2025.240254

Multi-granularity feature and shape-position similarity metric method for ship detection in SAR images

    Fund Project: The Scientific Research Foundation of the Higher Education Institutions of Liaoning Province (LJKMZ20220699), Discipline Innovation Team of Liaoning Technical University (LNTU20TD-23)
  • To address the challenges of complex backgrounds and target scale variation in synthetic aperture radar (SAR) images, especially the false and missed detections that arise in scenes dense with small targets, a multi-granularity feature and shape-position similarity metric method for ship detection in SAR images is proposed. First, a two-branch multi-granularity feature aggregation structure is designed for the feature extraction stage. One branch decomposes the feature maps in cascade via the Haar wavelet transform, expanding the global receptive field to extract coarse-grained features; the other introduces spatial and channel reconstruction convolution to capture detailed texture information, minimizing the loss of contextual information. By synergistically exploiting the interaction of local and non-local features, the two branches suppress complex background and clutter interference and achieve accurate multi-scale feature extraction. Next, a shape-position similarity metric that combines Euclidean distance with position and shape information is proposed to resolve the sensitivity to position deviation in scenes dense with small targets, thereby balancing the allocation of positive and negative samples. In a comprehensive comparison with 11 detectors from the one-stage, two-stage, and DETR families on the SSDD and HRSID datasets, our method achieves mAP50 scores of 98.3% and 93.8% and mAP scores of 68.8% and 70.8%, respectively. In addition, the model is highly efficient, with just 2.4 M parameters and a computational load of only 6.4 GFLOPs, outperforming the comparison methods. The proposed method delivers excellent detection performance under complex backgrounds and across ship targets of different scales, reducing both the false and missed detection rates while keeping the parameter count and computational complexity low.
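The core operation of the coarse-grained branch is a cascaded Haar wavelet decomposition of the feature maps. As a rough illustration of that mechanism only (not the authors' implementation), here is a minimal PyTorch sketch of a single-level 2D Haar transform realized as fixed stride-2 depthwise convolutions; the module name and filter layout are our assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class HaarWaveletDownsample(nn.Module):
    """Single-level 2D Haar DWT as fixed stride-2 depthwise convolutions.

    Each input channel is split into LL/LH/HL/HH sub-bands, halving the
    spatial resolution and enlarging the effective receptive field --
    illustrative of the coarse-grained branch, not the paper's exact module.
    """
    def __init__(self):
        super().__init__()
        ll = torch.tensor([[0.5, 0.5], [0.5, 0.5]])     # low-pass (average)
        lh = torch.tensor([[0.5, 0.5], [-0.5, -0.5]])   # horizontal detail
        hl = torch.tensor([[0.5, -0.5], [0.5, -0.5]])   # vertical detail
        hh = torch.tensor([[0.5, -0.5], [-0.5, 0.5]])   # diagonal detail
        # Fixed (non-learnable) filter bank of shape (4, 1, 2, 2)
        self.register_buffer("filters", torch.stack([ll, lh, hl, hh]).unsqueeze(1))

    def forward(self, x):
        b, c, h, w = x.shape
        f = self.filters.repeat(c, 1, 1, 1)         # (4c, 1, 2, 2)
        return F.conv2d(x, f, stride=2, groups=c)   # (b, 4c, h/2, w/2)

# Cascading the transform (re-applying it to the low-frequency sub-band)
# yields the multi-level decomposition the abstract describes.
x = torch.randn(1, 16, 64, 64)
print(HaarWaveletDownsample()(x).shape)  # torch.Size([1, 64, 32, 32])
```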

  • Synthetic aperture radar (SAR) imaging offers all-weather, all-day observation: it penetrates clouds, rain, and fog, provides high-resolution imagery, and achieves large-scale coverage with fast scanning. These advantages have made SAR images widely used for ship detection in maritime patrol, search and rescue, fishery supervision, traffic management, and military surveillance. In practice, however, complex background interference, large variations in target scale, and dense distributions of small targets make missed and false detections common. This paper proposes a multi-granularity feature and shape-position similarity method for ship detection in SAR images to address these challenges and improve detection performance.

In the feature extraction stage, a dual-branch multi-granularity feature aggregation (MGFA) structure is designed to cope with the complex backgrounds and multi-scale targets in SAR images. MGFA uses two branches, coarse-grained and fine-grained, for bidirectional feature extraction. The coarse-grained branch is built on the Haar wavelet transform, which decomposes the input features at multiple levels, expanding the model's receptive field and strengthening its ability to capture global context. The fine-grained branch uses spatial and channel reconstruction convolution (SCConv), which focuses on details and local information; SCConv efficiently captures the fine texture of small targets and reduces the influence of clutter interference and background noise, improving small-target detection accuracy.

In the detection regression stage, a shape-position similarity (SPS) metric based on Euclidean distance is proposed to replace the traditional IoU. For small ships, SPS resolves the sensitivity to position deviation through the center distance, accurately reflecting positional offset even when the bounding boxes do not overlap. For large ships, SPS introduces shape constraints that strengthen attention to the shape of the predicted box, compensating for IoU's exclusive focus on positional overlap. By jointly considering positional and shape similarity, SPS effectively handles SAR ship detection scenes with dense small targets and significant scale changes; a sketch of this idea is given below.

In a comprehensive comparison with 11 detectors from the one-stage, two-stage, and DETR families on the SSDD and HRSID datasets, our method achieves mAP50 scores of 98.3% and 93.8% and mAP scores of 68.8% and 70.8%, respectively. In addition, the model is highly efficient, with just 2.4 M parameters and a computational load of only 6.4 GFLOPs, outperforming the comparison methods.
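The overview does not give the SPS formula, so the following Python sketch is only a plausible reading of the description: a DIoU-style normalized Euclidean center-distance term combined with a width/height shape-consistency term. The function name, the normalization by the enclosing-box diagonal, and the equal weighting of the two terms are all our assumptions, not the paper's exact formulation.

```python
import torch

def shape_position_similarity(pred, target, eps=1e-7):
    """Hypothetical shape-position similarity for (x1, y1, x2, y2) boxes.

    Unlike IoU, the position term stays informative when the boxes do not
    overlap, and the shape term rewards matching widths and heights.
    """
    # Box centers and sizes
    pcx, pcy = (pred[..., 0] + pred[..., 2]) / 2, (pred[..., 1] + pred[..., 3]) / 2
    tcx, tcy = (target[..., 0] + target[..., 2]) / 2, (target[..., 1] + target[..., 3]) / 2
    pw, ph = pred[..., 2] - pred[..., 0], pred[..., 3] - pred[..., 1]
    tw, th = target[..., 2] - target[..., 0], target[..., 3] - target[..., 1]

    # Position term: squared Euclidean center distance, normalized by the
    # squared diagonal of the smallest enclosing box (as in DIoU)
    cw = torch.max(pred[..., 2], target[..., 2]) - torch.min(pred[..., 0], target[..., 0])
    ch = torch.max(pred[..., 3], target[..., 3]) - torch.min(pred[..., 1], target[..., 1])
    position = 1 - ((pcx - tcx) ** 2 + (pcy - tcy) ** 2) / (cw ** 2 + ch ** 2 + eps)

    # Shape term: relative width/height mismatch
    dw = torch.abs(pw - tw) / (torch.max(pw, tw) + eps)
    dh = torch.abs(ph - th) / (torch.max(ph, th) + eps)
    shape = 1 - (dw + dh) / 2

    return (position + shape) / 2  # equal weighting is an assumption
```

Used as a label-assignment score, a higher SPS would mark a candidate box as a positive sample; because the position term degrades smoothly with center distance rather than collapsing to zero at non-overlap, it matches the stated goal of balancing positive and negative samples for small, densely packed ships.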
