Citation: Shen X L, Wang J H, Wu Z W. Dynamic SAR image target detection by fusing space-frequency domain[J]. Opto-Electron Eng, 2025, 52(1): 240245. doi: 10.12086/oee.2025.240245
A dynamic SAR image target detection algorithm fusing the spatial and frequency domains is proposed to address several challenges inherent to SAR imagery: significant feature variability, imbalanced target scales, and strong speckle noise in background regions. These challenges reduce detection accuracy and slow inference, hindering real-time applications. The proposed method overcomes these limitations through several components that improve both detection performance and computational efficiency.

The algorithm first employs a dual-stream perception strategy to construct spatial-frequency perception units. This design integrates dynamic receptive fields with fractional-order Gabor transforms, markedly improving the model's ability to capture spatial diversity and frequency-domain scattering features. By expanding the receptive field adaptively, the algorithm captures both local and global context, extracting complex patterns from the input more effectively. The fractional-order Gabor transform further sharpens the model's sensitivity to fine-grained texture and frequency features, helping retain important global contextual information. Together, these improvements speed up inference by minimizing redundant feature representations, suppressing interference from background noise, and decreasing the similarity of feature mapping patterns, thereby reducing the missed and false detections that are typical in cluttered SAR images.

Next, a re-parameterization-based adaptive feature fusion module is introduced to optimize the interaction between multi-scale features. This module integrates features across scales efficiently, enriching feature diversity and mitigating the discrepancies introduced during resampling. The fusion process also highlights the salience of small targets and key frequency information, both of which are difficult to capture in traditional SAR detection frameworks. This enhanced multi-scale integration improves detection accuracy, particularly for the small, subtle objects that matter in applications such as maritime surveillance and remote sensing.

To further strengthen localization, a dynamic regression loss function, DY_IoU, is incorporated. It combines adaptive scale penalty factors with a dynamic non-monotonic attention mechanism to counter anchor-box expansion and positional deviation. By adjusting the training focus dynamically, the model localizes multi-scale targets more precisely; the improved loss also accelerates convergence, reduces the computational burden, and keeps the algorithm lightweight and efficient for practical deployment.

The proposed method was evaluated on two publicly available datasets, SAR-AIRcraft-1.0 and HRSID. Experimental results show that the algorithm achieves mAP@0.5 values of 95.9% and 98.8%, respectively, representing 5.2% and 1.2% improvements over the baseline models, and that it outperforms the other comparison algorithms. These results confirm that the algorithm not only enhances detection accuracy but also exhibits strong robustness and generalization, making it suitable for a wide range of real-world applications.
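The spatial-frequency perception unit itself is detailed in the paper body rather than in this overview, so the following PyTorch sketch only illustrates the frequency-branch idea in minimal form: a fixed, plain (not fractional-order) Gabor filter bank applied as a depthwise convolution, with the filtered response added back to the spatial stream. All names (`gabor_kernel`, `GaborPerception`) and parameter values are illustrative assumptions, not the paper's implementation.

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

def gabor_kernel(size, theta, sigma=2.0, lam=4.0, gamma=0.5):
    """Build one real-valued Gabor kernel with orientation theta."""
    half = size // 2
    ys, xs = torch.meshgrid(
        torch.arange(-half, half + 1, dtype=torch.float32),
        torch.arange(-half, half + 1, dtype=torch.float32),
        indexing="ij",
    )
    x_rot = xs * math.cos(theta) + ys * math.sin(theta)
    y_rot = -xs * math.sin(theta) + ys * math.cos(theta)
    envelope = torch.exp(-(x_rot ** 2 + gamma ** 2 * y_rot ** 2) / (2 * sigma ** 2))
    carrier = torch.cos(2 * math.pi * x_rot / lam)
    return envelope * carrier

class GaborPerception(nn.Module):
    """Hypothetical frequency branch: each channel is convolved with an
    orientation-selective Gabor kernel to emphasize scattering cues,
    then added residually to the unchanged spatial stream."""
    def __init__(self, channels, size=5, orientations=4):
        super().__init__()
        kernels = [gabor_kernel(size, o * math.pi / orientations)
                   for o in range(orientations)]
        # Tile the orientation bank across channels, one kernel per channel.
        bank = torch.stack([kernels[c % orientations] for c in range(channels)])
        self.register_buffer("weight", bank.unsqueeze(1))  # (C, 1, k, k)
        self.channels = channels
        self.pad = size // 2

    def forward(self, x):
        freq = F.conv2d(x, self.weight, padding=self.pad, groups=self.channels)
        return x + freq  # residual fusion of spatial and frequency streams

x = torch.randn(1, 16, 64, 64)      # toy SAR feature map
print(GaborPerception(16)(x).shape)  # torch.Size([1, 16, 64, 64])
```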
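Likewise, the re-parameterization details of the adaptive feature fusion module are not reproduced in this overview. One generic pattern it builds on is replacing fixed element-wise summation or channel concatenation with learnable branch weights; the sketch below shows that pattern under the assumption that the branches have already been resampled to a common shape by the neck. `AdaptiveFusion` is a hypothetical name, not the paper's module.

```python
import torch
import torch.nn as nn

class AdaptiveFusion(nn.Module):
    """Learnable per-branch weights that softly select among multi-scale
    features, rather than a plain element-wise sum or channel concat."""
    def __init__(self, num_branches, channels):
        super().__init__()
        self.logits = nn.Parameter(torch.zeros(num_branches))
        self.proj = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, feats):
        # feats: list of tensors with identical (N, C, H, W) shapes.
        w = torch.softmax(self.logits, dim=0)      # normalized branch weights
        fused = sum(wi * fi for wi, fi in zip(w, feats))
        return self.proj(fused)                    # 1x1 conv to mix channels

f1 = torch.randn(1, 32, 40, 40)
f2 = torch.randn(1, 32, 40, 40)
print(AdaptiveFusion(2, 32)([f1, f2]).shape)  # torch.Size([1, 32, 40, 40])
```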
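Finally, the exact DY_IoU formulation (adaptive scale penalty factors plus the dynamic non-monotonic attention term) is not spelled out in this overview. The sketch below conveys only the core idea of non-monotonic focusing, loosely in the spirit of Wise-IoU: each anchor's regression loss is reweighted by its "outlierness" relative to the batch, so medium-quality anchors receive the largest gradients while very easy and very poor anchors are down-weighted. The function names and the constant `delta` are assumptions for illustration.

```python
import torch

def iou_xyxy(pred, target, eps=1e-7):
    """IoU between boxes in (x1, y1, x2, y2) format; both shaped (N, 4)."""
    x1 = torch.max(pred[:, 0], target[:, 0])
    y1 = torch.max(pred[:, 1], target[:, 1])
    x2 = torch.min(pred[:, 2], target[:, 2])
    y2 = torch.min(pred[:, 3], target[:, 3])
    inter = (x2 - x1).clamp(min=0) * (y2 - y1).clamp(min=0)
    area_p = (pred[:, 2] - pred[:, 0]) * (pred[:, 3] - pred[:, 1])
    area_t = (target[:, 2] - target[:, 0]) * (target[:, 3] - target[:, 1])
    return inter / (area_p + area_t - inter + eps)

def dynamic_iou_loss(pred, target, delta=2.0, eps=1e-7):
    """Hypothetical dynamic IoU loss with a non-monotonic focusing weight."""
    raw = 1.0 - iou_xyxy(pred, target)
    # Outlierness of each anchor relative to the batch mean; detached so
    # the weighting itself contributes no gradient.
    beta = (raw / (raw.mean() + eps)).detach()
    # Non-monotonic focusing: the weight peaks at beta == delta and decays
    # for both easier (small beta) and harder (large beta) anchors.
    weight = (beta / delta) * torch.exp(1.0 - beta / delta)
    return (weight * raw).mean()

pred = torch.tensor([[0., 0., 10., 10.], [5., 5., 20., 25.]])
gt   = torch.tensor([[1., 1., 11., 11.], [5., 5., 18., 22.]])
print(dynamic_iou_loss(pred, gt))
```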
Structure of the YOLOv10s algorithm
Structure of the dynamic SAR image target detection algorithm fusing the spatial and frequency domains
Structure of SFDS
Global spatial awareness (GSA) module structure
Structure of frequency domain awareness (FDA) module
Current feature fusion methods. (a) Element-wise summation; (b) Channel concatenation
Adaptive feature fusion (AFF) module structure
Visualization of DY_IoU and CIoU regression calculations. (a) DY_IoU; (b) CIoU
Structure of the gradient adjustment function based on anchor box quality
Comparison of regression processes
Visualization results of the comparison experiments (I). (a) SKG-Net; (b) CenterNet; (c) Faster R-CNN; (d) YOLOv5s
Visualization results of the comparison experiments (II). (a) SFS-CNet; (b) YOLOv8s; (c) YOLOv10s; (d) Proposed algorithm