Light-field image super-resolution based on multi-scale feature fusion

Citation: Zhao Y Y, Shi S X. Light-field image super-resolution based on multi-scale feature fusion[J]. Opto-Electron Eng, 2020, 47(12): 200007. doi: 10.12086/oee.2020.200007


  • Fund project: Supported by the National Natural Science Foundation of China (11772197)
  • Corresponding author: Shi Shengxian (1980-), male, Ph.D., associate professor, mainly engaged in research on machine vision and light-field 3D measurement. E-mail: kirinshi@sjtu.edu.cn
  • CLC number: TP391.4

  • Abstract: As a new generation of imaging device, the light-field camera simultaneously captures the spatial position and incident angle of light rays. However, the recorded light field involves a trade-off between spatial and angular resolution; in particular, the limited spatial resolution of the sub-aperture images restricts the application scenarios of light-field cameras. This paper therefore proposes a light-field image super-resolution network that fuses multi-scale features to obtain light-field sub-aperture images of higher spatial resolution. The deep-learning-based framework consists of three modules: a multi-scale feature extraction module, a global feature fusion module, and an up-sampling module. The network first learns the structural features inherent in the 4D light field through the multi-scale feature extraction module, then fuses and enhances the multi-scale features with the fusion module, and finally performs light-field super-resolution with the up-sampling module. Experimental results on synthetic and real-world light-field datasets show that the method outperforms existing algorithms in both visual and numerical evaluations. In addition, the super-resolved light-field images are applied to depth estimation, and the results demonstrate that light-field spatial super-resolution improves the accuracy of depth estimation.

  • Overview: As a new generation of imaging equipment, a light-field camera can simultaneously capture the spatial position and incident angle of light rays. However, the recorded light field involves a trade-off between spatial resolution and angular resolution; in particular, the limited spatial resolution of the sub-aperture images restricts the application scenarios of light-field cameras. This paper therefore proposes a light-field super-resolution network that fuses multi-scale features to obtain a super-resolved light field. The deep-learning-based framework contains three major modules: a multi-scale feature extraction module, a global feature fusion module, and an up-sampling module. The design ideas of the modules are as follows.

    a) Multi-scale feature extraction module: To explore the complex texture information in the 4D light-field space, the feature extraction module uses ResASPP blocks to enlarge the receptive field and extract multi-scale features. The low-resolution light-field sub-aperture images are first sent to a Conv block and a Res block for low-level feature extraction; a ResASPP block and a Res block are then alternated twice to learn multi-scale features that accumulate the high-frequency information in the 4D light field, as sketched below.
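
    The paper's exact layer configuration is not listed here, but the general form of a ResASPP block (parallel dilated convolutions whose outputs are fused and added back to the input) can be sketched in PyTorch as follows. The dilation rates (1, 2, 4), the channel count, and the LeakyReLU activation are illustrative assumptions, not the authors' exact settings.

```python
import torch
import torch.nn as nn

class ResASPP(nn.Module):
    """Sketch of a residual ASPP block: parallel dilated 3x3 convolutions
    capture multi-scale context, a 1x1 convolution fuses the concatenated
    branches, and a skip connection adds the result back to the input."""

    def __init__(self, channels: int = 64):  # channel count is an assumption
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Sequential(
                # padding == dilation keeps the spatial size unchanged
                nn.Conv2d(channels, channels, 3, padding=d, dilation=d),
                nn.LeakyReLU(0.1, inplace=True),
            )
            for d in (1, 2, 4)  # assumed dilation rates, growing receptive field
        ])
        self.fuse = nn.Conv2d(3 * channels, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        multi_scale = torch.cat([branch(x) for branch in self.branches], dim=1)
        return x + self.fuse(multi_scale)  # residual connection
```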

    b) Global feature fusion module: The light-field images contain not only spatial information but also angular information, which together encode the inherent structure of the 4D light field. The global feature fusion module is proposed to geometrically reconstruct the super-resolved light field by exploiting these angular clues. Note that the feature maps of all the sub-images from the upstream modules are first stacked along the channel dimension of the network and then sent to this module for high-level feature extraction; a sketch of this stacking step follows.
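
    The channel-wise stacking described above, followed by a convolutional fusion head, might look like the sketch below. The fusion head's layer sizes and activation are illustrative assumptions; the paper's actual FusionB block (Fig. 2) is more elaborate.

```python
import torch
import torch.nn as nn

def stack_sub_aperture_features(feats) -> torch.Tensor:
    """Stack per-view feature maps along the channel dimension.

    feats: list of N tensors of shape (B, C, H, W), one per sub-aperture
    view. Returns a (B, N*C, H, W) tensor in which angular information
    across views is exposed to the subsequent convolutions."""
    return torch.cat(feats, dim=1)

def make_fusion_head(num_views: int, channels: int = 64) -> nn.Module:
    """Hypothetical fusion head: a 1x1 convolution mixes angular
    information across the stacked views, then a 3x3 convolution
    refines the fused spatial features."""
    return nn.Sequential(
        nn.Conv2d(num_views * channels, channels, kernel_size=1),
        nn.LeakyReLU(0.1, inplace=True),
        nn.Conv2d(channels, channels, kernel_size=3, padding=1),
    )
```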

    c) Up-sampling module: After the global features of the 4D light-field structure have been learned, the high-level feature maps are sent to the up-sampling module for light-field super-resolution. This module uses sub-pixel convolution (pixel shuffle) to achieve 2× spatial super-resolution; the feature maps are then passed to a conventional convolutional layer for feature fusion, and the network finally outputs a super-resolved array of light-field sub-aperture images. A sketch of this module is given below.
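
    Under those assumptions, a minimal up-sampler might look like this: a convolution expands the channels by a factor of scale², PixelShuffle rearranges them into a larger spatial grid, and a final convolution fuses the features into the output sub-image. The channel count and the 3-channel output are assumptions for illustration.

```python
import torch.nn as nn

def make_upsampler(channels: int = 64, scale: int = 2) -> nn.Module:
    """Sketch of 2x spatial up-sampling via sub-pixel convolution."""
    return nn.Sequential(
        # expand channels so PixelShuffle can fold them into space
        nn.Conv2d(channels, channels * scale ** 2, kernel_size=3, padding=1),
        nn.PixelShuffle(scale),  # (B, C*s^2, H, W) -> (B, C, s*H, s*W)
        # conventional convolution fusing features into the final sub-image
        nn.Conv2d(channels, 3, kernel_size=3, padding=1),
    )
```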

    The network proposed in this paper was applied to a synthetic light-field dataset and a real-world light-field dataset for light-field image super-resolution. The experimental results on both datasets show that the method outperforms other state-of-the-art methods in both visual and numerical evaluations. In addition, the super-resolved light-field images were applied to depth estimation, and the results show that light-field spatial super-resolution improves parallax calculation, especially in occluded and edge regions.

  • Figure 1.  Schematic diagram of light-field super-resolution based on multi-scale features. (a) Structure of the network; (b) ASPP block; (c) ResASPP block; (d) Structure of the fusion module


    Figure 2.  Schematic diagram of fusion block. (a) Structure of FusionB; (b) Comparison of multi-scale features and the features fused by FusionB


    Figure 3.  Light-field super resolution results on synthetic data. (a) Buddha; (b) Mona; (c) Papillon


    Figure 4.  Light-field super resolution results on real-world data. (a) Fence; (b) Cars; (c) Flowers


    Figure 5.  Depth estimation results. (a) Disparity map of Mona; (b) Disparity map of Flowers


    Figure 6.  Error maps between depth estimation results and the ground truth (unit: pixel). (a), (b) Mona's error map; (c), (d) Flowers's error map


    Table 1.  Performance comparison of different image super resolution algorithms on synthetic data

    Method      Buddha            Mona              Papillon
                PSNR/dB   SSIM    PSNR/dB   SSIM    PSNR/dB   SSIM
    Bicubic     33.0865   0.9208  32.6579   0.9301  33.4031   0.9365
    GBSR        35.7463   0.9568  38.1479   0.9769  38.7855   0.9802
    FALSR       34.9493   0.9373  34.8104   0.9412  34.7569   0.9504
    ResLF       35.4988   0.9689  34.3314   0.9614  35.1983   0.9754
    Proposed    39.8095   0.9807  41.5483   0.9865  41.0616   0.9852
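
    For reference, the PSNR values reported in Tables 1 and 2 follow the standard definition for 8-bit images (peak value 255); a minimal NumPy sketch is shown below. SSIM is likewise a standard metric (e.g., structural_similarity in scikit-image); whether evaluation used RGB or luminance channels is not specified here, so this is an illustration rather than the authors' exact evaluation code.

```python
import numpy as np

def psnr(reference: np.ndarray, estimate: np.ndarray, peak: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB between two same-shape images."""
    mse = np.mean((reference.astype(np.float64) - estimate.astype(np.float64)) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)  # higher is better
```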


    Table 2.  Performance comparison of different image super resolution algorithms on real-world data

    Method      Fence             Cars              Flowers
                PSNR/dB   SSIM    PSNR/dB   SSIM    PSNR/dB   SSIM
    Bicubic     30.8720   0.9541  31.3657   0.9401  30.2619   0.9194
    FALSR       35.1476   0.9639  31.5821   0.9422  31.5795   0.9192
    ResLF       34.9172   0.9844  31.6191   0.9722  31.3748   0.9538
    Proposed    31.5522   0.9816  35.2929   0.9800  40.6967   0.9874
    [1] Lippmann G. Épreuves réversibles donnant la sensation du relief[J]. Journal de Physique Théorique et Appliquée, 1908, 7(1): 821-825. doi: 10.1051/jphystap:019080070082100
    [2] Adelson E H, Wang J Y A. Single lens stereo with a plenoptic camera[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1992, 14(2): 99-106. doi: 10.1109/34.121783
    [3] Ng R, Levoy M, Brédif M, et al. Light field photography with a hand-held plenoptic camera[R]. Stanford Tech Report CTSR 2005-02, 2005.
    [4] Tan Z P, Johnson K, Clifford C, et al. Development of a modular, high-speed plenoptic-camera for 3D flow-measurement[J]. Optics Express, 2019, 27(9): 13400-13415. doi: 10.1364/OE.27.013400
    [5] Fahringer T W, Lynch K P, Thurow B S. Volumetric particle image velocimetry with a single plenoptic camera[J]. Measurement Science and Technology, 2015, 26(11): 115201. doi: 10.1088/0957-0233/26/11/115201
    [6] Shi S X, Ding J F, New T H, et al. Volumetric calibration enhancements for single-camera light-field PIV[J]. Experiments in Fluids, 2019, 60(1): 21. doi: 10.1007/s00348-018-2670-5
    [7] Shi S X, Ding J F, New T H, et al. Light-field camera-based 3D volumetric particle image velocimetry with dense ray tracing reconstruction technique[J]. Experiments in Fluids, 2017, 58(7): 78. doi: 10.1007/s00348-017-2365-3
    [8] Shi S X, Wang J H, Ding J F, et al. Parametric study on light field volumetric particle image velocimetry[J]. Flow Measurement and Instrumentation, 2016, 49: 70-88. doi: 10.1016/j.flowmeasinst.2016.05.006
    [9] Sun J, Xu C L, Zhang B, et al. Three-dimensional temperature field measurement of flame using a single light field camera[J]. Optics Express, 2016, 24(2): 1118-1132. doi: 10.1364/OE.24.001118
    [10] Shi S X, Xu S M, Zhao Z, et al. 3D surface pressure measurement with single light-field camera and pressure-sensitive paint[J]. Experiments in Fluids, 2018, 59(5): 79. doi: 10.1007/s00348-018-2534-z
    [11] Ding J F, Li H T, Ma H X, et al. A novel light field imaging based 3D geometry measurement technique for turbomachinery blades[J]. Measurement Science and Technology, 2019, 30(11): 115901. doi: 10.1088/1361-6501/ab310b
    [12] Cheng Z, Xiong Z W, Chen C, et al. Light field super-resolution: a benchmark[C]//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, Long Beach, CA, USA, 2019.
    [13] Lim J, Ok H, Park B, et al. Improving the spatial resolution based on 4D light field data[C]//Proceedings of the 16th IEEE International Conference on Image Processing, Cairo, Egypt, 2009: 1173-1176.
    [14] Georgiev T, Chunev G, Lumsdaine A. Superresolution with the focused plenoptic camera[J]. Proceedings of SPIE, 2011, 7873: 78730X. doi: 10.1117/12.872666
    [15] Bishop T E, Favaro P. The light field camera: extended depth of field, aliasing, and superresolution[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2012, 34(5): 972-986. doi: 10.1109/TPAMI.2011.168
    [16] Rossi M, Frossard P. Graph-based light field super-resolution[C]//Proceedings of the IEEE 19th International Workshop on Multimedia Signal Processing, Luton, UK, 2017: 1-6.
    [17] Alain M, Smolic A. Light field super-resolution via LFBM5D sparse coding[C]//Proceedings of the 25th IEEE International Conference on Image Processing, Athens, Greece, 2018: 1-5.
    [18] Egiazarian K, Katkovnik V. Single image super-resolution via BM3D sparse coding[C]//Proceedings of the 23rd European Signal Processing Conference, Nice, France, 2015: 2849-2853.
    [19] Alain M, Smolic A. Light field denoising by sparse 5D transform domain collaborative filtering[C]//Proceedings of the IEEE 19th International Workshop on Multimedia Signal Processing, Luton, UK, 2017: 1-6.
    [20] Yoon Y, Jeon H G, Yoo D, et al. Learning a deep convolutional network for light-field image super-resolution[C]//Proceedings of 2015 IEEE International Conference on Computer Vision Workshop, Santiago, Chile, 2015: 57-65.
    [21] Wang Y L, Liu F, Zhang K B, et al. LFNet: a novel bidirectional recurrent convolutional neural network for light-field image super-resolution[J]. IEEE Transactions on Image Processing, 2018, 27(9): 4274-4286. doi: 10.1109/TIP.2018.2834819
    [22] Zhang S, Lin Y F, Sheng H. Residual networks for light field image super-resolution[C]//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 2019: 11046-11055.
    [23] Wang L G, Wang Y Q, Liang Z F, et al. Learning parallax attention for stereo image super-resolution[C]//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 2019: 12250-12259.
    [24] Chen L C, Zhu Y K, Papandreou G, et al. Encoder-decoder with atrous separable convolution for semantic image segmentation[C]//Proceedings of the European Conference on Computer Vision, Munich, Germany, 2018: 801-818.
    [25] Wang R G, Liu L L, Yang J, et al. Image super-resolution based on clustering and collaborative representation[J]. Opto-Electronic Engineering, 2018, 45(4): 170537. doi: 10.12086/oee.2018.170537
    [26] Shi W Z, Caballero J, Huszár F, et al. Real-time single image and video super-resolution using an efficient sub-pixel convolutional neural network[C]//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, USA, 2016: 1874-1883.
    [27] Xu L, Fu R D, Jin W, et al. Image super-resolution reconstruction based on multi-scale feature loss function[J]. Opto-Electronic Engineering, 2019, 46(11): 180419. doi: 10.12086/oee.2019.180419
    [28] Wanner S, Meister S, Goldluecke B. Datasets and benchmarks for densely sampled 4D light fields[M]//Bronstein M, Favre J, Hormann K. Vision, Modeling & Visualization. Lugano, Switzerland: The Eurographics Association, 2013: 225-226.
    [29] Honauer K, Johannsen O, Kondermann D, et al. A dataset and evaluation methodology for depth estimation on 4D light fields[C]//Proceedings of the Asian Conference on Computer Vision, Taipei, Taiwan, China, 2016: 19-34.
    [30] Raj S A, Lowney M, Shah R, et al. Stanford Lytro light field archive[EB/OL]. http://lightfields.stanford.edu/LF2016.html, 2016.
    [31] Rerabek M, Ebrahimi T. New light field image dataset[C]//Proceedings of the 8th International Conference on Quality of Multimedia Experience, Lisbon, Portugal, 2016.
    [32] Chu X X, Zhang B, Ma H L, et al. Fast, accurate and lightweight super-resolution with neural architecture search[Z]. arXiv: 1901.07261, 2019.
    [33] Kingma D P, Ba J L. Adam: a method for stochastic optimization[C]//Proceedings of the International Conference on Learning Representations, San Diego, USA, 2015.
    [34] Glorot X, Bengio Y. Understanding the difficulty of training deep feedforward neural networks[C]//Proceedings of the 13th International Conference on Artificial Intelligence and Statistics, Sardinia, Italy, 2010: 249-256.
    [35] Bevilacqua M, Roumy A, Guillemot C, et al. Low-complexity single-image super-resolution based on nonnegative neighbor embedding[C]//Proceedings of the British Machine Vision Conference, Guildford, UK, 2012.
    [36] Chen J, Hou J H, Ni Y, et al. Accurate light field depth estimation with superpixel regularization over partially occluded regions[J]. IEEE Transactions on Image Processing, 2018, 27(10): 4889-4900. doi: 10.1109/TIP.2018.2839524

Publication history
Received date: 2020-01-03
Revised date: 2020-04-15
Published date: 2020-12-15
