Light-field image super-resolution based on multi-scale feature fusion

Zhao Yuanyuan; Shi Shengxian

doi:10.12086/oee.2020.200007


Zhao Y Y, Shi S X. Light-field image super-resolution based on multi-scale feature fusion[J]. Opto-Electron Eng, 2020, 47(12): 200007. doi: 10.12086/oee.2020.200007



School of Mechanical Engineering, Shanghai Jiao Tong University, Shanghai 200240, China

Fund Project: Supported by the National Natural Science Foundation of China (11772197)


Corresponding author: Shi Shengxian, E-mail: kirinshi@sjtu.edu.cn

Received Date: 03 January 2020

Revised Date: 15 April 2020

Published Date: 15 December 2020

Abstract


As a new generation of imaging device, the light-field camera can simultaneously capture the spatial position and incident angle of light rays. However, the recorded light-field involves a trade-off between spatial resolution and angular resolution; in particular, the limited spatial resolution of the sub-aperture images restricts the application range of light-field cameras. This paper therefore proposes a light-field super-resolution neural network that fuses multi-scale features to obtain a super-resolved light-field. The deep-learning-based framework contains three major modules: multi-scale feature extraction, global feature fusion, and up-sampling. First, the multi-scale feature extraction module learns the inherent structural features of the 4D light-field. The fusion module then fuses and enhances these features. Finally, the up-sampling module produces the super-resolved light-field. Experimental results on synthetic and real-world light-field datasets show that the method outperforms other state-of-the-art methods in both visual and numerical evaluations. In addition, the super-resolved light-field images were applied to depth estimation, and the results show that light-field spatial super-resolution improves the disparity map.
Keywords: super-resolution; light-field; deep learning; multi-scale feature extraction; feature fusion



Overview


As a new generation of imaging equipment, the light-field camera can simultaneously capture the spatial position and incident angle of light rays. However, the recorded light-field involves a trade-off between spatial resolution and angular resolution; in particular, the limited spatial resolution of the sub-aperture images restricts the application scenarios of light-field cameras. This paper therefore proposes a light-field super-resolution network that fuses multi-scale features to obtain a super-resolved light-field. The deep-learning-based framework contains three major modules: a multi-scale feature extraction module, a global feature fusion module, and an up-sampling module. The design ideas of the modules are as follows.

a) Multi-scale feature extraction module: To explore the complex texture information in the 4D light-field space, the feature extraction module uses ResASPP blocks to enlarge the receptive field and extract multi-scale features. The low-resolution light-field sub-aperture images are first sent to a Conv block and a Res block for low-level feature extraction. A ResASPP block followed by a Res block is then applied twice in succession to learn multi-scale features that accumulate high-frequency information in the 4D light-field.
b) Global feature fusion module: Light-field images contain not only spatial information but also angular information, which reflects the inherent structure of the 4D light-field. The global feature fusion module is proposed to geometrically reconstruct the super-resolved light-field by exploiting these angular cues. Note that the feature maps of all sub-aperture images from the upstream module are first stacked along the channel dimension of the network and then sent to this module for high-level feature extraction.
c) Up-sampling module: After the global features of the 4D light-field structure have been learned, the high-level feature maps are sent to the up-sampling module for light-field super-resolution. This module uses sub-pixel convolution (the pixel-shuffle operation) to obtain 2× spatial super-resolution. The feature maps are then sent to a conventional convolution layer that performs feature fusion and outputs the array of super-resolved light-field sub-aperture images.
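The channel stacking in item b) and the pixel-shuffle rearrangement in item c) can be illustrated with a small NumPy sketch. This is not the authors' implementation: the 3×3 angular resolution, the 4 feature channels per view, and the function names `stack_views` and `pixel_shuffle` are assumptions chosen for illustration, and all learned convolution layers between the two steps are omitted.

```python
import numpy as np

def stack_views(view_features):
    """Stack per-view feature maps, each of shape (C, H, W), along the channel axis."""
    return np.concatenate(view_features, axis=0)

def pixel_shuffle(x, r):
    """Rearrange a (C*r*r, H, W) array into (C, H*r, W*r).

    Output pixel (c, h*r + i, w*r + j) is taken from input channel
    c*r*r + i*r + j at position (h, w).
    """
    c_in, h, w = x.shape
    if c_in % (r * r) != 0:
        raise ValueError("channel count must be divisible by r*r")
    c = c_in // (r * r)
    x = x.reshape(c, r, r, h, w)      # split channels into (c, i, j)
    x = x.transpose(0, 3, 1, 4, 2)    # reorder to (c, h, i, w, j)
    return x.reshape(c, h * r, w * r)

# Hypothetical sizes: 9 views (3x3 angular grid), 4 feature channels each, 8x8 pixels.
rng = np.random.default_rng(0)
views = [rng.standard_normal((4, 8, 8)) for _ in range(9)]

stacked = stack_views(views)            # all views share one channel axis
upsampled = pixel_shuffle(stacked, 2)   # 2x spatial up-sampling

print(stacked.shape)    # (36, 8, 8)
print(upsampled.shape)  # (9, 16, 16)
```

With these assumed sizes, the 36 stacked channels happen to equal 9 × 2², so the shuffle yields one 16×16 map per view. In a trained network a convolution would normally set the channel count to the required multiple of r² before the shuffle.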
The proposed network was applied to both a synthetic light-field dataset and a real-world light-field dataset for light-field image super-resolution. The experimental results on both datasets show that the method outperforms other state-of-the-art methods in both visual and numerical evaluations. In addition, the super-resolved light-field images were applied to depth estimation, and the results show that light-field spatial super-resolution improves disparity estimation, especially in occluded and edge regions.
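The multi-scale idea behind the ResASPP block in item a) rests on atrous (dilated) convolution: the same small kernel is applied with different sample spacings, so parallel branches see different receptive fields. The NumPy sketch below is an illustration under assumptions, not the paper's block. The dilation rates (1, 2, 4), the 3×3 averaging kernel, and the names `dilated_conv2d` and `aspp_like` are hypothetical, and the residual connection and learned weights are left out.

```python
import numpy as np

def effective_kernel_size(k, d):
    """Side length covered by a k x k kernel with dilation d."""
    return k + (k - 1) * (d - 1)

def dilated_conv2d(img, kernel, dilation):
    """Zero-padded 'same' 2D cross-correlation with a dilated, odd-sized square kernel."""
    k = kernel.shape[0]
    pad = dilation * (k // 2)
    padded = np.pad(img, pad)
    h, w = img.shape
    out = np.zeros((h, w), dtype=float)
    for i in range(k):
        for j in range(k):
            # Shifted view of the padded image for kernel tap (i, j).
            tap = padded[i * dilation:i * dilation + h,
                         j * dilation:j * dilation + w]
            out += kernel[i, j] * tap
    return out

def aspp_like(img, kernel, dilations=(1, 2, 4)):
    """Run parallel dilated branches and stack them along a new channel axis."""
    return np.stack([dilated_conv2d(img, kernel, d) for d in dilations], axis=0)

# Assumed 3x3 averaging kernel applied to a single impulse at the image centre.
img = np.zeros((9, 9))
img[4, 4] = 1.0
kernel = np.full((3, 3), 1.0 / 9.0)

features = aspp_like(img, kernel)
print(features.shape)                                      # (3, 9, 9)
print([effective_kernel_size(3, d) for d in (1, 2, 4)])    # [3, 5, 9]
```

Each branch keeps the same nine weights while covering a 3×3, 5×5, or 9×9 window, which is how a dilated block can widen the receptive field without adding parameters per branch.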