Overview: Depth estimation from multiple images is a central task in computer vision. Reliable depth information is an effective cue for visual tasks such as object detection, image segmentation, and special effects in film production. As a new multi-view image acquisition device, the light field camera makes multi-view image data much more convenient to acquire: it samples a scene from multiple viewpoints simultaneously with a single exposure, which gives it unique advantages in portability and depth accuracy over other depth sensors. Noise, however, is a challenging issue for light field depth estimation. Especially in high-noise scenes that also contain occlusion, the simultaneous presence of occlusion and noise makes depth acquisition even more difficult. To address this problem, we present a light field depth estimation algorithm that is robust to both occlusion and noise.

The proposed method uses an inline occlusion-handling framework: by integrating occlusion handling into an anti-noise cost volume, the anti-occlusion ability of the method is improved while its anti-noise performance is maintained. To construct the anti-noise cost volume, we propose a focal stack matching measure based on double-direction defocusing, which adds a second defocus direction to the traditional focal stack and thus introduces more samples into each single match. With more samples available, the algorithm can select the samples with the lowest matching cost, improving its robustness to noise. For occlusion handling, the occlusion patterns in noisy scenes are harder to identify. To remove the influence of occlusion on the focal stack without being misled by noise, the algorithm designs view masks for the different occlusion modes, constructs a cost volume for each mask, and then adaptively selects the best volume according to the matching cost. After the cost volume is constructed, we further smooth it with a filter-based method. Because traditional filtering methods cannot preserve occlusion boundaries, we design a multi-template filtering strategy: filters are designed for occlusions in different directions, which better preserves the edge structure of the scene.

Experiments are conducted on the HCI synthetic dataset and the Stanford Lytro Illum dataset of real scenes. For quantitative evaluation, we use the percentage of bad pixels and the mean square error to compare the algorithms. Experimental results show that the proposed method outperforms other state-of-the-art methods in scenes where occlusion and noise are present at the same time.
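To make the double-direction focal stack measure concrete, here is a minimal NumPy sketch. It assumes a grayscale 4D light field array lf[u, v, y, x] whose centre view is lf[U//2, V//2]; the function name anti_noise_cost, the bilinear warping, and the fixed keep_ratio are illustrative choices, not details taken from the paper. The optional view_mask argument is used by the occlusion-handling sketch further below.

```python
import numpy as np
from scipy.ndimage import shift as nd_shift

def anti_noise_cost(lf, d, keep_ratio=0.5, view_mask=None):
    """Per-pixel matching cost for one candidate disparity d (a sketch).

    lf        : (U, V, H, W) grayscale light field; lf[U//2, V//2] is
                the centre view.
    view_mask : optional (U, V) boolean array restricting which views
                participate (used for occlusion handling).

    Each off-centre view is sheared toward the centre view in *both*
    defocus directions (+d and -d), doubling the sample set of the
    traditional focal stack; only the lowest-cost fraction of the
    samples is averaged, so noisy outliers are discarded.
    """
    U, V, H, W = lf.shape
    uc, vc = U // 2, V // 2
    center = lf[uc, vc]

    samples = []
    for sign in (1.0, -1.0):                 # the two defocus directions
        for u in range(U):
            for v in range(V):
                if (u, v) == (uc, vc):
                    continue
                if view_mask is not None and not view_mask[u, v]:
                    continue
                # shear view (u, v) toward the centre view by sign * d
                warped = nd_shift(
                    lf[u, v],
                    (sign * d * (u - uc), sign * d * (v - vc)),
                    order=1, mode='nearest')
                samples.append(np.abs(warped - center))

    samples = np.sort(np.stack(samples), axis=0)   # cheapest samples first
    k = max(1, int(keep_ratio * len(samples)))     # keep the best fraction
    return samples[:k].mean(axis=0)                # (H, W) cost slice
```

Stacking such slices over a set of disparity candidates yields the cost volume, and a per-pixel winner-take-all argmin over the candidates gives an initial disparity map.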
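One simple way to realize the view masks is with half-window masks over the angular domain: in an occluded region, the views on the un-occluded side still match the centre view, so at least one mask avoids the occluder. The sketch below reuses anti_noise_cost from above; the four-mask choice and the per-pixel minimum as the adaptive selection rule are assumptions for illustration, since the paper distinguishes occlusion modes by both direction and side.

```python
import numpy as np
# assumes anti_noise_cost from the previous sketch is in scope

def directional_view_masks(U, V):
    """Boolean (U, V) masks, one per assumed occlusion side. Each keeps
    the half of the angular window lying on one side of the centre view."""
    uu, vv = np.meshgrid(np.arange(U), np.arange(V), indexing='ij')
    uc, vc = U // 2, V // 2
    return {'left':   vv <= vc, 'right':  vv >= vc,
            'top':    uu <= uc, 'bottom': uu >= uc}

def adaptive_cost(lf, d):
    """Build one cost slice per occlusion mode and keep, per pixel,
    the mode with the lowest matching cost (adaptive selection)."""
    U, V = lf.shape[:2]
    slices = [anti_noise_cost(lf, d, view_mask=m)
              for m in directional_view_masks(U, V).values()]
    return np.minimum.reduce(slices)
```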
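The multi-template filtering step can likewise be sketched with one-sided averaging templates: each cost slice is smoothed with several half-window mean filters, and for every pixel the result is taken from the template lying on the most homogeneous side of the centre view, so the averaging window never straddles an occlusion boundary. The half-window kernels and the variance-based selection rule here are illustrative stand-ins for the paper's actual templates.

```python
import numpy as np
from scipy.ndimage import correlate

def half_kernels(k=7):
    """Four normalised half-window mean templates: the top, bottom,
    left, and right halves of a k x k window (k odd)."""
    h = k // 2
    kernels = []
    for sl in (np.s_[:h + 1, :], np.s_[h:, :],
               np.s_[:, :h + 1], np.s_[:, h:]):
        w = np.zeros((k, k))
        w[sl] = 1.0
        kernels.append(w / w.sum())
    return kernels

def multi_template_smooth(cost, guide, k=7):
    """Smooth one cost slice with directional templates; per pixel, keep
    the template whose support is most homogeneous in the guide image
    (the centre view), which preserves occlusion edges."""
    best_var = np.full(cost.shape, np.inf)
    out = np.empty_like(cost)
    for w in half_kernels(k):
        mean_g   = correlate(guide, w, mode='nearest')
        var_g    = correlate(guide * guide, w, mode='nearest') - mean_g ** 2
        smoothed = correlate(cost, w, mode='nearest')
        pick = var_g < best_var          # more homogeneous side wins
        out[pick] = smoothed[pick]
        best_var = np.minimum(best_var, var_g)
    return out
```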
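The two quantitative metrics are straightforward to compute. The sketch below assumes disparity maps as float arrays and uses the 0.07 bad-pixel threshold common in evaluations on the HCI benchmark; the paper's exact threshold may differ.

```python
import numpy as np

def bad_pixel_percentage(est, gt, thresh=0.07):
    """Percentage of pixels whose absolute disparity error exceeds thresh."""
    return 100.0 * np.mean(np.abs(est - gt) > thresh)

def mean_square_error(est, gt):
    """Mean square disparity error."""
    return float(np.mean((est - gt) ** 2))
```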
The algorithm framework
Focal stack matching measure based on double-direction defocusing (angle resolution: 3×3)
The influence of occlusion on focal stack matching (angle resolution: 5×1); the bottom triangles are the viewpoints.
Partial matching of the focal stack (angle resolution: 5×1).
Focal stack matching with an integrated view mask (angle resolution: 9×9)
View masks for occlusion in different directions and sides (angle resolution: 3×3)
Comparison of sampling using opposite defocusing directions and different side view masks.
Comparison of disparity estimation results of synthetic scenes with different noise levels
Comparison of real scene disparity estimation results
Performance comparison of algorithms in non-occluded regions
Performance comparison of algorithms in occluded regions