TY - JOUR
T1 - Detail Preserving Coarse-to-Fine Matching for Stereo Matching and Optical Flow
AU - Deng, Yong
AU - Xiao, Jimin
AU - Zhou, Steven Zhiying
AU - Feng, Jiashi
N1 - Publisher Copyright:
© 1992-2012 IEEE.
PY - 2021
Y1 - 2021
N2 - The Coarse-To-Fine (CTF) matching scheme is widely used to reduce computational complexity and matching ambiguity in stereo matching and optical flow tasks: it converts image pairs into multi-scale representations and performs matching from coarse to fine levels. Despite its efficiency, it has notable weaknesses: it tends to blur edges and to miss small structures such as thin bars and holes. We find that pixels on small structures and edges are often assigned wrong disparity/flow values during the upsampling step of the CTF framework, and these errors propagate to the finer levels, causing those weaknesses. We observe that these wrong disparity/flow values can be avoided by selecting the best-matched value within each pixel's neighborhood, which motivates a novel differentiable Neighbor-Search Upsampling (NSU) module. The NSU module first estimates matching scores and then selects, for each pixel, the best-matched disparity/flow from its neighbors. It preserves fine structural details by exploiting information from the finer level while upsampling the disparity/flow. The module is a drop-in replacement for the naive upsampling in the CTF matching framework and allows the network to be trained end-to-end. Integrating the NSU module into a baseline CTF matching network yields our Detail Preserving Coarse-To-Fine (DPCTF) matching network. Comprehensive experiments demonstrate that DPCTF improves performance on both stereo matching and optical flow tasks. Notably, DPCTF achieves new state-of-the-art performance on both tasks: it outperforms the competitive baseline (Bi3D) by 28.8% (from 0.73 to 0.52) in EPE on the FlyingThings3D stereo dataset and ranks first on the KITTI flow 2012 benchmark. The code is available at https://github.com/Deng-Y/DPCTF.
KW - Stereo matching
KW - coarse-to-fine
KW - neural network
KW - optical flow
UR - http://www.scopus.com/inward/record.url?scp=85110250899&partnerID=8YFLogxK
U2 - 10.1109/TIP.2021.3088635
DO - 10.1109/TIP.2021.3088635
M3 - Article
C2 - 34138709
AN - SCOPUS:85110250899
SN - 1057-7149
VL - 30
SP - 5835
EP - 5847
JO - IEEE Transactions on Image Processing
JF - IEEE Transactions on Image Processing
M1 - 9459444
ER -