TY - GEN
T1 - 3D video super-resolution using fully convolutional neural networks
AU - Xie, Yanchun
AU - Xiao, Jimin
AU - Tillo, Tammam
AU - Wei, Yunchao
AU - Zhao, Yao
N1 - Publisher Copyright:
© 2016 IEEE.
PY - 2016/8/25
Y1 - 2016/8/25
N2 - The large amount of redundant information and the huge data size have been serious problems for multiview video systems. One popular solution is the mixed-resolution format, where only a few viewpoints are kept at full resolution and the other views are kept at lower resolution. In this paper, we propose a super-resolution (SR) method in which the low-resolution viewpoints in the 3D video are up-sampled using a fully convolutional neural network. By projecting the neighboring high-resolution image to the position of the low-resolution image, we learn the relationship between high- and low-resolution patches and reconstruct the low-resolution images into high-resolution ones using the projected image information. We propose a fully convolutional neural network to establish the mapping between those images. The network is trained on only 17 pairs of multiview images and tested on other multiview images and video sequences. Our proposed method outperforms existing methods both objectively and subjectively, with an average gain of more than 1 dB. Meanwhile, the network training procedure is efficient, taking less than 3 hours on one Titan X GPU.
AB - The large amount of redundant information and the huge data size have been serious problems for multiview video systems. One popular solution is the mixed-resolution format, where only a few viewpoints are kept at full resolution and the other views are kept at lower resolution. In this paper, we propose a super-resolution (SR) method in which the low-resolution viewpoints in the 3D video are up-sampled using a fully convolutional neural network. By projecting the neighboring high-resolution image to the position of the low-resolution image, we learn the relationship between high- and low-resolution patches and reconstruct the low-resolution images into high-resolution ones using the projected image information. We propose a fully convolutional neural network to establish the mapping between those images. The network is trained on only 17 pairs of multiview images and tested on other multiview images and video sequences. Our proposed method outperforms existing methods both objectively and subjectively, with an average gain of more than 1 dB. Meanwhile, the network training procedure is efficient, taking less than 3 hours on one Titan X GPU.
KW - convolutional neural network
KW - depth map
KW - mixed-resolution
KW - super resolution
KW - training
KW - virtual view
UR - http://www.scopus.com/inward/record.url?scp=84987653162&partnerID=8YFLogxK
U2 - 10.1109/ICME.2016.7552931
DO - 10.1109/ICME.2016.7552931
M3 - Conference Proceeding
AN - SCOPUS:84987653162
T3 - Proceedings - IEEE International Conference on Multimedia and Expo
BT - 2016 IEEE International Conference on Multimedia and Expo, ICME 2016
PB - IEEE Computer Society
T2 - 2016 IEEE International Conference on Multimedia and Expo, ICME 2016
Y2 - 11 July 2016 through 15 July 2016
ER -
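
For a concrete picture of the technique summarized in the abstract, below is a minimal PyTorch sketch of a fully convolutional super-resolution network for a mixed-resolution view pair. The class name MultiviewSRNet, the SRCNN-style 9-1-5 layer layout, and the choice to stack the up-sampled low-resolution view with the projected neighboring high-resolution view as input channels are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch (assumption, not the authors' exact design): the bicubically
# up-sampled low-resolution view and the neighboring high-resolution view
# projected to its position are stacked as input channels, and a small stack
# of convolutional layers regresses the high-resolution output.
import torch
import torch.nn as nn


class MultiviewSRNet(nn.Module):
    def __init__(self, channels: int = 1):
        super().__init__()
        # Input has 2 * channels planes: up-sampled LR view + projected HR view.
        self.body = nn.Sequential(
            nn.Conv2d(2 * channels, 64, kernel_size=9, padding=4),
            nn.ReLU(inplace=True),
            nn.Conv2d(64, 32, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(32, channels, kernel_size=5, padding=2),
        )

    def forward(self, lr_upsampled: torch.Tensor, projected_hr: torch.Tensor) -> torch.Tensor:
        # Fully convolutional, so any spatial resolution works at test time.
        x = torch.cat([lr_upsampled, projected_hr], dim=1)
        return self.body(x)


if __name__ == "__main__":
    net = MultiviewSRNet(channels=1)
    lr_up = torch.rand(1, 1, 144, 176)    # up-sampled luma of the low-resolution view
    proj_hr = torch.rand(1, 1, 144, 176)  # neighboring view projected to the same position
    sr = net(lr_up, proj_hr)
    print(sr.shape)                       # torch.Size([1, 1, 144, 176])
    # Training would minimize, e.g., nn.MSELoss() between sr and the ground-truth full-resolution view.
```

The per-pixel regression setup is what makes a PSNR-style comparison (the "more than 1 dB average gain" reported in the abstract) the natural evaluation; how the authors actually perform the inter-view projection and configure their network is described in the paper itself.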