TY - GEN
T1 - 3D video super-resolution using fully convolutional neural networks
AU - Xie, Yanchun
AU - Xiao, Jimin
AU - Tillo, Tammam
AU - Wei, Yunchao
AU - Zhao, Yao
N1 - Publisher Copyright:
© 2016 IEEE.
PY - 2016/8/25
Y1 - 2016/8/25
N2 - The large amount of redundant information and the huge data size have been serious problems for multiview video systems. One popular solution is the mixed-resolution format, where only a few viewpoints are kept at full resolution and the other views are kept at lower resolution. In this paper, we propose a super-resolution (SR) method in which the low-resolution viewpoints in the 3D video are up-sampled using a fully convolutional neural network. By projecting the neighboring high-resolution image to the position of the low-resolution image, we learn the relationship between high- and low-resolution patches and reconstruct the low-resolution images into high-resolution ones using the projected image information. We propose a fully convolutional neural network to establish the mapping between those images. The network is trained on only 17 pairs of multiview images and tested on other multiview images and video sequences. Our proposed method outperforms existing methods both objectively and subjectively, with an average gain of more than 1 dB. Meanwhile, the network training procedure is efficient, taking less than 3 hours on one Titan X GPU.
AB - The large amount of redundant information and the huge data size have been serious problems for multiview video systems. One popular solution is the mixed-resolution format, where only a few viewpoints are kept at full resolution and the other views are kept at lower resolution. In this paper, we propose a super-resolution (SR) method in which the low-resolution viewpoints in the 3D video are up-sampled using a fully convolutional neural network. By projecting the neighboring high-resolution image to the position of the low-resolution image, we learn the relationship between high- and low-resolution patches and reconstruct the low-resolution images into high-resolution ones using the projected image information. We propose a fully convolutional neural network to establish the mapping between those images. The network is trained on only 17 pairs of multiview images and tested on other multiview images and video sequences. Our proposed method outperforms existing methods both objectively and subjectively, with an average gain of more than 1 dB. Meanwhile, the network training procedure is efficient, taking less than 3 hours on one Titan X GPU.
KW - convolutional neural network
KW - depth map
KW - mixed-resolution
KW - super resolution
KW - training
KW - virtual view
UR - http://www.scopus.com/inward/record.url?scp=84987653162&partnerID=8YFLogxK
U2 - 10.1109/ICME.2016.7552931
DO - 10.1109/ICME.2016.7552931
M3 - Conference Proceeding
AN - SCOPUS:84987653162
T3 - Proceedings - IEEE International Conference on Multimedia and Expo
BT - 2016 IEEE International Conference on Multimedia and Expo, ICME 2016
PB - IEEE Computer Society
T2 - 2016 IEEE International Conference on Multimedia and Expo, ICME 2016
Y2 - 11 July 2016 through 15 July 2016
ER -
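
For a concrete picture of the technique summarized in the abstract, below is a minimal PyTorch sketch of a fully convolutional super-resolution network for a mixed-resolution view pair. The class name MultiviewSRNet, the SRCNN-style 9-1-5 layer layout, and the choice to stack the up-sampled low-resolution view with the projected neighboring high-resolution view as input channels are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch (assumption, not the authors' exact design): the bicubically
# up-sampled low-resolution view and the neighboring high-resolution view
# projected to its position are stacked as input channels, and a small stack
# of convolutional layers regresses the high-resolution output.
import torch
import torch.nn as nn


class MultiviewSRNet(nn.Module):
    def __init__(self, channels: int = 1):
        super().__init__()
        # Input has 2 * channels planes: up-sampled LR view + projected HR view.
        self.body = nn.Sequential(
            nn.Conv2d(2 * channels, 64, kernel_size=9, padding=4),
            nn.ReLU(inplace=True),
            nn.Conv2d(64, 32, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(32, channels, kernel_size=5, padding=2),
        )

    def forward(self, lr_upsampled: torch.Tensor, projected_hr: torch.Tensor) -> torch.Tensor:
        # Fully convolutional, so any spatial resolution works at test time.
        x = torch.cat([lr_upsampled, projected_hr], dim=1)
        return self.body(x)


if __name__ == "__main__":
    net = MultiviewSRNet(channels=1)
    lr_up = torch.rand(1, 1, 144, 176)    # up-sampled luma of the low-resolution view
    proj_hr = torch.rand(1, 1, 144, 176)  # neighboring view projected to the same position
    sr = net(lr_up, proj_hr)
    print(sr.shape)                       # torch.Size([1, 1, 144, 176])
    # Training would minimize, e.g., nn.MSELoss() between sr and the ground-truth full-resolution view.
```

The per-pixel regression setup is what makes a PSNR-style comparison (the "more than 1 dB average gain" reported in the abstract) the natural evaluation; how the authors actually perform the inter-view projection and configure their network is described in the paper itself.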