Multiview video quality enhancement without depth information

Samer Jammal; Tammam Tillo; Jimin Xiao

doi:10.1016/j.image.2019.03.014

Samer Jammal, Tammam Tillo, Jimin Xiao^*

^*Corresponding author for this work

Department of Intelligent Science

Research output: Contribution to journal › Article › peer-review

7 Citations (Scopus)

Abstract

The past decade has witnessed rapid development in multiview 3D video technologies, such as Three-Dimensional Video (3DV), Virtual Reality (VR), and Free Viewpoint Video (FVV). However, multiview video carries large information redundancy, and a vast amount of data needs to be stored or transmitted, which poses a serious problem for multiview video systems. Asymmetric multiview video compression can alleviate this problem by coding views at different qualities: only a few viewpoints are kept at high quality, while the other views are heavily compressed to low quality. However, heavily compressed views may suffer severe quality degradation, so it is necessary to enhance their visual quality at the decoder side. Exploiting similarities among the multiview images is the key to efficiently reconstructing the compressed views. In this paper, we propose a novel method for multiview quality enhancement, which directly learns an end-to-end mapping between the low-quality and high-quality views and recovers the details of the low-quality view. The mapping is realized using a deep convolutional neural network (MVENet). MVENet takes a low-quality image of one view and a high-quality image of another view of the same scene as inputs, and outputs an enhanced image for the low-quality view. To the best of our knowledge, this is the first work on multiview video enhancement in which neither a depth map nor a projected virtual view is required in the enhancement process. Experimental results on both computer-graphics and real datasets demonstrate the effectiveness of the proposed approach, with a peak signal-to-noise ratio (PSNR) gain of up to 2 dB over low-quality views compressed using HEVC and up to 3.7 dB over low-quality views compressed using JPEG on the Cityscapes benchmark.
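The abstract reports its results as a PSNR gain of the enhanced view over the compressed view. As a rough illustration of how such a gain is usually computed, the sketch below applies the standard PSNR definition to 8-bit pixel values. The function names `psnr` and `psnr_gain` and the toy pixel values are assumptions made for this example; they are not taken from the paper, and the sketch does not reproduce MVENet or its evaluation pipeline.

```python
import math


def psnr(reference, test, peak=255.0):
    """Return the PSNR in dB between two equally sized flat pixel sequences."""
    if len(reference) != len(test) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error between the reference and the test image.
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)


def psnr_gain(reference, compressed, enhanced, peak=255.0):
    """Return the PSNR improvement (dB) of the enhanced image over the compressed one."""
    return psnr(reference, enhanced, peak) - psnr(reference, compressed, peak)


# Toy data (hypothetical): the enhanced image halves the per-pixel error,
# so the MSE drops by a factor of 4 and the gain is 10*log10(4) dB.
reference = [100, 120, 140, 160]
compressed = [104, 116, 144, 156]
enhanced = [102, 118, 142, 158]
gain_db = psnr_gain(reference, compressed, enhanced)
print(f"PSNR gain: {gain_db:.2f} dB")
```

In this toy case the gain is about 6.02 dB. In practice the same calculation would be run per frame against the uncompressed original and then averaged over a sequence.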

Original language	English
Pages (from-to)	22-31
Number of pages	10
Journal	Signal Processing: Image Communication
Volume	75
DOIs	https://doi.org/10.1016/j.image.2019.03.014
Publication status	Published - Jul 2019

Keywords

Asymmetric multiview video
Asymmetric stereoscopic video
Convolutional neural network
Deep learning
HEVC
JPEG
Multiview video
Quality enhancement
Video coding

Cite this

@article{419dda81f955494db60722f674c58257,
  title = "Multiview video quality enhancement without depth information",
  abstract = "The past decade has witnessed fast development in multiview 3D video technologies, such as Three-Dimensional Video (3DV), Virtual Reality (VR), and Free Viewpoint Video (FVV). However, large information redundancy and a vast amount of multiview video data needs to be stored or transmitted, which poses a serious problem for multiview video systems. Asymmetric multiview video compression can alleviate this problem by coding views with different qualities. Only several viewpoints are kept with high-quality and other views are highly compressed to low-quality. However, highly compressed views may incur severe quality degradation. Thus, it is necessary to enhance the visual quality of highly compressed views at the decoder side. Exploiting similarities among the multiview images is the key to efficiently reconstruct the multiview compressed views. In this paper, we propose a novel method for multiview quality enhancement, which directly learns an end-to-end mapping between the low-quality and high-quality views and recovers the details of the low-quality view. The mapping process is realized using a deep convolutional neural network (MVENet). MVENet takes a low-quality image of one view and a high-quality image of another view of the same scene as inputs and outputs an enhanced image for the low-quality view. To the best of our knowledge, this is the first work for multiview video enhancement where neither a depth map nor a projected virtual view is required in the enhancement process. Experimental results on both computer graphic and real datasets demonstrate the effectiveness of the proposed approach with a peak signal-to-noise ratio (PSNR) gain of up to 2dB over low-quality compressed views using HEVC and up to 3.7dB over low-quality compressed views using JPEG on the benchmark Cityscapes.",
  keywords = "Asymmetric multiview video, Asymmetric stereoscopic video, Convolutional neural network, Deep learning, HEVC, JPEG, Multiview video, Quality enhancement, Video coding",
  author = "Samer Jammal and Tammam Tillo and Jimin Xiao",
  note = "Publisher Copyright: {\textcopyright} 2019 Elsevier B.V.",
  year = "2019",
  month = jul,
  doi = "10.1016/j.image.2019.03.014",
  language = "English",
  volume = "75",
  pages = "22--31",
  journal = "Signal Processing: Image Communication",
  issn = "0923-5965",
}

TY - JOUR
T1 - Multiview video quality enhancement without depth information
AU - Jammal, Samer
AU - Tillo, Tammam
AU - Xiao, Jimin
PY - 2019/7
Y1 - 2019/7
N2 - The past decade has witnessed fast development in multiview 3D video technologies, such as Three-Dimensional Video (3DV), Virtual Reality (VR), and Free Viewpoint Video (FVV). However, large information redundancy and a vast amount of multiview video data needs to be stored or transmitted, which poses a serious problem for multiview video systems. Asymmetric multiview video compression can alleviate this problem by coding views with different qualities. Only several viewpoints are kept with high-quality and other views are highly compressed to low-quality. However, highly compressed views may incur severe quality degradation. Thus, it is necessary to enhance the visual quality of highly compressed views at the decoder side. Exploiting similarities among the multiview images is the key to efficiently reconstruct the multiview compressed views. In this paper, we propose a novel method for multiview quality enhancement, which directly learns an end-to-end mapping between the low-quality and high-quality views and recovers the details of the low-quality view. The mapping process is realized using a deep convolutional neural network (MVENet). MVENet takes a low-quality image of one view and a high-quality image of another view of the same scene as inputs and outputs an enhanced image for the low-quality view. To the best of our knowledge, this is the first work for multiview video enhancement where neither a depth map nor a projected virtual view is required in the enhancement process. Experimental results on both computer graphic and real datasets demonstrate the effectiveness of the proposed approach with a peak signal-to-noise ratio (PSNR) gain of up to 2dB over low-quality compressed views using HEVC and up to 3.7dB over low-quality compressed views using JPEG on the benchmark Cityscapes.

AB - The past decade has witnessed fast development in multiview 3D video technologies, such as Three-Dimensional Video (3DV), Virtual Reality (VR), and Free Viewpoint Video (FVV). However, large information redundancy and a vast amount of multiview video data needs to be stored or transmitted, which poses a serious problem for multiview video systems. Asymmetric multiview video compression can alleviate this problem by coding views with different qualities. Only several viewpoints are kept with high-quality and other views are highly compressed to low-quality. However, highly compressed views may incur severe quality degradation. Thus, it is necessary to enhance the visual quality of highly compressed views at the decoder side. Exploiting similarities among the multiview images is the key to efficiently reconstruct the multiview compressed views. In this paper, we propose a novel method for multiview quality enhancement, which directly learns an end-to-end mapping between the low-quality and high-quality views and recovers the details of the low-quality view. The mapping process is realized using a deep convolutional neural network (MVENet). MVENet takes a low-quality image of one view and a high-quality image of another view of the same scene as inputs and outputs an enhanced image for the low-quality view. To the best of our knowledge, this is the first work for multiview video enhancement where neither a depth map nor a projected virtual view is required in the enhancement process. Experimental results on both computer graphic and real datasets demonstrate the effectiveness of the proposed approach with a peak signal-to-noise ratio (PSNR) gain of up to 2dB over low-quality compressed views using HEVC and up to 3.7dB over low-quality compressed views using JPEG on the benchmark Cityscapes.

KW - Asymmetric multiview video
KW - Asymmetric stereoscopic video
KW - Convolutional neural network
KW - Deep learning
KW - HEVC
KW - JPEG
KW - Multiview video
KW - Quality enhancement
KW - Video coding
UR - http://www.scopus.com/inward/record.url?scp=85063395960&partnerID=8YFLogxK
U2 - 10.1016/j.image.2019.03.014
DO - 10.1016/j.image.2019.03.014
M3 - Article
AN - SCOPUS:85063395960
SN - 0923-5965
VL - 75
SP - 22
EP - 31
JO - Signal Processing: Image Communication
JF - Signal Processing: Image Communication
ER -
