Neural texture transfer assisted video coding with adaptive up-sampling

Li Yu; Wenshuai Chang; Weize Quan; Jimin Xiao; Dong Ming Yan; Moncef Gabbouj

doi:10.1016/j.image.2022.116754

Corresponding author: Moncef Gabbouj

Department of Intelligent Science

Research output: Contribution to journal › Article › peer-review

1 Citation (Scopus)

Abstract

Deep learning techniques have been extensively investigated for the purpose of further increasing the efficiency of traditional video compression. Some deep learning techniques for down/up-sampling-based video coding were found to be especially effective when the bandwidth or storage is limited. Existing works mainly differ in the super-resolution models used. Some works simply use a single image super-resolution model, ignoring the rich information in the correlation between video frames, while others explore the correlation between frames by simply concatenating the features across adjacent frames. This, however, may fail when the textures are not well aligned. In this paper, we propose to utilize neural texture transfer which exploits the semantic correlation between frames and is able to explore the correlated information even when the textures are not aligned. Meanwhile, an adaptive group of pictures (GOP) method is proposed to automatically decide whether a frame should be down-sampled or not. Experimental results show that the proposed method outperforms the standard HEVC and state-of-the-art methods under different compression configurations. When compared to standard HEVC, the BD-rate (PSNR) and BD-rate (SSIM) of the proposed method are up to -19.1% and -26.5%, respectively.
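The BD-rate figures quoted in the abstract follow the Bjøntegaard delta convention, in which a negative value means the tested codec needs fewer bits than the anchor at equal quality. The sketch below is a minimal illustration of that metric, assuming four rate/PSNR points per codec and a cubic fit of log-rate against PSNR. The function name `bd_rate` and the sample numbers are hypothetical and are not taken from the paper.

```python
import numpy as np


def bd_rate(rate_anchor, psnr_anchor, rate_test, psnr_test):
    """Average bitrate difference (in percent) of test vs. anchor at equal PSNR.

    Illustrative sketch of the Bjontegaard delta rate; not the authors' code.
    """
    log_ra = np.log10(np.asarray(rate_anchor, dtype=float))
    log_rt = np.log10(np.asarray(rate_test, dtype=float))
    pa = np.asarray(psnr_anchor, dtype=float)
    pt = np.asarray(psnr_test, dtype=float)

    # Fit log10(rate) as a cubic polynomial of PSNR for each codec.
    poly_a = np.polyfit(pa, log_ra, 3)
    poly_t = np.polyfit(pt, log_rt, 3)

    # Integrate both fits over the PSNR interval the two curves share.
    lo = max(pa.min(), pt.min())
    hi = min(pa.max(), pt.max())
    int_a = np.polyval(np.polyint(poly_a), hi) - np.polyval(np.polyint(poly_a), lo)
    int_t = np.polyval(np.polyint(poly_t), hi) - np.polyval(np.polyint(poly_t), lo)

    # Mean log-rate difference over the interval, converted to a percentage.
    avg_diff = (int_t - int_a) / (hi - lo)
    return (10.0 ** avg_diff - 1.0) * 100.0


if __name__ == "__main__":
    # Hypothetical data: the test codec uses 20% fewer bits at every PSNR point.
    psnr = [32.0, 35.0, 38.0, 41.0]
    anchor = [1000.0, 2000.0, 4000.0, 8000.0]
    test = [0.8 * r for r in anchor]
    print(f"BD-rate: {bd_rate(anchor, psnr, test, psnr):.2f}%")
```

With the hypothetical data above, the result is close to -20%, matching the constant 20% bitrate saving built into the sample points. The same function can be applied to SSIM by passing SSIM values in place of PSNR.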

Original language	English
Article number	116754
Journal	Signal Processing: Image Communication
Volume	107
DOIs	https://doi.org/10.1016/j.image.2022.116754
Publication status	Published - Sept 2022

Keywords

Deep learning
High-efficiency video coding (HEVC)
Low bitrate
Machine learning
Reference-based super-resolution
Video compression

Cite this

@article{e98045cc4d85481eb8babc039dc5f7a2,
title = "Neural texture transfer assisted video coding with adaptive up-sampling",
abstract = "Deep learning techniques have been extensively investigated for the purpose of further increasing the efficiency of traditional video compression. Some deep learning techniques for down/up-sampling-based video coding were found to be especially effective when the bandwidth or storage is limited. Existing works mainly differ in the super-resolution models used. Some works simply use a single image super-resolution model, ignoring the rich information in the correlation between video frames, while others explore the correlation between frames by simply concatenating the features across adjacent frames. This, however, may fail when the textures are not well aligned. In this paper, we propose to utilize neural texture transfer which exploits the semantic correlation between frames and is able to explore the correlated information even when the textures are not aligned. Meanwhile, an adaptive group of pictures (GOP) method is proposed to automatically decide whether a frame should be down-sampled or not. Experimental results show that the proposed method outperforms the standard HEVC and state-of-the-art methods under different compression configurations. When compared to standard HEVC, the BD-rate (PSNR) and BD-rate (SSIM) of the proposed method are up to -19.1% and -26.5%, respectively.",
keywords = "Deep learning, High-efficiency video coding (HEVC), Low bitrate, Machine learning, Reference-based super-resolution, Video compression",
author = "Li Yu and Wenshuai Chang and Weize Quan and Jimin Xiao and Yan, {Dong Ming} and Moncef Gabbouj",
note = "Publisher Copyright: {\textcopyright} 2022 Elsevier B.V.",
year = "2022",
month = sep,
doi = "10.1016/j.image.2022.116754",
language = "English",
volume = "107",
journal = "Signal Processing: Image Communication",
issn = "0923-5965",
}

TY - JOUR
T1 - Neural texture transfer assisted video coding with adaptive up-sampling
AU - Yu, Li
AU - Chang, Wenshuai
AU - Quan, Weize
AU - Xiao, Jimin
AU - Yan, Dong Ming
AU - Gabbouj, Moncef
PY - 2022/9
Y1 - 2022/9
N2 - Deep learning techniques have been extensively investigated for the purpose of further increasing the efficiency of traditional video compression. Some deep learning techniques for down/up-sampling-based video coding were found to be especially effective when the bandwidth or storage is limited. Existing works mainly differ in the super-resolution models used. Some works simply use a single image super-resolution model, ignoring the rich information in the correlation between video frames, while others explore the correlation between frames by simply concatenating the features across adjacent frames. This, however, may fail when the textures are not well aligned. In this paper, we propose to utilize neural texture transfer which exploits the semantic correlation between frames and is able to explore the correlated information even when the textures are not aligned. Meanwhile, an adaptive group of pictures (GOP) method is proposed to automatically decide whether a frame should be down-sampled or not. Experimental results show that the proposed method outperforms the standard HEVC and state-of-the-art methods under different compression configurations. When compared to standard HEVC, the BD-rate (PSNR) and BD-rate (SSIM) of the proposed method are up to -19.1% and -26.5%, respectively.

KW - Deep learning
KW - High-efficiency video coding (HEVC)
KW - Low bitrate
KW - Machine learning
KW - Reference-based super-resolution
KW - Video compression
UR - http://www.scopus.com/inward/record.url?scp=85131914393&partnerID=8YFLogxK
U2 - 10.1016/j.image.2022.116754
DO - 10.1016/j.image.2022.116754
M3 - Article
AN - SCOPUS:85131914393
SN - 0923-5965
VL - 107
JO - Signal Processing: Image Communication
JF - Signal Processing: Image Communication
M1 - 116754
ER -
