Learning deep semantic attributes for user video summarization

Ke Sun, Jiasong Zhu, Zhuo Lei, Xianxu Hou, Qian Zhang, Jiang Duan, Guoping Qiu

Research output: Chapter in Book or Report/Conference proceeding; Conference Proceeding; peer-review

12 Citations (Scopus)

Abstract

This paper presents SASUM, a Semantic Attribute assisted video SUMmarization framework. Compared with traditional methods, SASUM has three distinguishing features. First, we use a natural language processing tool to discover a set of keywords from image and text corpora; these keywords form the semantic attributes of visual content. Second, we train a deep convolutional neural network both to extract visual features and to predict the semantic attributes of video segments, which lets us represent video content with visual and semantic features simultaneously. Third, we construct a temporally constrained video segment affinity matrix and use a partially near-duplicate image discovery technique to cluster visually and semantically consistent video frames together. These frame clusters can then be condensed into an informative and compact summary of the video. Experimental results show that the semantic attributes effectively assist the visual features in video summarization and that the technique achieves state-of-the-art performance.
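
The abstract does not spell out how the temporally constrained affinity matrix is built, so the following is only a minimal sketch of the general idea, not the authors' formulation. It assumes that each segment is described by one visual vector and one semantic-attribute vector, that affinity is the cosine similarity of their concatenation, and that pairs more than `window` segments apart are given zero affinity. The names `cosine`, `temporal_affinity`, and `window` are hypothetical.

```python
import math
from typing import List, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def temporal_affinity(
    visual: List[Sequence[float]],
    semantic: List[Sequence[float]],
    window: int = 2,
) -> List[List[float]]:
    """Build an n x n segment affinity matrix.

    Assumption (not from the paper): affinity is the cosine similarity of the
    concatenated visual and semantic vectors, and it is set to zero for
    segment pairs more than `window` positions apart in time.
    """
    if len(visual) != len(semantic):
        raise ValueError("need one visual and one semantic vector per segment")
    # Represent each segment by its visual and semantic features together.
    feats = [list(v) + list(s) for v, s in zip(visual, semantic)]
    n = len(feats)
    affinity = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # Only segments inside the temporal window receive a non-zero affinity.
        for j in range(max(0, i - window), min(n, i + window + 1)):
            affinity[i][j] = cosine(feats[i], feats[j])
    return affinity


# Toy data: segments 0 and 3 have identical features but are far apart in time.
visual = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [1.0, 0.0]]
semantic = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
A = temporal_affinity(visual, semantic, window=1)
```

In this toy example, segments 0 and 3 are identical in feature space yet receive zero affinity because they fall outside the window, which is the effect a temporal constraint is meant to have: a clustering step run on such a matrix would group only temporally nearby, visually and semantically similar segments.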

Original language: English
Title of host publication: 2017 IEEE International Conference on Multimedia and Expo, ICME 2017
Publisher: IEEE Computer Society
Pages: 643-648
Number of pages: 6
ISBN (Electronic): 9781509060672
Publication status: Published - 28 Aug 2017
Externally published: Yes
Event: 2017 IEEE International Conference on Multimedia and Expo, ICME 2017 - Hong Kong, Hong Kong
Duration: 10 Jul 2017 to 14 Jul 2017

Publication series

Name: Proceedings - IEEE International Conference on Multimedia and Expo
Volume: 0
ISSN (Print): 1945-7871
ISSN (Electronic): 1945-788X

Conference

Conference: 2017 IEEE International Conference on Multimedia and Expo, ICME 2017
Country/Territory: Hong Kong
City: Hong Kong
Period: 10/07/17 to 14/07/17

Keywords

  • Bundling Center Clustering
  • Deep Convolution Neural Network
  • Semantic Attribute
  • Video Summarization
