Character Prediction in TV Series via a Semantic Projection Network

Ke Sun, Zhuo Lei, Jiasong Zhu, Xianxu Hou, Bozhi Liu, Guoping Qiu*

*Corresponding author for this work

Research output: Chapter in Book or Report/Conference proceeding › Conference Proceeding › peer-review

Abstract

The goal of this paper is to automatically recognize characters in popular TV series. In contrast to conventional approaches, which rely on weak supervision afforded by transcripts, subtitles or character facial data, we formulate the problem as multi-label classification, which requires only label-level supervision. We propose a novel semantic projection network consisting of two stacked subnetworks with specially designed constraints. The first subnetwork is a contractive autoencoder which focuses on reconstructing feature activations extracted from a pre-trained single-label convolutional neural network (CNN). The second subnetwork functions as a region-based multi-label classifier which produces character labels for the input video frame and also reconstructs the input visual feature from the mapped semantic label space. Extensive experiments show that the proposed model achieves state-of-the-art performance in comparison with recent approaches on three challenging TV series datasets (The Big Bang Theory, The Defenders and Nirvana in Fire).
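As a rough illustration of the two stacked components named in the abstract, the following NumPy sketch computes a contractive autoencoder loss on a feature vector and feeds the resulting code to a sigmoid multi-label head. It is not the authors' implementation: the tied-weight decoder, the sigmoid encoder, the penalty weight `lam`, the layer sizes, and all function names are assumptions made for the example, and the region-based processing and the reconstruction from the label space are omitted.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def contractive_autoencoder_loss(x, W, b, c, lam=1e-3):
    """Return (code, loss) for one feature vector x of shape (d,).

    W: (k, d) encoder weights; the decoder reuses W.T (tied weights, an assumption).
    b: (k,) encoder bias, c: (d,) decoder bias, lam: contractive penalty weight.
    """
    h = sigmoid(W @ x + b)            # hidden code
    x_hat = W.T @ h + c               # linear reconstruction of the input feature
    recon = np.sum((x - x_hat) ** 2)  # squared reconstruction error
    # Squared Frobenius norm of the encoder Jacobian dh/dx.
    # For a sigmoid encoder, row j of the Jacobian is h_j * (1 - h_j) * W[j].
    penalty = np.sum((h * (1.0 - h)) ** 2 * np.sum(W ** 2, axis=1))
    return h, recon + lam * penalty

def multilabel_head(h, V, d):
    """Independent per-label probabilities; V: (n_labels, k), d: (n_labels,)."""
    return sigmoid(V @ h + d)

def binary_cross_entropy(p, y, eps=1e-12):
    """Multi-label loss: sum of per-label binary cross-entropies."""
    p = np.clip(p, eps, 1.0 - eps)
    return -np.sum(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feat = rng.normal(size=16)              # stand-in for a CNN feature activation
    W = rng.normal(scale=0.1, size=(8, 16))
    code, ae_loss = contractive_autoencoder_loss(feat, W, np.zeros(8), np.zeros(16))
    V = rng.normal(scale=0.1, size=(5, 8))  # 5 hypothetical character labels
    probs = multilabel_head(code, V, np.zeros(5))
    target = np.array([1.0, 0.0, 0.0, 1.0, 0.0])  # characters present in the frame
    print(ae_loss, binary_cross_entropy(probs, target))
```

The sigmoid head scores each character independently, so several characters can be predicted for the same frame, which is the property that distinguishes multi-label from single-label classification.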

Original language: English
Title of host publication: MultiMedia Modeling - 25th International Conference, MMM 2019, Proceedings
Editors: Ioannis Kompatsiaris, Stefanos Vrochidis, Vasileios Mezaris, Wen-Huang Cheng, Benoit Huet, Cathal Gurrin
Publisher: Springer Verlag
Pages: 300-311
Number of pages: 12
ISBN (Print): 9783030057091
DOIs
Publication status: Published - 2019
Externally published: Yes
Event: 25th International Conference on MultiMedia Modeling, MMM 2019 - Thessaloniki, Greece
Duration: 8 Jan 2019 → 11 Jan 2019

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 11295 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 25th International Conference on MultiMedia Modeling, MMM 2019
Country/Territory: Greece
City: Thessaloniki
Period: 8/01/19 → 11/01/19

Keywords

  • Autoencoder
  • Character recognition
  • Convolutional neural network
  • Semantic projection
  • Video understanding
