Character Prediction in TV Series via a Semantic Projection Network

Ke Sun, Zhuo Lei, Jiasong Zhu, Xianxu Hou, Bozhi Liu, Guoping Qiu^*

^*Corresponding author for this work

doi:10.1007/978-3-030-05710-7_25

Research output: Chapter in Book or Report/Conference proceeding › Conference Proceeding › peer-review

Abstract

The goal of this paper is to automatically recognize characters in popular TV series. In contrast to conventional approaches, which rely on weak supervision afforded by transcripts, subtitles or character facial data, we formulate the problem as multi-label classification, which requires only label-level supervision. We propose a novel semantic projection network consisting of two stacked subnetworks with specially designed constraints. The first subnetwork is a contractive autoencoder which focuses on reconstructing feature activations extracted from a pre-trained single-label convolutional neural network (CNN). The second subnetwork functions as a region-based multi-label classifier which produces character labels for the input video frame and also reconstructs the input visual feature from the mapped semantic label space. Extensive experiments show that the proposed model achieves state-of-the-art performance in comparison with recent approaches on three challenging TV series datasets (The Big Bang Theory, The Defenders and Nirvana in Fire).
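
As a rough illustration of the two stacked parts described in the abstract, the NumPy sketch below computes a contractive-autoencoder loss on stand-in CNN features, followed by a multi-label loss and a reconstruction of the features from the label space. It is not the authors' implementation: the layer sizes, sigmoid activations, random weights, the weight `lam`, and every name in the code are assumptions made for illustration, and the region-based part of the classifier is left out because the abstract does not specify it.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def contractive_penalty(W, h):
    """Mean squared Frobenius norm of the encoder Jacobian dh/dx.

    Assumes a sigmoid encoder h = sigmoid(x @ W.T + b), for which
    dh_j/dx_i = h_j * (1 - h_j) * W[j, i].
    """
    dh_sq = (h * (1.0 - h)) ** 2       # (batch, hidden)
    w_sq = np.sum(W ** 2, axis=1)      # (hidden,)
    return np.mean(dh_sq @ w_sq)

def multilabel_bce(probs, targets, eps=1e-7):
    """Binary cross-entropy averaged over samples and labels."""
    p = np.clip(probs, eps, 1.0 - eps)
    return -np.mean(targets * np.log(p) + (1.0 - targets) * np.log(1.0 - p))

rng = np.random.default_rng(0)

# Hypothetical sizes: 4 frames, 16-d CNN features, 8 hidden units, 5 characters.
x = rng.normal(size=(4, 16))                      # stand-in for CNN activations
y = rng.integers(0, 2, size=(4, 5)).astype(float) # multi-hot character labels

W_enc = rng.normal(scale=0.1, size=(8, 16))
b_enc = np.zeros(8)
W_dec = rng.normal(scale=0.1, size=(16, 8))
W_cls = rng.normal(scale=0.1, size=(5, 8))
W_back = rng.normal(scale=0.1, size=(16, 5))

# First part: contractive autoencoder on the visual features.
h = sigmoid(x @ W_enc.T + b_enc)
x_hat = h @ W_dec.T
recon = np.mean((x - x_hat) ** 2)
penalty = contractive_penalty(W_enc, h)

# Second part: multi-label prediction, then reconstruction of the
# visual features from the predicted label scores.
probs = sigmoid(h @ W_cls.T)
cls_loss = multilabel_bce(probs, y)
x_from_labels = probs @ W_back.T
label_recon = np.mean((x - x_from_labels) ** 2)

lam = 0.1  # assumed weight on the contractive term
total = recon + lam * penalty + cls_loss + label_recon
print(f"total loss: {total:.4f}")
```

Using a sigmoid per label, rather than a softmax over labels, lets several characters be predicted for the same frame, which is what distinguishes the multi-label setting from single-label classification.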

Original language	English
Title of host publication	MultiMedia Modeling - 25th International Conference, MMM 2019, Proceedings
Editors	Ioannis Kompatsiaris, Stefanos Vrochidis, Vasileios Mezaris, Wen-Huang Cheng, Benoit Huet, Cathal Gurrin
Publisher	Springer Verlag
Pages	300-311
Number of pages	12
ISBN (Print)	9783030057091
DOIs	https://doi.org/10.1007/978-3-030-05710-7_25
Publication status	Published - 2019
Externally published	Yes
Event	25th International Conference on MultiMedia Modeling, MMM 2019 - Thessaloniki, Greece
Duration	8 Jan 2019 → 11 Jan 2019

Publication series

Name	Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume	11295 LNCS
ISSN (Print)	0302-9743
ISSN (Electronic)	1611-3349

Conference

Conference	25th International Conference on MultiMedia Modeling, MMM 2019
Country/Territory	Greece
City	Thessaloniki
Period	8/01/19 → 11/01/19

Keywords

Autoencoder
Character recognition
Convolutional neural network
Semantic projection
Video understanding

Access to Document

10.1007/978-3-030-05710-7_25

Cite this

Sun, K., Lei, Z., Zhu, J., Hou, X., Liu, B., & Qiu, G. (2019). Character Prediction in TV Series via a Semantic Projection Network. In I. Kompatsiaris, S. Vrochidis, V. Mezaris, W.-H. Cheng, B. Huet, & C. Gurrin (Eds.), MultiMedia Modeling - 25th International Conference, MMM 2019, Proceedings (pp. 300-311). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 11295 LNCS). Springer Verlag. https://doi.org/10.1007/978-3-030-05710-7_25

@inproceedings{770c4669a0be4ea9b80035a637e6cbc2,
  title = "Character Prediction in TV Series via a Semantic Projection Network",
  abstract = "The goal of this paper is to automatically recognize characters in popular TV series. In contrast to conventional approaches, which rely on weak supervision afforded by transcripts, subtitles or character facial data, we formulate the problem as multi-label classification, which requires only label-level supervision. We propose a novel semantic projection network consisting of two stacked subnetworks with specially designed constraints. The first subnetwork is a contractive autoencoder which focuses on reconstructing feature activations extracted from a pre-trained single-label convolutional neural network (CNN). The second subnetwork functions as a region-based multi-label classifier which produces character labels for the input video frame and also reconstructs the input visual feature from the mapped semantic label space. Extensive experiments show that the proposed model achieves state-of-the-art performance in comparison with recent approaches on three challenging TV series datasets (The Big Bang Theory, The Defenders and Nirvana in Fire).",
  keywords = "Autoencoder, Character recognition, Convolutional neural network, Semantic projection, Video understanding",
  author = "Ke Sun and Zhuo Lei and Jiasong Zhu and Xianxu Hou and Bozhi Liu and Guoping Qiu",
  note = "Publisher Copyright: {\textcopyright} 2019, Springer Nature Switzerland AG.; 25th International Conference on MultiMedia Modeling, MMM 2019 ; Conference date: 08-01-2019 Through 11-01-2019",
  year = "2019",
  doi = "10.1007/978-3-030-05710-7_25",
  language = "English",
  isbn = "9783030057091",
  series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
  publisher = "Springer Verlag",
  pages = "300--311",
  editor = "Ioannis Kompatsiaris and Stefanos Vrochidis and Vasileios Mezaris and Wen-Huang Cheng and Benoit Huet and Cathal Gurrin",
  booktitle = "MultiMedia Modeling - 25th International Conference, MMM 2019, Proceedings",
}

TY - GEN
T1 - Character Prediction in TV Series via a Semantic Projection Network
AU - Sun, Ke
AU - Lei, Zhuo
AU - Zhu, Jiasong
AU - Hou, Xianxu
AU - Liu, Bozhi
AU - Qiu, Guoping
PY - 2019
Y1 - 2019
AB - The goal of this paper is to automatically recognize characters in popular TV series. In contrast to conventional approaches, which rely on weak supervision afforded by transcripts, subtitles or character facial data, we formulate the problem as multi-label classification, which requires only label-level supervision. We propose a novel semantic projection network consisting of two stacked subnetworks with specially designed constraints. The first subnetwork is a contractive autoencoder which focuses on reconstructing feature activations extracted from a pre-trained single-label convolutional neural network (CNN). The second subnetwork functions as a region-based multi-label classifier which produces character labels for the input video frame and also reconstructs the input visual feature from the mapped semantic label space. Extensive experiments show that the proposed model achieves state-of-the-art performance in comparison with recent approaches on three challenging TV series datasets (The Big Bang Theory, The Defenders and Nirvana in Fire).
KW - Autoencoder
KW - Character recognition
KW - Convolutional neural network
KW - Semantic projection
KW - Video understanding
UR - http://www.scopus.com/inward/record.url?scp=85059847482&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-05710-7_25
DO - 10.1007/978-3-030-05710-7_25
M3 - Conference Proceeding
AN - SCOPUS:85059847482
SN - 9783030057091
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 300
EP - 311
BT - MultiMedia Modeling - 25th International Conference, MMM 2019, Proceedings
A2 - Kompatsiaris, Ioannis
A2 - Vrochidis, Stefanos
A2 - Mezaris, Vasileios
A2 - Cheng, Wen-Huang
A2 - Huet, Benoit
A2 - Gurrin, Cathal
PB - Springer Verlag
T2 - 25th International Conference on MultiMedia Modeling, MMM 2019
Y2 - 8 January 2019 through 11 January 2019
ER -
