Generating descriptions for sequential images with local-object attention and global semantic context modelling

Jing Su, Chenghua Lin, Mian Zhou, Qingyun Dai, Haoyu Lv

Research output: Chapter in Book/Report/Conference proceeding › Conference Proceeding › peer-review

2 Citations (Scopus)

Abstract

In this paper, we propose an end-to-end CNN-LSTM model with a local-object attention mechanism for generating descriptions of sequential images. To generate coherent descriptions, we capture global semantic context using a multilayer perceptron, which learns the dependencies between sequential images. A parallel LSTM network decodes the sequence of descriptions. Experimental results show that our model outperforms the baseline on three different evaluation metrics on the datasets published by Microsoft.
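The abstract gives enough to sketch the data flow. The PyTorch sketch below is an illustration under stated assumptions, not the authors' implementation: it wires regional CNN features through a local-object attention layer, models global semantic context with an MLP over all images in the sequence, and decodes with one LSTM cell per image in parallel. All names (SequentialCaptioner, etc.), dimensions, the mean-pooling of regions for the global MLP, and the use of teacher forcing are hypothetical choices.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SequentialCaptioner(nn.Module):
    """Minimal sketch (not the authors' code) of a CNN-LSTM model for
    sequential-image description: local-object attention over regional
    CNN features, an MLP for global semantic context, and parallel
    per-image LSTM decoders."""

    def __init__(self, vocab_size, feat_dim=512, embed_dim=256,
                 hidden_dim=512, seq_len=5):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        # Local-object attention: score each region feature against
        # the current decoder hidden state.
        self.attn = nn.Linear(feat_dim + hidden_dim, 1)
        # Global semantic context: an MLP over the (mean-pooled)
        # features of all images learns dependencies between them.
        self.global_mlp = nn.Sequential(
            nn.Linear(seq_len * feat_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim))
        # Parallel decoding: one LSTM cell per image in the sequence.
        self.cells = nn.ModuleList(
            nn.LSTMCell(embed_dim + feat_dim + hidden_dim, hidden_dim)
            for _ in range(seq_len))
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, regions, words):
        # regions: (B, S, R, feat_dim) regional CNN features per image
        # words:   (B, S, T) gold caption tokens (teacher forcing)
        B, S, R, D = regions.shape
        T = words.size(-1)
        g = self.global_mlp(regions.mean(dim=2).flatten(1))  # (B, H)
        logits = []
        for i in range(S):  # each image's caption, decoded independently
            h = torch.zeros(B, g.size(1), device=regions.device)
            c = torch.zeros_like(h)
            step_logits = []
            for t in range(T):
                # Attend over this image's object regions.
                scores = self.attn(torch.cat(
                    [regions[:, i],
                     h.unsqueeze(1).expand(-1, R, -1)], dim=-1))
                ctx = (F.softmax(scores, dim=1) * regions[:, i]).sum(dim=1)
                x = torch.cat([self.embed(words[:, i, t]), ctx, g], dim=-1)
                h, c = self.cells[i](x, (h, c))
                step_logits.append(self.out(h))
            logits.append(torch.stack(step_logits, dim=1))
        return torch.stack(logits, dim=1)  # (B, S, T, vocab_size)
```

In practice the regional features would come from a pretrained CNN; whether the original model shares decoder weights across positions or uses distinct cells, as assumed here, is not specified by the abstract.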

Original language: English
Title of host publication: 2IS and NLG 2018 - Workshop on Intelligent Interactive Systems and Language Generation, Proceedings of the Workshop
Publisher: Association for Computational Linguistics (ACL)
Pages: 3-8
Number of pages: 6
ISBN (Electronic): 9781948087889
Publication status: Published - 2018
Externally published: Yes
Event: 2018 Workshop on Intelligent Interactive Systems and Language Generation, 2IS and NLG 2018, collocated with the 11th International Conference on Natural Language Generation, INLG 2018 - Tilburg, Netherlands
Duration: 5 Nov 2018 → …

Publication series

Name: 2IS and NLG 2018 - Workshop on Intelligent Interactive Systems and Language Generation, Proceedings of the Workshop

Conference

Conference: 2018 Workshop on Intelligent Interactive Systems and Language Generation, 2IS and NLG 2018, collocated with the 11th International Conference on Natural Language Generation, INLG 2018
Country/Territory: Netherlands
City: Tilburg
Period: 5/11/18 → …
