Image Captioning using Adversarial Networks and Reinforcement Learning

Shiyang Yan, Fangyu Wu, Jeremy S. Smith, Wenjin Lu, Bailing Zhang

Research output: Chapter in Book or Report/Conference proceeding > Conference Proceeding > peer-review

18 Citations (Scopus)

Abstract

Image captioning is a significant task in artificial intelligence that connects computer vision and natural language processing. With the rapid development of deep learning, the sequence-to-sequence model with attention has become one of the main approaches to image captioning. Nevertheless, a significant issue exists in the current framework: the exposure bias problem of Maximum Likelihood Estimation (MLE) in the sequence model. To address this problem, we use generative adversarial networks (GANs) for image captioning, which compensate for the exposure bias of MLE and can also generate more realistic captions. GANs, however, cannot be applied directly to a discrete task such as language generation, because the data are discrete rather than continuous. Hence, we use a reinforcement learning (RL) technique to estimate the gradients for the network. To obtain intermediate rewards during language generation, we use a Monte Carlo roll-out sampling method. Experimental results on the COCO dataset validate the contribution of each component of the proposed model, and the overall effectiveness is also evaluated.
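The abstract does not give implementation details, so the following is only a minimal sketch of how Monte Carlo roll-outs can assign an intermediate reward to each position of a partially generated caption. It assumes a SeqGAN-style scheme: each prefix is completed several times by sampling from the generator, and the discriminator scores of the completions are averaged. The names `rollout_reward`, `intermediate_rewards`, `toy_policy`, and `toy_discriminator` are hypothetical, and the toy policy and discriminator stand in for the trained networks.

```python
import random
from typing import Callable, List, Sequence

# Hypothetical signatures: a policy samples the next token given a prefix,
# and a discriminator maps a complete caption to a score in [0, 1].
Policy = Callable[[Sequence[str], random.Random], str]
Discriminator = Callable[[Sequence[str]], float]


def rollout_reward(prefix: Sequence[str], policy: Policy,
                   discriminator: Discriminator, max_len: int,
                   n_rollouts: int, rng: random.Random) -> float:
    """Average discriminator score over sampled completions of `prefix`."""
    total = 0.0
    for _ in range(n_rollouts):
        caption = list(prefix)
        # Roll out: keep sampling tokens until the caption is complete.
        while len(caption) < max_len:
            caption.append(policy(caption, rng))
        total += discriminator(caption)
    return total / n_rollouts


def intermediate_rewards(caption: Sequence[str], policy: Policy,
                         discriminator: Discriminator,
                         n_rollouts: int = 16, seed: int = 0) -> List[float]:
    """Return one reward per token position of `caption`."""
    rng = random.Random(seed)
    max_len = len(caption)
    rewards = []
    for t in range(1, max_len + 1):
        if t == max_len:
            # The full caption needs no roll-out; score it directly.
            rewards.append(discriminator(list(caption)))
        else:
            rewards.append(rollout_reward(caption[:t], policy, discriminator,
                                          max_len, n_rollouts, rng))
    return rewards


# Toy stand-ins (assumptions for illustration, not the paper's models).
VOCAB = ["a", "dog", "runs", "on", "grass"]


def toy_policy(prefix: Sequence[str], rng: random.Random) -> str:
    return rng.choice(VOCAB)  # uniform sampling instead of a trained generator


def toy_discriminator(caption: Sequence[str]) -> float:
    return len(set(caption)) / len(caption)  # fraction of distinct tokens


if __name__ == "__main__":
    print(intermediate_rewards(VOCAB, toy_policy, toy_discriminator))
```

In a policy-gradient update, each per-position reward would weight the log-probability of the token chosen at that position. That step is omitted here because it depends on the generator architecture, which the abstract does not specify.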

Original language: English
Title of host publication: 2018 24th International Conference on Pattern Recognition, ICPR 2018
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 248-253
Number of pages: 6
ISBN (Electronic): 9781538637883
Publication status: Published - 26 Nov 2018
Event: 24th International Conference on Pattern Recognition, ICPR 2018 - Beijing, China
Duration: 20 Aug 2018 - 24 Aug 2018

Publication series

Name: Proceedings - International Conference on Pattern Recognition
Volume: 2018-August
ISSN (Print): 1051-4651

Conference

Conference: 24th International Conference on Pattern Recognition, ICPR 2018
Country/Territory: China
City: Beijing
Period: 20/08/18 - 24/08/18
