Enhanced LSTM with batch normalization

Li Na Wang, Guoqiang Zhong*, Shoujun Yan, Junyu Dong, Kaizhu Huang

*Corresponding author for this work

Research output: Chapter in Book/Report › Conference Proceeding › peer-review

4 Citations (Scopus)

Abstract

Recurrent neural networks (RNNs) are powerful models for sequence learning. However, training RNNs is complicated by the internal covariate shift problem, in which the input distribution at each iteration changes during training as the parameters are updated. Although some work has applied batch normalization (BN) to alleviate this problem in long short-term memory (LSTM), unfortunately, BN has not been applied to the update of the LSTM cell. In this paper, to tackle the internal covariate shift problem of LSTM, we introduce a method that successfully integrates BN into the update of the LSTM cell. Experimental results on two benchmark data sets, i.e. MNIST and Fashion-MNIST, show that the proposed method, enhanced LSTM with BN (eLSTM-BN), achieves faster convergence than LSTM and its variants, while obtaining higher classification accuracy on sequence learning tasks.
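The abstract describes applying batch normalization inside the LSTM cell update. The paper's exact equations are not reproduced in this record, so the following is only a minimal NumPy sketch of the general idea: normalizing the gate pre-activations and the cell state over the batch dimension before the nonlinearities. All function and parameter names here are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def batch_norm(x, gamma=0.1, beta=0.0, eps=1e-5):
    # Normalize over the batch dimension, then rescale.
    # gamma/beta would normally be learned; fixed here for illustration.
    mean = x.mean(axis=0, keepdims=True)
    var = x.var(axis=0, keepdims=True)
    return gamma * (x - mean) / np.sqrt(var + eps) + beta

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_bn_step(x, h_prev, c_prev, Wx, Wh, b):
    # One LSTM step with BN applied to the recurrent and input
    # pre-activations and to the cell state (a simplified sketch;
    # the placement in the actual eLSTM-BN model may differ).
    gates = batch_norm(x @ Wx) + batch_norm(h_prev @ Wh) + b
    i, f, o, g = np.split(gates, 4, axis=1)
    c = sigmoid(f) * c_prev + sigmoid(i) * np.tanh(g)
    h = sigmoid(o) * np.tanh(batch_norm(c))
    return h, c

# Toy dimensions: batch of 4, input size 3, hidden size 5.
rng = np.random.default_rng(0)
B, D, H = 4, 3, 5
x = rng.standard_normal((B, D))
h0 = np.zeros((B, H))
c0 = np.zeros((B, H))
Wx = rng.standard_normal((D, 4 * H)) * 0.1
Wh = rng.standard_normal((H, 4 * H)) * 0.1
b = np.zeros(4 * H)
h1, c1 = lstm_bn_step(x, h0, c0, Wx, Wh, b)
print(h1.shape)  # (4, 5)
```

Because the statistics are computed per mini-batch, such a layer behaves differently at inference time, where running averages of the mean and variance are typically used instead.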

Original language: English
Title of host publication: Neural Information Processing - 26th International Conference, ICONIP 2019, Proceedings
Editors: Tom Gedeon, Kok Wai Wong, Minho Lee
Publisher: Springer
Pages: 746-755
Number of pages: 10
ISBN (Print): 9783030367077
DOIs
Publication status: Published - 2019
Event: 26th International Conference on Neural Information Processing, ICONIP 2019 - Sydney, Australia
Duration: 12 Dec 2019 – 15 Dec 2019

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 11953 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 26th International Conference on Neural Information Processing, ICONIP 2019
Country/Territory: Australia
City: Sydney
Period: 12/12/19 – 15/12/19

Keywords

  • Batch normalization
  • Long short-term memory
  • Recurrent neural networks
