Enhanced LSTM with batch normalization

Li Na Wang, Guoqiang Zhong*, Shoujun Yan, Junyu Dong, Kaizhu Huang

*Corresponding author for this work

Research output: Chapter in Book/Report › Conference Proceeding › peer-review

4 Citations (Scopus)

Abstract

Recurrent neural networks (RNNs) are powerful models for sequence learning. However, training RNNs is complicated by the internal covariate shift problem, in which the input distribution at each iteration changes during training as the parameters are updated. Although some work has applied batch normalization (BN) to alleviate this problem in long short-term memory (LSTM), unfortunately, BN has not been applied to the update of the LSTM cell. In this paper, to tackle the internal covariate shift problem of LSTM, we introduce a method that successfully integrates BN into the update of the LSTM cell. Experimental results on two benchmark data sets, i.e. MNIST and Fashion-MNIST, show that the proposed method, enhanced LSTM with BN (eLSTM-BN), achieves faster convergence than LSTM and its variants, while obtaining higher classification accuracy on sequence learning tasks.
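The abstract describes applying batch normalization inside the LSTM cell update. The paper's exact equations are not reproduced in this record, so the following is only a minimal NumPy sketch of the general idea: normalizing the gate pre-activations and the cell state over the batch dimension before the nonlinearities. All function and parameter names here are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def batch_norm(x, gamma=0.1, beta=0.0, eps=1e-5):
    # Normalize over the batch dimension, then rescale.
    # gamma/beta would normally be learned; fixed here for illustration.
    mean = x.mean(axis=0, keepdims=True)
    var = x.var(axis=0, keepdims=True)
    return gamma * (x - mean) / np.sqrt(var + eps) + beta

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_bn_step(x, h_prev, c_prev, Wx, Wh, b):
    # One LSTM step with BN applied to the recurrent and input
    # pre-activations and to the cell state (a simplified sketch;
    # the placement in the actual eLSTM-BN model may differ).
    gates = batch_norm(x @ Wx) + batch_norm(h_prev @ Wh) + b
    i, f, o, g = np.split(gates, 4, axis=1)
    c = sigmoid(f) * c_prev + sigmoid(i) * np.tanh(g)
    h = sigmoid(o) * np.tanh(batch_norm(c))
    return h, c

# Toy dimensions: batch of 4, input size 3, hidden size 5.
rng = np.random.default_rng(0)
B, D, H = 4, 3, 5
x = rng.standard_normal((B, D))
h0 = np.zeros((B, H))
c0 = np.zeros((B, H))
Wx = rng.standard_normal((D, 4 * H)) * 0.1
Wh = rng.standard_normal((H, 4 * H)) * 0.1
b = np.zeros(4 * H)
h1, c1 = lstm_bn_step(x, h0, c0, Wx, Wh, b)
print(h1.shape)  # (4, 5)
```

Because the statistics are computed per mini-batch, such a layer behaves differently at inference time, where running averages of the mean and variance are typically used instead.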

Original language: English
Title of host publication: Neural Information Processing - 26th International Conference, ICONIP 2019, Proceedings
Editors: Tom Gedeon, Kok Wai Wong, Minho Lee
Publisher: Springer
Pages: 746-755
Number of pages: 10
ISBN (Print): 9783030367077
DOIs
Publication status: Published - 2019
Event: 26th International Conference on Neural Information Processing, ICONIP 2019 - Sydney, Australia
Duration: 12 Dec 2019 – 15 Dec 2019

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 11953 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 26th International Conference on Neural Information Processing, ICONIP 2019
Country/Territory: Australia
City: Sydney
Period: 12/12/19 – 15/12/19

Keywords

  • Batch normalization
  • Long short-term memory
  • Recurrent neural networks
