Improving Handwritten Mathematical Expression Recognition via an Attention Refinement Network

Jiayi Liu; Qiufeng Wang; Wei Liao; Jianghan Chen; Kaizhu Huang

doi:10.1007/978-981-99-8178-6_41

Corresponding author: Qiufeng Wang

Department of Intelligent Science

Research output: Chapter in Book or Report/Conference proceeding › Conference Proceeding › peer-review

Abstract

Handwritten mathematical expression recognition (HMER), typically regarded as a sequence-to-sequence problem, has made great progress in recent years, with RNN-based models being widely adopted. Although Transformer-based models have demonstrated success in many areas, their performance in HMER is unsatisfactory because of limitations of the standard attention mechanism. Therefore, we propose to improve performance via an attention refinement network in the Transformer framework for HMER. We first adopt shift window attention (SWA) from Swin Transformer to capture spatial contexts of the whole image. Moreover, we propose a refined coverage attention (RCA) to overcome the lack of coverage in the standard attention mechanism, where we utilize a convolutional kernel with a gating function to obtain coverage features. With the proposed RCA, we refine coverage attention to attenuate repeated focus on the same areas in long sequences. In addition, we utilize a pyramid data augmentation method to generate mathematical expression images at multiple resolutions to enhance model generalization. We evaluate the proposed attention refinement network on the HMER benchmark datasets CROHME 2014/2016/2019, and extensive experiments demonstrate its effectiveness.

Original language	English
Title of host publication	Neural Information Processing - 30th International Conference, ICONIP 2023, Proceedings
Editors	Biao Luo, Long Cheng, Zheng-Guang Wu, Hongyi Li, Chaojie Li
Publisher	Springer Science and Business Media Deutschland GmbH
Pages	543-555
Number of pages	13
ISBN (Print)	9789819981779
DOIs	https://doi.org/10.1007/978-981-99-8178-6_41
Publication status	Published - 2024
Event	30th International Conference on Neural Information Processing, ICONIP 2023 - Changsha, China Duration: 20 Nov 2023 → 23 Nov 2023

Publication series

Name	Communications in Computer and Information Science
Volume	1967 CCIS
ISSN (Print)	1865-0929
ISSN (Electronic)	1865-0937

Conference

Conference	30th International Conference on Neural Information Processing, ICONIP 2023
Country/Territory	China
City	Changsha
Period	20/11/23 → 23/11/23

Keywords

Handwritten mathematical expression recognition
Pyramid data augmentation
Refined coverage attention
Shift window attention

Cite this

Liu, J., Wang, Q., Liao, W., Chen, J., & Huang, K. (2024). Improving Handwritten Mathematical Expression Recognition via an Attention Refinement Network. In B. Luo, L. Cheng, Z.-G. Wu, H. Li, & C. Li (Eds.), Neural Information Processing - 30th International Conference, ICONIP 2023, Proceedings (pp. 543-555). (Communications in Computer and Information Science; Vol. 1967 CCIS). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-981-99-8178-6_41

Liu, Jiayi ; Wang, Qiufeng ; Liao, Wei et al. / Improving Handwritten Mathematical Expression Recognition via an Attention Refinement Network. Neural Information Processing - 30th International Conference, ICONIP 2023, Proceedings. editor / Biao Luo ; Long Cheng ; Zheng-Guang Wu ; Hongyi Li ; Chaojie Li. Springer Science and Business Media Deutschland GmbH, 2024. pp. 543-555 (Communications in Computer and Information Science).

@inproceedings{eebbfea8d1884fe090923a45a167e6ca,
title = "Improving Handwritten Mathematical Expression Recognition via an Attention Refinement Network",
abstract = "Handwritten mathematical expression recognition (HMER), typically regarded as a sequence-to-sequence problem, has made great progress in recent years, with RNN-based models being widely adopted. Although Transformer-based models have demonstrated success in many areas, their performance in HMER is unsatisfactory because of limitations of the standard attention mechanism. Therefore, we propose to improve performance via an attention refinement network in the Transformer framework for HMER. We first adopt shift window attention (SWA) from Swin Transformer to capture spatial contexts of the whole image. Moreover, we propose a refined coverage attention (RCA) to overcome the lack of coverage in the standard attention mechanism, where we utilize a convolutional kernel with a gating function to obtain coverage features. With the proposed RCA, we refine coverage attention to attenuate repeated focus on the same areas in long sequences. In addition, we utilize a pyramid data augmentation method to generate mathematical expression images at multiple resolutions to enhance model generalization. We evaluate the proposed attention refinement network on the HMER benchmark datasets CROHME 2014/2016/2019, and extensive experiments demonstrate its effectiveness.",
keywords = "Handwritten mathematical expression recognition, Pyramid data augmentation, Refined coverage attention, Shift window attention",
author = "Jiayi Liu and Qiufeng Wang and Wei Liao and Jianghan Chen and Kaizhu Huang",
note = "Publisher Copyright: {\textcopyright} 2024, The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.; 30th International Conference on Neural Information Processing, ICONIP 2023 ; Conference date: 20-11-2023 Through 23-11-2023",
year = "2024",
doi = "10.1007/978-981-99-8178-6_41",
language = "English",
isbn = "9789819981779",
series = "Communications in Computer and Information Science",
publisher = "Springer Science and Business Media Deutschland GmbH",
pages = "543--555",
editor = "Biao Luo and Long Cheng and Zheng-Guang Wu and Hongyi Li and Chaojie Li",
booktitle = "Neural Information Processing - 30th International Conference, ICONIP 2023, Proceedings",
}

Liu, J, Wang, Q, Liao, W, Chen, J & Huang, K 2024, Improving Handwritten Mathematical Expression Recognition via an Attention Refinement Network. in B Luo, L Cheng, Z-G Wu, H Li & C Li (eds), Neural Information Processing - 30th International Conference, ICONIP 2023, Proceedings. Communications in Computer and Information Science, vol. 1967 CCIS, Springer Science and Business Media Deutschland GmbH, pp. 543-555, 30th International Conference on Neural Information Processing, ICONIP 2023, Changsha, China, 20/11/23. https://doi.org/10.1007/978-981-99-8178-6_41

Improving Handwritten Mathematical Expression Recognition via an Attention Refinement Network. / Liu, Jiayi; Wang, Qiufeng; Liao, Wei et al.
Neural Information Processing - 30th International Conference, ICONIP 2023, Proceedings. ed. / Biao Luo; Long Cheng; Zheng-Guang Wu; Hongyi Li; Chaojie Li. Springer Science and Business Media Deutschland GmbH, 2024. p. 543-555 (Communications in Computer and Information Science; Vol. 1967 CCIS).

TY  - GEN
T1  - Improving Handwritten Mathematical Expression Recognition via an Attention Refinement Network
AU  - Liu, Jiayi
AU  - Wang, Qiufeng
AU  - Liao, Wei
AU  - Chen, Jianghan
AU  - Huang, Kaizhu
PY  - 2024
Y1  - 2024
N2  - Handwritten mathematical expression recognition (HMER), typically regarded as a sequence-to-sequence problem, has made great progress in recent years, with RNN-based models being widely adopted. Although Transformer-based models have demonstrated success in many areas, their performance in HMER is unsatisfactory because of limitations of the standard attention mechanism. Therefore, we propose to improve performance via an attention refinement network in the Transformer framework for HMER. We first adopt shift window attention (SWA) from Swin Transformer to capture spatial contexts of the whole image. Moreover, we propose a refined coverage attention (RCA) to overcome the lack of coverage in the standard attention mechanism, where we utilize a convolutional kernel with a gating function to obtain coverage features. With the proposed RCA, we refine coverage attention to attenuate repeated focus on the same areas in long sequences. In addition, we utilize a pyramid data augmentation method to generate mathematical expression images at multiple resolutions to enhance model generalization. We evaluate the proposed attention refinement network on the HMER benchmark datasets CROHME 2014/2016/2019, and extensive experiments demonstrate its effectiveness.
AB  - Handwritten mathematical expression recognition (HMER), typically regarded as a sequence-to-sequence problem, has made great progress in recent years, with RNN-based models being widely adopted. Although Transformer-based models have demonstrated success in many areas, their performance in HMER is unsatisfactory because of limitations of the standard attention mechanism. Therefore, we propose to improve performance via an attention refinement network in the Transformer framework for HMER. We first adopt shift window attention (SWA) from Swin Transformer to capture spatial contexts of the whole image. Moreover, we propose a refined coverage attention (RCA) to overcome the lack of coverage in the standard attention mechanism, where we utilize a convolutional kernel with a gating function to obtain coverage features. With the proposed RCA, we refine coverage attention to attenuate repeated focus on the same areas in long sequences. In addition, we utilize a pyramid data augmentation method to generate mathematical expression images at multiple resolutions to enhance model generalization. We evaluate the proposed attention refinement network on the HMER benchmark datasets CROHME 2014/2016/2019, and extensive experiments demonstrate its effectiveness.
KW  - Handwritten mathematical expression recognition
KW  - Pyramid data augmentation
KW  - Refined coverage attention
KW  - Shift window attention
UR  - http://www.scopus.com/inward/record.url?scp=85180153172&partnerID=8YFLogxK
U2  - 10.1007/978-981-99-8178-6_41
DO  - 10.1007/978-981-99-8178-6_41
M3  - Conference Proceeding
AN  - SCOPUS:85180153172
SN  - 9789819981779
T3  - Communications in Computer and Information Science
SP  - 543
EP  - 555
BT  - Neural Information Processing - 30th International Conference, ICONIP 2023, Proceedings
A2  - Luo, Biao
A2  - Cheng, Long
A2  - Wu, Zheng-Guang
A2  - Li, Hongyi
A2  - Li, Chaojie
PB  - Springer Science and Business Media Deutschland GmbH
T2  - 30th International Conference on Neural Information Processing, ICONIP 2023
Y2  - 20 November 2023 through 23 November 2023
ER  -

Liu J, Wang Q, Liao W, Chen J, Huang K. Improving Handwritten Mathematical Expression Recognition via an Attention Refinement Network. In Luo B, Cheng L, Wu ZG, Li H, Li C, editors, Neural Information Processing - 30th International Conference, ICONIP 2023, Proceedings. Springer Science and Business Media Deutschland GmbH. 2024. p. 543-555. (Communications in Computer and Information Science). doi: 10.1007/978-981-99-8178-6_41