TY - GEN
T1 - Visual-Textual Attention for Tree-Based Handwritten Mathematical Expression Recognition
AU - Liao, Wei
AU - Liu, Jiayi
AU - Chen, Jianghan
AU - Wang, Qiu Feng
AU - Huang, Kaizhu
N1 - Publisher Copyright: © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2024.
PY - 2024
Y1 - 2024
AB - Handwritten mathematical expression recognition (HMER) has attracted much attention and achieved remarkable progress under the encoder-decoder framework. However, it remains challenging due to complex structures and illegible handwriting. In this paper, we refine the encoder-decoder framework for HMER. First, we propose a multi-scale visual and textual attention fusion mechanism that enriches the context with both spatial and semantic information. Second, most HMER methods treat the task as a sequence-to-sequence problem (i.e., predicting a LaTeX string) and ignore the structural information in mathematical expressions. To address this, we use a tree decoder to capture such structural context. Furthermore, we propose a parent-children mutual learning method to enhance the training of our encoder-decoder model. Extensive experiments on the CROHME 2014, 2016 and 2019 HMER benchmark datasets demonstrate the effectiveness of the proposed method.
KW - Handwritten mathematical expression recognition
KW - Mutual learning
KW - Tree decoder
KW - Visual-textual attention
UR - http://www.scopus.com/inward/record.url?scp=85195127062&partnerID=8YFLogxK
U2 - 10.1007/978-981-97-1417-9_35
DO - 10.1007/978-981-97-1417-9_35
M3 - Conference Proceeding
AN - SCOPUS:85195127062
SN - 9789819714162
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 375
EP - 384
BT - Advances in Brain Inspired Cognitive Systems - 13th International Conference, BICS 2023, Proceedings
A2 - Ren, Jinchang
A2 - Hussain, Amir
A2 - Liao, Iman Yi
A2 - Chen, Rongjun
A2 - Huang, Kaizhu
A2 - Zhao, Huimin
A2 - Liu, Xiaoyong
A2 - Ma, Ping
A2 - Maul, Thomas
PB - Springer Science and Business Media Deutschland GmbH
T2 - 13th International Conference on Brain Inspired Cognitive Systems, BICS 2023
Y2 - 5 August 2023 through 6 August 2023
ER -