TY - GEN
T1 - Class Incremental Learning for Character String Recognition
AU - Hu, Yijie
AU - Zhang, Yan Ming
AU - Huang, Kaizhu
AU - Wang, Qiu Feng
N1 - Publisher Copyright: © The Author(s), under exclusive license to Springer Nature Switzerland AG 2024.
PY - 2024
Y1 - 2024
N2 - Character string recognition (CSR) has drawn much attention for document intelligence, but its performance is limited by the pre-defined character set without the ability to recognize new characters. To overcome this issue, class incremental learning (CIL) can be adopted where the model learns from new data instances incrementally over time. However, it is challenging to directly apply existing CIL methods in CSR because CSR is a typical sequence recognition problem. Without accurate alignment, the recognition error of new characters will affect the recognition of other characters in the same sequence. Moreover, the new characters are usually much fewer than the old ones, resulting in a data imbalance issue for learning new classes. To tackle the misalignment issue, we decouple the learning of feature alignment and classifiers during the incremental process in CSR. To handle the data imbalance issue, we propose a Prototype Incremental Learning framework for CSR, namely PIL-CSR. In the PIL-CSR framework, we propose a prototype-centered loss (PCL) to aid the model in facilitating better class separation, and we further propose a prototype separation and feature alignment (PSFA) strategy, allowing the model to adapt and learn new classes seamlessly. Finally, we collect a CSR dataset to evaluate CIL performance (github.com/tambourine666/Doc-CIL). Experimental results demonstrate the effectiveness of our proposed sequence CIL method, obtaining a significant improvement in both line-level and character-level accuracy.
KW - Character String Recognition
KW - Class Incremental Learning
KW - OCR
KW - Sequence-to-Sequence
UR - http://www.scopus.com/inward/record.url?scp=85204551902&partnerID=8YFLogxK
U2 - 10.1007/978-3-031-70549-6_24
DO - 10.1007/978-3-031-70549-6_24
M3 - Conference Proceeding
AN - SCOPUS:85204551902
SN - 9783031705489
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 405
EP - 420
BT - Document Analysis and Recognition - ICDAR 2024 - 18th International Conference, Proceedings
A2 - Barney Smith, Elisa H.
A2 - Liwicki, Marcus
A2 - Peng, Liangrui
PB - Springer Science and Business Media Deutschland GmbH
T2 - 18th International Conference on Document Analysis and Recognition, ICDAR 2024
Y2 - 30 August 2024 through 4 September 2024
ER -