TY - JOUR
T1 - TextTriangle
T2 - an end-to-end textspotter with piecewise linear alignment
AU - Xu, Hui
AU - Wang, Qiu Feng
AU - Li, Zhenghao
AU - Shi, Yu
AU - Zhou, Xiang Dong
N1 - Publisher Copyright: © The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature 2025.
PY - 2025
Y1 - 2025
N2 - Scene text detection and recognition have attracted increasing research attention recently, especially for texts of arbitrary shapes. In most text spotting methods, text feature alignment is a key component that connects the detector and the recognizer for end-to-end training. Existing alignment methods can be roughly categorized into those based on globally consistent transformations and those based on character-level classification. However, these methods either are unreliable for heavily deformed text or ignore contextual information in recognition. In this paper, we propose a novel text spotter named TextTriangle, which detects and recognizes arbitrary-shaped text in an end-to-end manner without character-level annotations. In TextTriangle, a text instance is described as a sequence of ordered triangles attached to each other. Based on this representation, a new PiecewiseAlign layer is designed to accurately extract features of text instances with arbitrary shapes, which is the key to making the framework end-to-end trainable. In contrast to methods based on globally consistent transformations, PiecewiseAlign adopts a piecewise linear transformation for feature calculation. Experiments show that PiecewiseAlign is superior to the TPS-based method in text alignment, and that TextTriangle achieves competitive performance on standard scene text benchmarks.
AB - Scene text detection and recognition have attracted increasing research attention recently, especially for texts of arbitrary shapes. In most text spotting methods, text feature alignment is a key component that connects the detector and the recognizer for end-to-end training. Existing alignment methods can be roughly categorized into those based on globally consistent transformations and those based on character-level classification. However, these methods either are unreliable for heavily deformed text or ignore contextual information in recognition. In this paper, we propose a novel text spotter named TextTriangle, which detects and recognizes arbitrary-shaped text in an end-to-end manner without character-level annotations. In TextTriangle, a text instance is described as a sequence of ordered triangles attached to each other. Based on this representation, a new PiecewiseAlign layer is designed to accurately extract features of text instances with arbitrary shapes, which is the key to making the framework end-to-end trainable. In contrast to methods based on globally consistent transformations, PiecewiseAlign adopts a piecewise linear transformation for feature calculation. Experiments show that PiecewiseAlign is superior to the TPS-based method in text alignment, and that TextTriangle achieves competitive performance on standard scene text benchmarks.
KW - End-to-end training
KW - Scene text detection
KW - Scene text recognition
KW - Scene text spotting
KW - Text feature alignment
UR - http://www.scopus.com/inward/record.url?scp=105000033682&partnerID=8YFLogxK
U2 - 10.1007/s10032-025-00517-x
DO - 10.1007/s10032-025-00517-x
M3 - Article
AN - SCOPUS:105000033682
SN - 1433-2833
JO - International Journal on Document Analysis and Recognition
JF - International Journal on Document Analysis and Recognition
ER -