TextTriangle: an end-to-end textspotter with piecewise linear alignment

Hui Xu; Qiu Feng Wang; Zhenghao Li; Yu Shi; Xiang Dong Zhou

doi:10.1007/s10032-025-00517-x

Hui Xu*, Qiu Feng Wang, Zhenghao Li, Yu Shi, Xiang Dong Zhou

*Corresponding author for this work

Department of Intelligent Science

Research output: Contribution to journal › Article › peer-review

Abstract

Scene text detection and recognition have attracted increasing research attention recently, especially for text of arbitrary shape. In most text spotting methods, text feature alignment is a key component that connects the detector and the recognizer for end-to-end training. Existing alignment methods can be roughly categorized into those based on globally consistent transformations and those based on character-level classification. However, these methods either are unreliable for heavily deformed text or ignore contextual information in recognition. In this paper, we propose a novel text spotter named TextTriangle, which detects and recognizes arbitrary-shaped text in an end-to-end manner without character-level annotations. In TextTriangle, a text instance is described as a sequence of ordered triangles attached to each other. Based on this representation, a new PiecewiseAlign layer is designed to accurately extract features of text instances with arbitrary shapes, which is the key to making the framework end-to-end trainable. In contrast to methods based on globally consistent transformations, PiecewiseAlign adopts a piecewise linear transformation for feature calculation. Experiments show that PiecewiseAlign is superior to the TPS-based method in text alignment, and TextTriangle achieves competitive performance on standard scene text benchmarks.
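The idea of a piecewise linear transformation over a strip of attached triangles can be illustrated with a minimal sketch. This is not the authors' PiecewiseAlign layer. It only assumes that the ordered triangles form a zigzag strip (top, bottom, top, bottom, ...) and that each triangle of a rectified rectangle maps to the matching triangle in the image through barycentric coordinates. The names `barycentric`, `rectified_strip`, and `map_point` are hypothetical, and the actual feature sampling step (for example, bilinear interpolation on a feature map) is omitted.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def barycentric(p: Point, a: Point, b: Point, c: Point) -> Tuple[float, float, float]:
    """Return the barycentric weights of p with respect to triangle (a, b, c)."""
    (x, y), (x1, y1), (x2, y2), (x3, y3) = p, a, b, c
    det = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)
    if det == 0:
        raise ValueError("degenerate triangle")
    w1 = ((y2 - y3) * (x - x3) + (x3 - x2) * (y - y3)) / det
    w2 = ((y3 - y1) * (x - x3) + (x1 - x3) * (y - y3)) / det
    return w1, w2, 1.0 - w1 - w2


def rectified_strip(n_vertices: int, step: float, height: float) -> List[Point]:
    """Build the zigzag vertices of a flat strip: even indices on top, odd on the bottom."""
    return [((k // 2) * step, 0.0 if k % 2 == 0 else height) for k in range(n_vertices)]


def map_point(p: Point, dst_strip: Sequence[Point], src_strip: Sequence[Point],
              eps: float = 1e-9) -> Point:
    """Map a point of the rectified strip to the image with a per-triangle affine map.

    Triangle i is made of vertices i, i+1, i+2 in both strips (assumed layout).
    """
    if len(dst_strip) != len(src_strip) or len(dst_strip) < 3:
        raise ValueError("strips must have the same length and at least 3 vertices")
    for i in range(len(dst_strip) - 2):
        weights = barycentric(p, *dst_strip[i:i + 3])
        if min(weights) >= -eps:  # p lies inside (or on the edge of) triangle i
            tri = src_strip[i:i + 3]
            x = sum(w * v[0] for w, v in zip(weights, tri))
            y = sum(w * v[1] for w, v in zip(weights, tri))
            return (x, y)
    raise ValueError("point lies outside the rectified strip")


# Example: a flat 2 x 2 strip mapped onto a sheared strip in the image.
dst = rectified_strip(4, step=2.0, height=2.0)
src = [(0.0, 0.0), (0.0, 2.0), (2.0, 1.0), (2.0, 3.0)]
sample = map_point((1.0, 1.0), dst, src)
```

Because each triangle carries its own affine map, a strongly curved text line only needs more triangles rather than a single global warp, which is the contrast with globally consistent transformations drawn in the abstract.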

Original language	English
Journal	International Journal on Document Analysis and Recognition
DOIs	https://doi.org/10.1007/s10032-025-00517-x
Publication status	Accepted/In press - 2025

Keywords

End-to-end training
Scene text detection
Scene text recognition
Scene text spotting
Text feature alignment

Cite this

@article{1c0ec34cbc1e4f1495f26ef983c10628,

title = "TextTriangle: an end-to-end textspotter with piecewise linear alignment",

abstract = "Scene text detection and recognition have attracted increasing research attention recently, especially for text of arbitrary shape. In most text spotting methods, text feature alignment is a key component that connects the detector and the recognizer for end-to-end training. Existing alignment methods can be roughly categorized into those based on globally consistent transformations and those based on character-level classification. However, these methods either are unreliable for heavily deformed text or ignore contextual information in recognition. In this paper, we propose a novel text spotter named TextTriangle, which detects and recognizes arbitrary-shaped text in an end-to-end manner without character-level annotations. In TextTriangle, a text instance is described as a sequence of ordered triangles attached to each other. Based on this representation, a new PiecewiseAlign layer is designed to accurately extract features of text instances with arbitrary shapes, which is the key to making the framework end-to-end trainable. In contrast to methods based on globally consistent transformations, PiecewiseAlign adopts a piecewise linear transformation for feature calculation. Experiments show that PiecewiseAlign is superior to the TPS-based method in text alignment, and TextTriangle achieves competitive performance on standard scene text benchmarks.",

keywords = "End-to-end training, Scene text detection, Scene text recognition, Scene text spotting, Text feature alignment",

author = "Hui Xu and Wang, {Qiu Feng} and Zhenghao Li and Yu Shi and Zhou, {Xiang Dong}",

note = "Publisher Copyright: {\textcopyright} The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature 2025.",

year = "2025",

doi = "10.1007/s10032-025-00517-x",

language = "English",

journal = "International Journal on Document Analysis and Recognition",

issn = "1433-2833",

}

TY - JOUR

T1 - TextTriangle

T2 - an end-to-end textspotter with piecewise linear alignment

AU - Xu, Hui

AU - Wang, Qiu Feng

AU - Li, Zhenghao

AU - Shi, Yu

AU - Zhou, Xiang Dong

N1 - Publisher Copyright: © The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature 2025.

PY - 2025

Y1 - 2025

N2 - Scene text detection and recognition have attracted increasing research attention recently, especially for text of arbitrary shape. In most text spotting methods, text feature alignment is a key component that connects the detector and the recognizer for end-to-end training. Existing alignment methods can be roughly categorized into those based on globally consistent transformations and those based on character-level classification. However, these methods either are unreliable for heavily deformed text or ignore contextual information in recognition. In this paper, we propose a novel text spotter named TextTriangle, which detects and recognizes arbitrary-shaped text in an end-to-end manner without character-level annotations. In TextTriangle, a text instance is described as a sequence of ordered triangles attached to each other. Based on this representation, a new PiecewiseAlign layer is designed to accurately extract features of text instances with arbitrary shapes, which is the key to making the framework end-to-end trainable. In contrast to methods based on globally consistent transformations, PiecewiseAlign adopts a piecewise linear transformation for feature calculation. Experiments show that PiecewiseAlign is superior to the TPS-based method in text alignment, and TextTriangle achieves competitive performance on standard scene text benchmarks.

KW - End-to-end training

KW - Scene text detection

KW - Scene text recognition

KW - Scene text spotting

KW - Text feature alignment

UR - http://www.scopus.com/inward/record.url?scp=105000033682&partnerID=8YFLogxK

U2 - 10.1007/s10032-025-00517-x

DO - 10.1007/s10032-025-00517-x

M3 - Article

AN - SCOPUS:105000033682

SN - 1433-2833

JO - International Journal on Document Analysis and Recognition

JF - International Journal on Document Analysis and Recognition

ER -
