Robust Text Image Recognition via Adversarial Sequence-to-Sequence Domain Adaptation

Yaping Zhang; Shuai Nie; Shan Liang; Wenju Liu

doi:10.1109/TIP.2021.3066903

Corresponding author: Wenju Liu

Research output: Contribution to journal › Article › peer-review

30 Citations (Scopus)

Abstract

Robust text reading is a very challenging problem because the distribution of text images changes significantly in real-world scenarios. One effective solution is to align the distributions of different domains using domain adaptation methods. However, we found that these methods can struggle when dealing with sequence-like text images. An important reason is that conventional domain adaptation methods strive to align images as a whole, while text images consist of variable-length, fine-grained character information. To address this issue, we propose a novel Adversarial Sequence-to-Sequence Domain Adaptation (ASSDA) method to learn 'where to adapt' and 'how to align' the sequential image. Our key idea is to mine the local regions that contain characters and focus on aligning them across domains in an adversarial manner. Extensive text recognition experiments show that ASSDA can efficiently transfer sequence knowledge, and they validate its promise under the various domain shifts found in real-world applications.

Original language	English
Article number	9384298
Pages (from-to)	3922-3933
Number of pages	12
Journal	IEEE Transactions on Image Processing
Volume	30
DOIs	https://doi.org/10.1109/TIP.2021.3066903
Publication status	Published - 2021
Externally published	Yes

Keywords

domain adaptation
Sequence-to-sequence
text image recognition

Cite this

@article{100ea9e907a34929b4ac2e73d0c2d722,
title = "Robust Text Image Recognition via Adversarial Sequence-to-Sequence Domain Adaptation",
abstract = "Robust text reading is a very challenging problem because the distribution of text images changes significantly in real-world scenarios. One effective solution is to align the distributions of different domains using domain adaptation methods. However, we found that these methods can struggle when dealing with sequence-like text images. An important reason is that conventional domain adaptation methods strive to align images as a whole, while text images consist of variable-length, fine-grained character information. To address this issue, we propose a novel Adversarial Sequence-to-Sequence Domain Adaptation (ASSDA) method to learn 'where to adapt' and 'how to align' the sequential image. Our key idea is to mine the local regions that contain characters and focus on aligning them across domains in an adversarial manner. Extensive text recognition experiments show that ASSDA can efficiently transfer sequence knowledge, and they validate its promise under the various domain shifts found in real-world applications.",
keywords = "domain adaptation, Sequence-to-sequence, text image recognition",
author = "Yaping Zhang and Shuai Nie and Shan Liang and Wenju Liu",
note = "Publisher Copyright: {\textcopyright} 1992-2012 IEEE.",
year = "2021",
doi = "10.1109/TIP.2021.3066903",
language = "English",
volume = "30",
pages = "3922--3933",
journal = "IEEE Transactions on Image Processing",
issn = "1057-7149",
}

TY  - JOUR
T1  - Robust Text Image Recognition via Adversarial Sequence-to-Sequence Domain Adaptation
AU  - Zhang, Yaping
AU  - Nie, Shuai
AU  - Liang, Shan
AU  - Liu, Wenju
PY  - 2021
Y1  - 2021
N2  - Robust text reading is a very challenging problem because the distribution of text images changes significantly in real-world scenarios. One effective solution is to align the distributions of different domains using domain adaptation methods. However, we found that these methods can struggle when dealing with sequence-like text images. An important reason is that conventional domain adaptation methods strive to align images as a whole, while text images consist of variable-length, fine-grained character information. To address this issue, we propose a novel Adversarial Sequence-to-Sequence Domain Adaptation (ASSDA) method to learn 'where to adapt' and 'how to align' the sequential image. Our key idea is to mine the local regions that contain characters and focus on aligning them across domains in an adversarial manner. Extensive text recognition experiments show that ASSDA can efficiently transfer sequence knowledge, and they validate its promise under the various domain shifts found in real-world applications.
AB  - Robust text reading is a very challenging problem because the distribution of text images changes significantly in real-world scenarios. One effective solution is to align the distributions of different domains using domain adaptation methods. However, we found that these methods can struggle when dealing with sequence-like text images. An important reason is that conventional domain adaptation methods strive to align images as a whole, while text images consist of variable-length, fine-grained character information. To address this issue, we propose a novel Adversarial Sequence-to-Sequence Domain Adaptation (ASSDA) method to learn 'where to adapt' and 'how to align' the sequential image. Our key idea is to mine the local regions that contain characters and focus on aligning them across domains in an adversarial manner. Extensive text recognition experiments show that ASSDA can efficiently transfer sequence knowledge, and they validate its promise under the various domain shifts found in real-world applications.
KW  - domain adaptation
KW  - Sequence-to-sequence
KW  - text image recognition
UR  - http://www.scopus.com/inward/record.url?scp=85103254759&partnerID=8YFLogxK
U2  - 10.1109/TIP.2021.3066903
DO  - 10.1109/TIP.2021.3066903
M3  - Article
C2  - 33755566
AN  - SCOPUS:85103254759
SN  - 1057-7149
VL  - 30
SP  - 3922
EP  - 3933
JO  - IEEE Transactions on Image Processing
JF  - IEEE Transactions on Image Processing
M1  - 9384298
ER  -
