TY - JOUR
T1 - Robust Text Image Recognition via Adversarial Sequence-to-Sequence Domain Adaptation
AU - Zhang, Yaping
AU - Nie, Shuai
AU - Liang, Shan
AU - Liu, Wenju
N1 - Publisher Copyright: © 1992-2012 IEEE.
PY - 2021
Y1 - 2021
N2 - Robust text reading is a very challenging problem because the distribution of text images changes significantly in real-world scenarios. One effective solution is to align the distributions of different domains using domain adaptation methods. However, we found that these methods may struggle when dealing with sequence-like text images. An important reason is that conventional domain adaptation methods strive to align images as a whole, while text images consist of variable-length, fine-grained character information. To address this issue, we propose a novel Adversarial Sequence-to-Sequence Domain Adaptation (ASSDA) method to learn 'where to adapt' and 'how to align' the sequential image. Our key idea is to mine the local regions that contain characters and focus on aligning them across domains in an adversarial manner. Extensive text recognition experiments show that ASSDA can efficiently transfer sequence knowledge, and validate its promise under the various domain shifts found in real-world applications.
AB - Robust text reading is a very challenging problem because the distribution of text images changes significantly in real-world scenarios. One effective solution is to align the distributions of different domains using domain adaptation methods. However, we found that these methods may struggle when dealing with sequence-like text images. An important reason is that conventional domain adaptation methods strive to align images as a whole, while text images consist of variable-length, fine-grained character information. To address this issue, we propose a novel Adversarial Sequence-to-Sequence Domain Adaptation (ASSDA) method to learn 'where to adapt' and 'how to align' the sequential image. Our key idea is to mine the local regions that contain characters and focus on aligning them across domains in an adversarial manner. Extensive text recognition experiments show that ASSDA can efficiently transfer sequence knowledge, and validate its promise under the various domain shifts found in real-world applications.
KW - domain adaptation
KW - Sequence-to-sequence
KW - text image recognition
UR - http://www.scopus.com/inward/record.url?scp=85103254759&partnerID=8YFLogxK
U2 - 10.1109/TIP.2021.3066903
DO - 10.1109/TIP.2021.3066903
M3 - Article
C2 - 33755566
AN - SCOPUS:85103254759
SN - 1057-7149
VL - 30
SP - 3922
EP - 3933
JO - IEEE Transactions on Image Processing
JF - IEEE Transactions on Image Processing
M1 - 9384298
ER -