Coarse-to-Fine Document Image Registration for Dewarping

Weiguang Zhang, Qiufeng Wang*, Kaizhu Huang, Xiaomeng Gu, Fengjun Guo

*Corresponding author for this work

Research output: Chapter in Book or Report/Conference proceeding › Conference Proceeding › peer-review

Abstract

Document dewarping has made great progress in recent years; however, it usually requires a huge number of document pairs with pixel-level annotation to learn a mapping function. Although photographed document images are easy to obtain, pixel-level annotation between warped and flat images is time-consuming and almost impossible for large-scale datasets. To overcome this issue, we propose to register photographed documents with their corresponding flat counterparts, automatically obtaining pixel-level mapping labels. Because real photographed documents are severely deformed, we introduce a coarse-to-fine registration pipeline that learns the global-scale transformation and the local detail alignment, respectively. In addition, the lack of registration labels motivates us to tailor a teacher-student dual branch under semi-supervised training, where the model is initialized on synthetic documents with labels. Furthermore, we contribute a large-scale dataset containing 12,500 triplets of synthetic-real-flat documents. Extensive experiments demonstrate the effectiveness of our proposed registration method. Specifically, trained on our registered documents with pixel-level labels, the dewarping model achieves performance comparable to state-of-the-art methods trained on almost 100× as many samples, showing the high quality of our registration results. Our dataset and code are available at https://github.com/hanquansanren/DIRD.
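The coarse-to-fine idea in the abstract can be illustrated with a minimal sketch: a single global transformation gives a coarse alignment, and a per-pixel offset refines it into a dense pixel-level mapping. This is not the authors' model, which learns both stages with networks. The 2x3 affine matrix, the hand-written residual offsets, and the function names `coarse_map` and `coarse_to_fine_mapping` are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def coarse_map(p: Point, affine: Sequence[Sequence[float]]) -> Point:
    """Apply a 2x3 affine matrix (hypothetical coarse, global alignment) to (x, y)."""
    (a, b, tx), (c, d, ty) = affine
    x, y = p
    return (a * x + b * y + tx, c * x + d * y + ty)


def coarse_to_fine_mapping(
    height: int,
    width: int,
    affine: Sequence[Sequence[float]],
    residual: Sequence[Sequence[Point]],
) -> List[List[Point]]:
    """Build mapping[y][x] = coarse affine position + fine per-pixel offset.

    `residual[y][x]` is an assumed (dx, dy) refinement for pixel (x, y).
    """
    mapping: List[List[Point]] = []
    for y in range(height):
        row: List[Point] = []
        for x in range(width):
            cx, cy = coarse_map((float(x), float(y)), affine)
            dx, dy = residual[y][x]
            row.append((cx + dx, cy + dy))
        mapping.append(row)
    return mapping


# Toy 2x2 example: the coarse stage is a pure translation by (2, 1);
# the fine stage nudges only the pixel at (x=1, y=0) by (0.5, -0.25).
affine = [[1.0, 0.0, 2.0], [0.0, 1.0, 1.0]]
residual = [
    [(0.0, 0.0), (0.5, -0.25)],
    [(0.0, 0.0), (0.0, 0.0)],
]
mapping = coarse_to_fine_mapping(2, 2, affine, residual)
```

In this toy setup the coarse stage alone would move every pixel by the same translation, while the residual changes only the pixel it targets. That mirrors, in a very simplified form, the split between global transformation and local detail alignment.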

Original language: English
Title of host publication: Document Analysis and Recognition - ICDAR 2024 - 18th International Conference, Proceedings
Editors: Elisa H. Barney Smith, Marcus Liwicki, Liangrui Peng
Publisher: Springer Science and Business Media Deutschland GmbH
Pages: 343-358
Number of pages: 16
ISBN (Print): 9783031705458
Publication status: Published - 2024
Event: 18th International Conference on Document Analysis and Recognition, ICDAR 2024 - Athens, Greece
Duration: 30 Aug 2024 → 4 Sept 2024

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 14807 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 18th International Conference on Document Analysis and Recognition, ICDAR 2024
Country/Territory: Greece
City: Athens
Period: 30/08/24 → 4/09/24

Keywords

  • Coarse-to-Fine
  • Document Dewarping
  • Document Registration
  • Semi-supervised Learning
