Document Registration: Towards Automated Labeling of Pixel-Level Alignment Between Warped-Flat Documents

Weiguang Zhang, Qiufeng Wang*, Kaizhu Huang, Xiaowei Huang, Fengjun Guo, Xiaomeng Gu

*Corresponding author for this work

Research output: Chapter in Book or Report/Conference proceedingConference Proceedingpeer-review

Abstract

Photographed documents are prevalent but often suffer from deformations like curves or folds, hindering readability. Consequently, document dewarping has been widely studied, however its performance is still not satisfied due to lack of real training samples with pixel-level annotation. To obtain the pixel-level labels, we leverage a document registration pipeline to automatically align warped-flat documents. Unlike general image registration works, registering documents poses unique challenges due to their severe deformations and fine-grained textures. In this paper, we introduce a coarse-to-fine framework including a coarse registration network (CRN) aiming to eliminate severe deformations then a fine registration network (FRN) focusing on fine-grained features. In addition, we utilize self-supervised learning to initialize our document registration model, where we propose a cross-reconstruction pre-training task on the pair of warped-flat documents. Extensive experiments show that we can achieve satisfied document registration performance, consequently obtaining a high-quality registered document dataset with pixel-level annotation. Without bells and whistles, we re-train two popular document dewarping models on our registered document dataset WarpDoc-R, and obtain superior performance with those using almost 100× scale of synthetic training data, verifying the label quality of our document registration method.

Original languageEnglish
Title of host publicationMM 2024 - Proceedings of the 32nd ACM International Conference on Multimedia
PublisherAssociation for Computing Machinery, Inc
Pages9933-9942
Number of pages10
ISBN (Electronic)9798400706868
DOIs
Publication statusPublished - 28 Oct 2024
Event32nd ACM International Conference on Multimedia, MM 2024 - Melbourne, Australia
Duration: 28 Oct 20241 Nov 2024

Publication series

NameMM 2024 - Proceedings of the 32nd ACM International Conference on Multimedia

Conference

Conference32nd ACM International Conference on Multimedia, MM 2024
Country/TerritoryAustralia
CityMelbourne
Period28/10/241/11/24

Keywords

  • document dewarping
  • document registration
  • image matching
  • photographed documents
  • pixel-level alignment

Fingerprint

Dive into the research topics of 'Document Registration: Towards Automated Labeling of Pixel-Level Alignment Between Warped-Flat Documents'. Together they form a unique fingerprint.

Cite this