A Segment-Based Layout Aware Model for Information Extraction on Document Images

Maizhen Ning, Qiu Feng Wang*, Kaizhu Huang, Xiaowei Huang

*Corresponding author for this work

Research output: Chapter in Book or Report/Conference proceedingConference Proceedingpeer-review

Abstract

Information extraction (IE) on document images has attracted considerable attention recently due to its great potentials for intelligent document analysis, where visual layout information is vital. However, most existing works mainly consider visual layout information at the token level, which unfortunately ignore long contexts and require time-consuming annotation. In this paper, we propose to model document visual layout information at the segment level. First, we obtain segment representation by integrating the segment-level layout information and text embedding. Since only segment-level layout annotation is needed, our model enjoys a low cost in comparison with the full annotation as needed at the token level. Then, word vectors are also extracted from each text segment to get the fine-grained representation. Finally, both segment and word vectors are fused for obtaining prediction results. Extensive experiments on the benchmark datasets are conducted to demonstrate the effectiveness of our novel method.

Original languageEnglish
Title of host publicationNeural Information Processing - 28th International Conference, ICONIP 2021, Proceedings
EditorsTeddy Mantoro, Minho Lee, Media Anugerah Ayu, Kok Wai Wong, Achmad Nizar Hidayanto
PublisherSpringer Science and Business Media Deutschland GmbH
Pages757-765
Number of pages9
ISBN (Print)9783030923068
DOIs
Publication statusPublished - 2021
Event28th International Conference on Neural Information Processing, ICONIP 2021 - Virtual, Online
Duration: 8 Dec 202112 Dec 2021

Publication series

NameCommunications in Computer and Information Science
Volume1516 CCIS
ISSN (Print)1865-0929
ISSN (Electronic)1865-0937

Conference

Conference28th International Conference on Neural Information Processing, ICONIP 2021
CityVirtual, Online
Period8/12/2112/12/21

Keywords

  • Document intelligence
  • Information extraction
  • Segment representation
  • Visual layout information
  • Weak annotation

Cite this