A Segment-Based Layout Aware Model for Information Extraction on Document Images

Maizhen Ning; Qiu Feng Wang; Kaizhu Huang; Xiaowei Huang

doi:10.1007/978-3-030-92307-5_88

Corresponding author: Qiu Feng Wang

Research output: Chapter in Book or Report/Conference proceeding › Conference Proceeding › peer-review

Abstract

Information extraction (IE) on document images has attracted considerable attention recently due to its great potential for intelligent document analysis, where visual layout information is vital. However, most existing works mainly consider visual layout information at the token level, which unfortunately ignores long contexts and requires time-consuming annotation. In this paper, we propose to model document visual layout information at the segment level. First, we obtain a segment representation by integrating the segment-level layout information and text embedding. Since only segment-level layout annotation is needed, our model has a low annotation cost compared with the full annotation needed at the token level. Then, word vectors are also extracted from each text segment to obtain a fine-grained representation. Finally, the segment and word vectors are fused to obtain the prediction results. Extensive experiments on benchmark datasets demonstrate the effectiveness of our method.
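
The three steps named in the abstract (segment representation, word vectors, fusion) can be illustrated with a toy sketch. This is not the authors' architecture: the abstract does not specify the encoders or the fusion operator, so mean pooling, normalized bounding-box features, and plain concatenation are assumptions made here for illustration, and all function names (`normalize_box`, `mean_pool`, `segment_representation`, `fuse`) are hypothetical.

```python
from typing import List, Sequence

Vector = List[float]


def normalize_box(box: Sequence[float], page_w: float, page_h: float) -> Vector:
    """Scale a segment bounding box (x0, y0, x1, y1) by the page size."""
    x0, y0, x1, y1 = box
    return [x0 / page_w, y0 / page_h, x1 / page_w, y1 / page_h]


def mean_pool(vectors: Sequence[Vector]) -> Vector:
    """Average the word vectors into one text embedding (assumed pooling)."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def segment_representation(
    word_vecs: Sequence[Vector],
    box: Sequence[float],
    page_w: float,
    page_h: float,
) -> Vector:
    """Segment vector = pooled text embedding followed by layout features."""
    return mean_pool(word_vecs) + normalize_box(box, page_w, page_h)


def fuse(word_vecs: Sequence[Vector], segment_vec: Vector) -> List[Vector]:
    """Append the segment vector to each word vector (assumed fusion)."""
    return [list(w) + segment_vec for w in word_vecs]


# Toy segment: two words with 2-d embeddings and one box on a 100 x 200 page.
words = [[1.0, 0.0], [0.0, 1.0]]
seg = segment_representation(words, box=(10, 20, 50, 40), page_w=100, page_h=200)
fused = fuse(words, seg)
```

Only one bounding box per segment is needed as input here, which mirrors the abstract's point that segment-level layout annotation is cheaper than a box per token. In a real model the fused vectors would feed a learned classifier that predicts a label for each word.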

Original language	English
Title of host publication	Neural Information Processing - 28th International Conference, ICONIP 2021, Proceedings
Editors	Teddy Mantoro, Minho Lee, Media Anugerah Ayu, Kok Wai Wong, Achmad Nizar Hidayanto
Publisher	Springer Science and Business Media Deutschland GmbH
Pages	757-765
Number of pages	9
ISBN (Print)	9783030923068
DOIs	https://doi.org/10.1007/978-3-030-92307-5_88
Publication status	Published - 2021
Event	28th International Conference on Neural Information Processing, ICONIP 2021 - Virtual, Online. Duration: 8 Dec 2021 → 12 Dec 2021

Publication series

Name	Communications in Computer and Information Science
Volume	1516 CCIS
ISSN (Print)	1865-0929
ISSN (Electronic)	1865-0937

Conference

Conference	28th International Conference on Neural Information Processing, ICONIP 2021
City	Virtual, Online
Period	8 Dec 2021 → 12 Dec 2021

Keywords

Document intelligence
Information extraction
Segment representation
Visual layout information
Weak annotation

Cite this

Ning, M., Wang, Q. F., Huang, K., & Huang, X. (2021). A Segment-Based Layout Aware Model for Information Extraction on Document Images. In T. Mantoro, M. Lee, M. A. Ayu, K. W. Wong, & A. N. Hidayanto (Eds.), Neural Information Processing - 28th International Conference, ICONIP 2021, Proceedings (pp. 757-765). (Communications in Computer and Information Science; Vol. 1516 CCIS). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-030-92307-5_88

Ning, Maizhen ; Wang, Qiu Feng ; Huang, Kaizhu et al. / A Segment-Based Layout Aware Model for Information Extraction on Document Images. Neural Information Processing - 28th International Conference, ICONIP 2021, Proceedings. editor / Teddy Mantoro ; Minho Lee ; Media Anugerah Ayu ; Kok Wai Wong ; Achmad Nizar Hidayanto. Springer Science and Business Media Deutschland GmbH, 2021. pp. 757-765 (Communications in Computer and Information Science).

@inproceedings{6c9b12b15f6849cebf6e24a7c8c4d7f0,

title = "A Segment-Based Layout Aware Model for Information Extraction on Document Images",

keywords = "Document intelligence, Information extraction, Segment representation, Visual layout information, Weak annotation",

author = "Maizhen Ning and Wang, {Qiu Feng} and Kaizhu Huang and Xiaowei Huang",

note = "Funding Information: Acknowledgments. The work was partially supported by the following: National Natural Science Foundation of China under no. 61876154 and no. 61876155; Jiangsu Science and Technology Programme (Natural Science Foundation of Jiangsu Province) under no. BE2020006-4 and BK20181190; Key Program Special Fund in XJTLU under no. KSF-T-06, KSF-E-26, and KSF-A-10, and XJTLU Research Development Fund RDF-16-02-49. Publisher Copyright: {\textcopyright} 2021, Springer Nature Switzerland AG.; 28th International Conference on Neural Information Processing, ICONIP 2021 ; Conference date: 08-12-2021 Through 12-12-2021",

year = "2021",

doi = "10.1007/978-3-030-92307-5_88",

language = "English",

isbn = "9783030923068",

series = "Communications in Computer and Information Science",

publisher = "Springer Science and Business Media Deutschland GmbH",

pages = "757--765",

editor = "Teddy Mantoro and Minho Lee and Ayu, {Media Anugerah} and Wong, {Kok Wai} and Hidayanto, {Achmad Nizar}",

booktitle = "Neural Information Processing - 28th International Conference, ICONIP 2021, Proceedings",

}

Ning, M, Wang, QF, Huang, K & Huang, X 2021, A Segment-Based Layout Aware Model for Information Extraction on Document Images. in T Mantoro, M Lee, MA Ayu, KW Wong & AN Hidayanto (eds), Neural Information Processing - 28th International Conference, ICONIP 2021, Proceedings. Communications in Computer and Information Science, vol. 1516 CCIS, Springer Science and Business Media Deutschland GmbH, pp. 757-765, 28th International Conference on Neural Information Processing, ICONIP 2021, Virtual, Online, 8/12/21. https://doi.org/10.1007/978-3-030-92307-5_88

A Segment-Based Layout Aware Model for Information Extraction on Document Images. / Ning, Maizhen; Wang, Qiu Feng; Huang, Kaizhu et al.
Neural Information Processing - 28th International Conference, ICONIP 2021, Proceedings. ed. / Teddy Mantoro; Minho Lee; Media Anugerah Ayu; Kok Wai Wong; Achmad Nizar Hidayanto. Springer Science and Business Media Deutschland GmbH, 2021. p. 757-765 (Communications in Computer and Information Science; Vol. 1516 CCIS).

TY - GEN

T1 - A Segment-Based Layout Aware Model for Information Extraction on Document Images

AU - Ning, Maizhen

AU - Wang, Qiu Feng

AU - Huang, Kaizhu

AU - Huang, Xiaowei

N1 - Funding Information: Acknowledgments. The work was partially supported by the following: National Natural Science Foundation of China under no. 61876154 and no. 61876155; Jiangsu Science and Technology Programme (Natural Science Foundation of Jiangsu Province) under no. BE2020006-4 and BK20181190; Key Program Special Fund in XJTLU under no. KSF-T-06, KSF-E-26, and KSF-A-10, and XJTLU Research Development Fund RDF-16-02-49. Publisher Copyright: © 2021, Springer Nature Switzerland AG.

PY - 2021

Y1 - 2021

AB - Information extraction (IE) on document images has attracted considerable attention recently due to its great potential for intelligent document analysis, where visual layout information is vital. However, most existing works mainly consider visual layout information at the token level, which unfortunately ignores long contexts and requires time-consuming annotation. In this paper, we propose to model document visual layout information at the segment level. First, we obtain a segment representation by integrating the segment-level layout information and text embedding. Since only segment-level layout annotation is needed, our model has a low annotation cost compared with the full annotation needed at the token level. Then, word vectors are also extracted from each text segment to obtain a fine-grained representation. Finally, the segment and word vectors are fused to obtain the prediction results. Extensive experiments on benchmark datasets demonstrate the effectiveness of our method.

KW - Document intelligence

KW - Information extraction

KW - Segment representation

KW - Visual layout information

KW - Weak annotation

UR - http://www.scopus.com/inward/record.url?scp=85121915498&partnerID=8YFLogxK

U2 - 10.1007/978-3-030-92307-5_88

DO - 10.1007/978-3-030-92307-5_88

M3 - Conference Proceeding

AN - SCOPUS:85121915498

SN - 9783030923068

T3 - Communications in Computer and Information Science

SP - 757

EP - 765

BT - Neural Information Processing - 28th International Conference, ICONIP 2021, Proceedings

A2 - Mantoro, Teddy

A2 - Lee, Minho

A2 - Ayu, Media Anugerah

A2 - Wong, Kok Wai

A2 - Hidayanto, Achmad Nizar

PB - Springer Science and Business Media Deutschland GmbH

T2 - 28th International Conference on Neural Information Processing, ICONIP 2021

Y2 - 8 December 2021 through 12 December 2021

ER -

Ning M, Wang QF, Huang K, Huang X. A Segment-Based Layout Aware Model for Information Extraction on Document Images. In Mantoro T, Lee M, Ayu MA, Wong KW, Hidayanto AN, editors, Neural Information Processing - 28th International Conference, ICONIP 2021, Proceedings. Springer Science and Business Media Deutschland GmbH. 2021. p. 757-765. (Communications in Computer and Information Science). doi: 10.1007/978-3-030-92307-5_88