A Symbolic Characters Aware Model for Solving Geometry Problems

Maizhen Ning; Qiu Feng Wang; Kaizhu Huang; Xiaowei Huang

doi:10.1145/3581783.3612570

Maizhen Ning, Qiu Feng Wang*, Kaizhu Huang, Xiaowei Huang

*Corresponding author for this work

Department of Intelligent Science

Research output: Chapter in Book or Report/Conference proceeding › Conference Proceeding › peer-review

8 Citations (Scopus)

Abstract

AI has made significant progress in solving math problems, but geometry problems remain challenging due to their reliance on both text and diagrams. In the text description, symbolic characters such as "ABC" often serve as a bridge to connect the corresponding diagram. However, by simply tokenizing symbolic characters into individual letters (e.g., 'A', 'B' and 'C'), existing works fail to study them explicitly and thus lose the semantic relationship with the diagram. In this paper, we develop a symbolic character-aware model to fully explore the role of these characters in both text and diagram understanding and optimize the model under a multi-modal reasoning framework. In the text encoder, we propose merging individual symbolic characters to form one semantic unit along with geometric information from the corresponding diagram. For the diagram encoder, we pre-train it under a multi-label classification framework with the symbolic characters as labels. In addition, we enhance the geometry diagram understanding ability via a self-supervised learning method under the masked image modeling auxiliary task. By integrating the proposed model into a general encoder-decoder pipeline for solving geometry problems, we demonstrate its superiority on two benchmark datasets, including GeoQA and Geometry3K, with extensive experiments. Specifically, on GeoQA, the question-solving accuracy is increased from 60.0% to 64.1%, achieving a new state-of-the-art accuracy; on Geometry3K, we reduce the question average solving steps from 6.9 down to 6.0 with marginally higher solving accuracy.
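As a rough illustration of the text-side idea in the abstract, the sketch below merges runs of single uppercase letters (e.g. 'A', 'B', 'C') back into one token ("ABC"). It is an assumption-laden toy, not the authors' implementation: the function name `merge_symbolic_tokens`, the uppercase-letter heuristic, and the sample sentence are all hypothetical, and the fusion with geometric information from the diagram described in the paper is not modeled here.

```python
import re

def merge_symbolic_tokens(tokens):
    """Merge consecutive single uppercase-letter tokens into one unit.

    Hypothetical helper: treats any run of tokens matching [A-Z] as one
    symbolic-character group such as "ABC". This is only a heuristic.
    """
    merged = []
    run = []  # letters of the symbolic group currently being collected
    for tok in tokens:
        if re.fullmatch(r"[A-Z]", tok):
            run.append(tok)
        else:
            if run:
                merged.append("".join(run))
                run = []
            merged.append(tok)
    if run:  # flush a group that ends the sequence
        merged.append("".join(run))
    return merged

# Letter-level tokenization of a made-up problem statement.
tokens = ["In", "triangle", "A", "B", "C", ",", "A", "B", "=", "5"]
print(merge_symbolic_tokens(tokens))
# ['In', 'triangle', 'ABC', ',', 'AB', '=', '5']
```

A real tokenizer would need a more careful rule than "single uppercase letter", since ordinary words can also tokenize to one capital letter.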

Original language	English
Title of host publication	MM 2023 - Proceedings of the 31st ACM International Conference on Multimedia
Publisher	Association for Computing Machinery, Inc
Pages	7767-7775
Number of pages	9
ISBN (Electronic)	9798400701085
DOIs	https://doi.org/10.1145/3581783.3612570
Publication status	Published - 26 Oct 2023
Event	31st ACM International Conference on Multimedia, MM 2023 - Ottawa, Canada
Duration	29 Oct 2023 → 3 Nov 2023

Publication series

Name	MM 2023 - Proceedings of the 31st ACM International Conference on Multimedia

Conference

Conference	31st ACM International Conference on Multimedia, MM 2023
Country/Territory	Canada
City	Ottawa
Period	29/10/23 → 3/11/23

Keywords

diagram encoder
geometry problems solver
multi-modal reasoning
symbolic characters

Cite this

Ning, M., Wang, Q. F., Huang, K., & Huang, X. (2023). A Symbolic Characters Aware Model for Solving Geometry Problems. In MM 2023 - Proceedings of the 31st ACM International Conference on Multimedia (pp. 7767-7775). (MM 2023 - Proceedings of the 31st ACM International Conference on Multimedia). Association for Computing Machinery, Inc. https://doi.org/10.1145/3581783.3612570

@inproceedings{d75e8f86996d43d7a107b86bf724f027,

title = "A Symbolic Characters Aware Model for Solving Geometry Problems",

abstract = "AI has made significant progress in solving math problems, but geometry problems remain challenging due to their reliance on both text and diagrams. In the text description, symbolic characters such as {"}ABC{"} often serve as a bridge to connect the corresponding diagram. However, by simply tokenizing symbolic characters into individual letters (e.g., 'A', 'B' and 'C'), existing works fail to study them explicitly and thus lose the semantic relationship with the diagram. In this paper, we develop a symbolic character-aware model to fully explore the role of these characters in both text and diagram understanding and optimize the model under a multi-modal reasoning framework. In the text encoder, we propose merging individual symbolic characters to form one semantic unit along with geometric information from the corresponding diagram. For the diagram encoder, we pre-train it under a multi-label classification framework with the symbolic characters as labels. In addition, we enhance the geometry diagram understanding ability via a self-supervised learning method under the masked image modeling auxiliary task. By integrating the proposed model into a general encoder-decoder pipeline for solving geometry problems, we demonstrate its superiority on two benchmark datasets, including GeoQA and Geometry3K, with extensive experiments. Specifically, on GeoQA, the question-solving accuracy is increased from 60.0% to 64.1%, achieving a new state-of-the-art accuracy; on Geometry3K, we reduce the question average solving steps from 6.9 down to 6.0 with marginally higher solving accuracy.",

keywords = "diagram encoder, geometry problems solver, multi-modal reasoning, symbolic characters",

author = "Maizhen Ning and Wang, {Qiu Feng} and Kaizhu Huang and Xiaowei Huang",

note = "Publisher Copyright: {\textcopyright} 2023 ACM.; 31st ACM International Conference on Multimedia, MM 2023 ; Conference date: 29-10-2023 Through 03-11-2023",

year = "2023",

month = oct,

day = "26",

doi = "10.1145/3581783.3612570",

language = "English",

series = "MM 2023 - Proceedings of the 31st ACM International Conference on Multimedia",

publisher = "Association for Computing Machinery, Inc",

pages = "7767--7775",

booktitle = "MM 2023 - Proceedings of the 31st ACM International Conference on Multimedia",

}

Ning, M, Wang, QF, Huang, K & Huang, X 2023, A Symbolic Characters Aware Model for Solving Geometry Problems. in MM 2023 - Proceedings of the 31st ACM International Conference on Multimedia. MM 2023 - Proceedings of the 31st ACM International Conference on Multimedia, Association for Computing Machinery, Inc, pp. 7767-7775, 31st ACM International Conference on Multimedia, MM 2023, Ottawa, Canada, 29/10/23. https://doi.org/10.1145/3581783.3612570

A Symbolic Characters Aware Model for Solving Geometry Problems. / Ning, Maizhen; Wang, Qiu Feng; Huang, Kaizhu et al.
MM 2023 - Proceedings of the 31st ACM International Conference on Multimedia. Association for Computing Machinery, Inc, 2023. p. 7767-7775 (MM 2023 - Proceedings of the 31st ACM International Conference on Multimedia).

TY - GEN

T1 - A Symbolic Characters Aware Model for Solving Geometry Problems

AU - Ning, Maizhen

AU - Wang, Qiu Feng

AU - Huang, Kaizhu

AU - Huang, Xiaowei

PY - 2023/10/26

Y1 - 2023/10/26

N2 - AI has made significant progress in solving math problems, but geometry problems remain challenging due to their reliance on both text and diagrams. In the text description, symbolic characters such as "ABC" often serve as a bridge to connect the corresponding diagram. However, by simply tokenizing symbolic characters into individual letters (e.g., 'A', 'B' and 'C'), existing works fail to study them explicitly and thus lose the semantic relationship with the diagram. In this paper, we develop a symbolic character-aware model to fully explore the role of these characters in both text and diagram understanding and optimize the model under a multi-modal reasoning framework. In the text encoder, we propose merging individual symbolic characters to form one semantic unit along with geometric information from the corresponding diagram. For the diagram encoder, we pre-train it under a multi-label classification framework with the symbolic characters as labels. In addition, we enhance the geometry diagram understanding ability via a self-supervised learning method under the masked image modeling auxiliary task. By integrating the proposed model into a general encoder-decoder pipeline for solving geometry problems, we demonstrate its superiority on two benchmark datasets, including GeoQA and Geometry3K, with extensive experiments. Specifically, on GeoQA, the question-solving accuracy is increased from 60.0% to 64.1%, achieving a new state-of-the-art accuracy; on Geometry3K, we reduce the question average solving steps from 6.9 down to 6.0 with marginally higher solving accuracy.

AB - AI has made significant progress in solving math problems, but geometry problems remain challenging due to their reliance on both text and diagrams. In the text description, symbolic characters such as "ABC" often serve as a bridge to connect the corresponding diagram. However, by simply tokenizing symbolic characters into individual letters (e.g., 'A', 'B' and 'C'), existing works fail to study them explicitly and thus lose the semantic relationship with the diagram. In this paper, we develop a symbolic character-aware model to fully explore the role of these characters in both text and diagram understanding and optimize the model under a multi-modal reasoning framework. In the text encoder, we propose merging individual symbolic characters to form one semantic unit along with geometric information from the corresponding diagram. For the diagram encoder, we pre-train it under a multi-label classification framework with the symbolic characters as labels. In addition, we enhance the geometry diagram understanding ability via a self-supervised learning method under the masked image modeling auxiliary task. By integrating the proposed model into a general encoder-decoder pipeline for solving geometry problems, we demonstrate its superiority on two benchmark datasets, including GeoQA and Geometry3K, with extensive experiments. Specifically, on GeoQA, the question-solving accuracy is increased from 60.0% to 64.1%, achieving a new state-of-the-art accuracy; on Geometry3K, we reduce the question average solving steps from 6.9 down to 6.0 with marginally higher solving accuracy.

KW - diagram encoder

KW - geometry problems solver

KW - multi-modal reasoning

KW - symbolic characters

UR - http://www.scopus.com/inward/record.url?scp=85179557833&partnerID=8YFLogxK

U2 - 10.1145/3581783.3612570

DO - 10.1145/3581783.3612570

M3 - Conference Proceeding

AN - SCOPUS:85179557833

T3 - MM 2023 - Proceedings of the 31st ACM International Conference on Multimedia

SP - 7767

EP - 7775

BT - MM 2023 - Proceedings of the 31st ACM International Conference on Multimedia

PB - Association for Computing Machinery, Inc

T2 - 31st ACM International Conference on Multimedia, MM 2023

Y2 - 29 October 2023 through 3 November 2023

ER -
