TY - GEN
T1 - Enhancing Text Comprehension via Fusing Pre-trained Language Model with Knowledge Graph
AU - Qian, Jing
AU - Li, Gangmin
AU - Atkinson, Katie
AU - Yue, Yong
N1 - Publisher Copyright:
© 2023 Copyright held by the owner/author(s). Publication rights licensed to ACM.
PY - 2023/12/22
Y1 - 2023/12/22
N2 - Pre-trained language models (PLMs) such as BERT and the GPT family capture rich linguistic and syntactic knowledge from pre-training on large-scale text corpora and can be further fine-tuned for specific downstream tasks. However, these models still have limitations, as they rely on knowledge gained from plain text and ignore structured knowledge such as knowledge graphs (KGs). Recently, there has been a growing trend of explicitly integrating KGs into PLMs to improve their performance. For instance, K-BERT incorporates KG triples into input sentences as domain-specific supplements. Nevertheless, we have observed that such methods do not consider the semantic relevance between the introduced knowledge and the original input sentence, leading to the issue of knowledge impurities. To address this issue, we propose a semantic matching-based approach that enriches the input text with knowledge extracted from an external KG. The architecture of our model comprises three components: the knowledge retriever (KR), the knowledge injector (KI), and the knowledge aggregator (KA). The KR, built upon the sentence representation learning model CoSENT, retrieves triples with high semantic relevance to the input sentence from an external KG to alleviate the issue of knowledge impurities. The KI then integrates the retrieved triples into the input text by converting the original sentence into a knowledge tree with multiple branches; this knowledge tree is then flattened into a text sequence that can be fed into the KA. Finally, the KA passes the flattened knowledge tree through an embedding layer and a masked Transformer encoder. We conducted extensive evaluations on eight datasets covering five text comprehension tasks, and the experimental results demonstrate that our approach has competitive advantages over popular knowledge-enhanced PLMs such as K-BERT and ERNIE.
AB - Pre-trained language models (PLMs) such as BERT and the GPT family capture rich linguistic and syntactic knowledge from pre-training on large-scale text corpora and can be further fine-tuned for specific downstream tasks. However, these models still have limitations, as they rely on knowledge gained from plain text and ignore structured knowledge such as knowledge graphs (KGs). Recently, there has been a growing trend of explicitly integrating KGs into PLMs to improve their performance. For instance, K-BERT incorporates KG triples into input sentences as domain-specific supplements. Nevertheless, we have observed that such methods do not consider the semantic relevance between the introduced knowledge and the original input sentence, leading to the issue of knowledge impurities. To address this issue, we propose a semantic matching-based approach that enriches the input text with knowledge extracted from an external KG. The architecture of our model comprises three components: the knowledge retriever (KR), the knowledge injector (KI), and the knowledge aggregator (KA). The KR, built upon the sentence representation learning model CoSENT, retrieves triples with high semantic relevance to the input sentence from an external KG to alleviate the issue of knowledge impurities. The KI then integrates the retrieved triples into the input text by converting the original sentence into a knowledge tree with multiple branches; this knowledge tree is then flattened into a text sequence that can be fed into the KA. Finally, the KA passes the flattened knowledge tree through an embedding layer and a masked Transformer encoder. We conducted extensive evaluations on eight datasets covering five text comprehension tasks, and the experimental results demonstrate that our approach has competitive advantages over popular knowledge-enhanced PLMs such as K-BERT and ERNIE.
KW - knowledge graphs
KW - natural language understanding
KW - sentence representation learning
UR - http://www.scopus.com/inward/record.url?scp=85185825645&partnerID=8YFLogxK
U2 - 10.1145/3639631.3639689
DO - 10.1145/3639631.3639689
M3 - Conference Proceeding
AN - SCOPUS:85185825645
T3 - ACM International Conference Proceeding Series
SP - 353
EP - 360
BT - ACAI 2023 - Conference Program
PB - Association for Computing Machinery
T2 - 6th International Conference on Algorithms, Computing and Artificial Intelligence, ACAI 2023
Y2 - 22 December 2023 through 24 December 2023
ER -