Knowledge base enrichment by relation learning from social tagging data

Hang Dong; Wei Wang; Frans Coenen; Kaizhu Huang

doi:10.1016/j.ins.2020.04.002

Hang Dong, Wei Wang^*, Frans Coenen, Kaizhu Huang

^*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

6 Citations (Scopus)

Abstract

There has been considerable interest in transforming unstructured social tagging data into structured knowledge for semantic-based retrieval and recommendation. Research in this line mostly exploits data co-occurrence and often overlooks the complex and ambiguous meanings of tags. Furthermore, there have been few comprehensive evaluation studies regarding the quality of the discovered knowledge. We propose a supervised learning method to discover subsumption relations from tags. The key to this method is quantifying the probabilistic association among tags to better characterise their relations. We further develop an algorithm to organise tags into hierarchies based on the learned relations. Experiments were conducted using a large, publicly available dataset, Bibsonomy, and three popular, human-engineered or data-driven knowledge bases: DBpedia, Microsoft Concept Graph, and ACM Computing Classification System. We performed a comprehensive evaluation using different strategies: relation-level, ontology-level, and knowledge base enrichment based evaluation. The results clearly show that the proposed method can extract knowledge of better quality than the existing methods against the gold standard knowledge bases. The proposed approach can also enrich knowledge bases with new subsumption relations, having the potential to significantly reduce time and human effort for knowledge base maintenance and ontology evolution.
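
As a rough illustration of the co-occurrence-based line of work that the abstract contrasts itself with, the sketch below derives candidate subsumption relations from tag co-occurrence alone. It is not the supervised method proposed in the article: the function name `cooccurrence_subsumptions`, the 0.8 threshold, and the toy tag sets are all assumptions made for this example.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_subsumptions(tagged_resources, threshold=0.8):
    """Return (broader, narrower) tag pairs found by a simple co-occurrence rule.

    A tag `a` is taken to subsume `b` when P(a | b) >= threshold and
    P(b | a) < P(a | b). The threshold is an illustrative assumption.
    """
    tag_count = Counter()   # number of resources carrying each tag
    pair_count = Counter()  # number of resources carrying both tags of a pair
    for tags in tagged_resources:
        unique = sorted(set(tags))
        tag_count.update(unique)
        pair_count.update(combinations(unique, 2))

    relations = []
    for (x, y), both in pair_count.items():
        p_x_given_y = both / tag_count[y]
        p_y_given_x = both / tag_count[x]
        if p_x_given_y >= threshold and p_y_given_x < p_x_given_y:
            relations.append((x, y))  # x is the broader tag
        elif p_y_given_x >= threshold and p_x_given_y < p_y_given_x:
            relations.append((y, x))  # y is the broader tag
    return sorted(relations)


# Hypothetical toy data: each inner list is the tag set of one bookmarked resource.
resources = [
    ["machine_learning", "svm"],
    ["machine_learning", "svm", "kernel"],
    ["machine_learning", "clustering"],
    ["machine_learning", "neural_network"],
    ["database", "sql"],
    ["database"],
]
relations = cooccurrence_subsumptions(resources)
```

On this toy input the rule treats `machine_learning` as broader than `svm`, because every resource tagged `svm` is also tagged `machine_learning` but not the reverse. A rule of this kind looks only at counts, so it cannot tell apart the different senses of an ambiguous tag, which is the limitation the abstract points out.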

Original language	English
Pages (from-to)	203-220
Number of pages	18
Journal	Information Sciences
Volume	526
DOIs	https://doi.org/10.1016/j.ins.2020.04.002
Publication status	Published - Jul 2020

Keywords

Classification
Knowledge base enrichment
Knowledge discovery
Ontology learning
Probabilistic association analysis
Social tagging

Access to Document

10.1016/j.ins.2020.04.002

Cite this

@article{2743cb805c6545a5829694943e4b828e,
  title = "Knowledge base enrichment by relation learning from social tagging data",
  abstract = "There has been considerable interest in transforming unstructured social tagging data into structured knowledge for semantic-based retrieval and recommendation. Research in this line mostly exploits data co-occurrence and often overlooks the complex and ambiguous meanings of tags. Furthermore, there have been few comprehensive evaluation studies regarding the quality of the discovered knowledge. We propose a supervised learning method to discover subsumption relations from tags. The key to this method is quantifying the probabilistic association among tags to better characterise their relations. We further develop an algorithm to organise tags into hierarchies based on the learned relations. Experiments were conducted using a large, publicly available dataset, Bibsonomy, and three popular, human-engineered or data-driven knowledge bases: DBpedia, Microsoft Concept Graph, and ACM Computing Classification System. We performed a comprehensive evaluation using different strategies: relation-level, ontology-level, and knowledge base enrichment based evaluation. The results clearly show that the proposed method can extract knowledge of better quality than the existing methods against the gold standard knowledge bases. The proposed approach can also enrich knowledge bases with new subsumption relations, having the potential to significantly reduce time and human effort for knowledge base maintenance and ontology evolution.",
  keywords = "Classification, Knowledge base enrichment, Knowledge discovery, Ontology learning, Probabilistic association analysis, Social tagging",
  author = "Hang Dong and Wei Wang and Frans Coenen and Kaizhu Huang",
  note = "Publisher Copyright: {\textcopyright} 2020",
  year = "2020",
  month = jul,
  doi = "10.1016/j.ins.2020.04.002",
  language = "English",
  volume = "526",
  pages = "203--220",
  journal = "Information Sciences",
  issn = "0020-0255",
  publisher = "Elsevier",
}

TY - JOUR
T1 - Knowledge base enrichment by relation learning from social tagging data
AU - Dong, Hang
AU - Wang, Wei
AU - Coenen, Frans
AU - Huang, Kaizhu
PY - 2020/7
Y1 - 2020/7

AB - There has been considerable interest in transforming unstructured social tagging data into structured knowledge for semantic-based retrieval and recommendation. Research in this line mostly exploits data co-occurrence and often overlooks the complex and ambiguous meanings of tags. Furthermore, there have been few comprehensive evaluation studies regarding the quality of the discovered knowledge. We propose a supervised learning method to discover subsumption relations from tags. The key to this method is quantifying the probabilistic association among tags to better characterise their relations. We further develop an algorithm to organise tags into hierarchies based on the learned relations. Experiments were conducted using a large, publicly available dataset, Bibsonomy, and three popular, human-engineered or data-driven knowledge bases: DBpedia, Microsoft Concept Graph, and ACM Computing Classification System. We performed a comprehensive evaluation using different strategies: relation-level, ontology-level, and knowledge base enrichment based evaluation. The results clearly show that the proposed method can extract knowledge of better quality than the existing methods against the gold standard knowledge bases. The proposed approach can also enrich knowledge bases with new subsumption relations, having the potential to significantly reduce time and human effort for knowledge base maintenance and ontology evolution.

KW - Classification
KW - Knowledge base enrichment
KW - Knowledge discovery
KW - Ontology learning
KW - Probabilistic association analysis
KW - Social tagging
UR - http://www.scopus.com/inward/record.url?scp=85082848128&partnerID=8YFLogxK
U2 - 10.1016/j.ins.2020.04.002
DO - 10.1016/j.ins.2020.04.002
M3 - Article
AN - SCOPUS:85082848128
SN - 0020-0255
VL - 526
SP - 203
EP - 220
JO - Information Sciences
JF - Information Sciences
ER -
