Probabilistic topic models for learning terminological ontologies

Wei Wang*, Payam Mamaani Barnaghi, Andrzej Bargiela

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

84 Citations (Scopus)

Abstract

Probabilistic topic models were originally developed and utilized for document modeling and topic extraction in Information Retrieval. In this paper, we describe a new approach for automatic learning of terminological ontologies from a text corpus based on such models. In our approach, topic models are used as efficient dimension reduction techniques, which are able to capture semantic relationships between words and topics and between topics and documents, interpreted in terms of probability distributions. We propose two algorithms for learning terminological ontologies using the principle of topic relationship and exploiting information theory with the learned probabilistic topic models. Experiments with different model parameters were conducted, and the learned ontology statements were evaluated by domain experts. We have also compared the results of our method with two existing concept hierarchy learning methods on the same data set. The study shows that our method outperforms the other methods in terms of recall and precision measures. The precision level of the learned ontology is sufficient for it to be deployed for the purpose of browsing, navigation, and information search and retrieval in digital libraries.
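
As a rough illustration of the idea that terms can be represented as probability distributions over topics and compared with an information-theoretic measure, the following minimal sketch uses the Jensen-Shannon divergence. It is not the paper's algorithm: the choice of Jensen-Shannon divergence, the function names, and the `term_topics` values are all assumptions made up for this example.

```python
import math

def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) in bits; entries with p_i = 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_divergence(p, q):
    """Jensen-Shannon divergence: symmetric, and bounded in [0, 1] with log base 2."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

# Hypothetical P(topic | term) over three topics; the numbers are invented for illustration.
term_topics = {
    "ontology": [0.70, 0.20, 0.10],
    "ontology learning": [0.80, 0.15, 0.05],
    "information retrieval": [0.10, 0.15, 0.75],
}

def closest_term(term, table):
    """Return the other term whose topic distribution has the smallest JS divergence."""
    others = (t for t in table if t != term)
    return min(others, key=lambda t: js_divergence(table[term], table[t]))

print(closest_term("ontology learning", term_topics))
```

In this toy table, terms that load on the same topics end up close to each other, which is the kind of signal a learning procedure could use when proposing candidate relations between terms.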

Original language: English
Article number: 4912206
Pages (from-to): 1028-1040
Number of pages: 13
Journal: IEEE Transactions on Knowledge and Data Engineering
Volume: 22
Issue number: 7
Publication status: Published - 2010
Externally published: Yes

Keywords

  • Knowledge acquisition
  • Ontology
  • Ontology learning
  • Probabilistic topic models
