Multi-resolution modelling of topic relationships in semantic space

Wei Wang*, Andrzej Bargiela

*Corresponding author for this work

Research output: Chapter in Book or Report/Conference proceeding > Conference Proceeding > peer-review

Abstract

Recent document modelling techniques provide a means of transforming a document representation in high-dimensional word space into one in low-dimensional semantic space. The coarser-resolution representation is often regarded as capturing the intrinsic semantic structure of the original documents. Probabilistic topic models for document modelling attempt to search for richer representations of the structure of linguistic stimuli and, as such, support the process of human cognition. The topics inferred by probabilistic topic models (latent topics) are represented as probability distributions over words. Although they are interpretable, the interpretation is not sufficiently straightforward for human understanding. Also, perhaps more importantly, relationships between the topics are difficult, if not impossible, to interpret. Instead of operating directly on the latent topics, we extract labelled topics from a document collection and represent them using fictitious documents. Having trained the probabilistic topic models, we propose a method for deriving relationships (more general or more specific) between the extracted topics in the semantic space. To ensure reasonable modelling accuracy in a given semantic space, we conducted experiments with various dimensionalities of the semantic space to identify optimal parameter settings in this context. Evaluation and comparison show that our method outperforms existing methods for learning concept or topic relationships on the same dataset.

Original language: English
Title of host publication: Proceedings - 23rd European Conference on Modelling and Simulation, ECMS 2009
Pages: 813-819
Number of pages: 7
Publication status: Published - 2009
Externally published: Yes
Event: 23rd European Conference on Modelling and Simulation, ECMS 2009 - Madrid, Spain
Duration: 9 Jun 2009 – 12 Jun 2009

Publication series

Name: Proceedings - 23rd European Conference on Modelling and Simulation, ECMS 2009

Conference

Conference: 23rd European Conference on Modelling and Simulation, ECMS 2009
Country/Territory: Spain
City: Madrid
Period: 9/06/09 – 12/06/09

Keywords

  • Document modelling
  • Latent semantic allocation
  • Probabilistic topic models
  • Topic hierarchy
