TY - GEN
T1 - Multi-resolution modelling of topic relationships in semantic space
AU - Wang, Wei
AU - Bargiela, Andrzej
PY - 2009
Y1 - 2009
AB - Recent techniques for document modelling provide means for transforming document representations from a high-dimensional word space to a low-dimensional semantic space. The coarse-resolution representation is often regarded as capturing the intrinsic semantic structure of the original documents. Probabilistic topic models for document modelling attempt to search for richer representations of the structure of linguistic stimuli and, as such, support the process of human cognition. The topics inferred by probabilistic topic models (latent topics) are represented as probability distributions over words. Although they are interpretable, the interpretation is not sufficiently straightforward for human understanding. Also, and perhaps more importantly, the relationships between the topics are difficult, if not impossible, to interpret. Instead of operating directly on the latent topics, we extract labelled topics from a document collection and represent them using fictitious documents. Having trained the probabilistic topic models, we propose a method for deriving relationships (more general or more specific) between the extracted topics in the semantic space. To ensure reasonable modelling accuracy in a given semantic space, we conducted experiments with various dimensionalities of the semantic space to identify optimal parameter settings in this context. Evaluation and comparison show that our method outperforms existing methods for learning concept or topic relationships on the same dataset.
KW - Document modelling
KW - Latent semantic allocation
KW - Probabilistic topic models
KW - Topic hierarchy
UR - http://www.scopus.com/inward/record.url?scp=84863242234&partnerID=8YFLogxK
M3 - Conference Proceeding
AN - SCOPUS:84863242234
SN - 0955301882
SN - 9780955301889
T3 - Proceedings - 23rd European Conference on Modelling and Simulation, ECMS 2009
SP - 813
EP - 819
BT - Proceedings - 23rd European Conference on Modelling and Simulation, ECMS 2009
T2 - 23rd European Conference on Modelling and Simulation, ECMS 2009
Y2 - 9 June 2009 through 12 June 2009
ER -