Calculating and displaying key labels: The texts, sections, authors and neighbourhoods where words and collocations are likely to be prominent

Stephen Jeaco*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Corpora are usually not only made up of words, sentences and plain texts; they usually also have metadata, background information and structural features which can be used to filter searches or provide additional information about the context of specific concordance lines. This paper presents a new approach which uses the information about the texts in which words and collocations occur, generating clouds and tables of what are called Key Labels. The procedure can be likened to looking at key words (Scott, 1997; and Scott and Tribble, 2006) from the opposite starting point: beginning with a word of interest and exploring the features of texts and the parts of text in which it occurs. The paper explains the background to the procedure, how it is carried out, and how these Key Labels are integrated into The Prime Machine corpus tool for English language learning.
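The idea of starting from a word and asking which text labels it favours can be illustrated with a minimal sketch. The abstract does not state which statistic the paper or The Prime Machine uses, so everything here is an assumption made for illustration: the log-likelihood keyness measure, the `log_likelihood` and `key_labels` function names, the `labels`/`tokens` data layout, and the toy corpus are all hypothetical.

```python
import math


def log_likelihood(a, b, c, d):
    """Two-term log-likelihood score for a frequency comparison.

    a: hits of the word in texts carrying the label
    b: hits of the word in all other texts
    c: total tokens in texts carrying the label
    d: total tokens in all other texts
    (Assumed statistic; the source does not name the measure used.)
    """
    expected_a = c * (a + b) / (c + d)
    expected_b = d * (a + b) / (c + d)
    score = 0.0
    if a > 0:
        score += a * math.log(a / expected_a)
    if b > 0:
        score += b * math.log(b / expected_b)
    return 2 * score


def key_labels(texts, word):
    """Rank metadata labels in which `word` is over-represented.

    `texts` is a list of dicts with a set of labels and a token list
    (a hypothetical layout, not the tool's actual data model).
    """
    total_tokens = sum(len(t["tokens"]) for t in texts)
    total_hits = sum(t["tokens"].count(word) for t in texts)
    all_labels = set().union(*(t["labels"] for t in texts))

    results = []
    for label in all_labels:
        inside = [t for t in texts if label in t["labels"]]
        a = sum(t["tokens"].count(word) for t in inside)
        c = sum(len(t["tokens"]) for t in inside)
        b = total_hits - a
        d = total_tokens - c
        # Keep only labels where the word's relative frequency is higher
        # inside the label than outside (cross-multiplied to avoid
        # dividing by zero when a label covers every text).
        if a * d > b * c:
            results.append((label, log_likelihood(a, b, c, d)))
    return sorted(results, key=lambda pair: pair[1], reverse=True)


# Toy corpus, invented purely for illustration.
texts = [
    {"labels": {"genre:news"},
     "tokens": "the market fell as the market reacted".split()},
    {"labels": {"genre:fiction"},
     "tokens": "she walked to the old house slowly".split()},
    {"labels": {"genre:news"},
     "tokens": "the market rose again today".split()},
]

ranked = key_labels(texts, "market")
```

In this toy corpus only the `genre:news` label is returned for "market", because the word never occurs in the fiction text. A real implementation would also need frequency thresholds and a significance cut-off, which this sketch leaves out.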

Original language: English
Pages (from-to): 169-182
Number of pages: 14
Journal: Corpora
Volume: 15
Issue number: 2
Publication status: Published - Aug 2020

Keywords

  • Dispersion
  • Keyness
  • Metadata
  • Semantic associations
