Calculating and displaying key labels: The texts, sections, authors and neighbourhoods where words and collocations are likely to be prominent

Stephen Jeaco

doi:10.3366/COR.2020.0193

Corresponding author: Stephen Jeaco

Department of Applied Linguistics

Research output: Contribution to journal › Article › peer-review

Abstract

Corpora are usually not only made up of words, sentences and plain texts; they usually also have metadata, background information and structural features which can be used to filter searches or provide additional information about the context of specific concordance lines. This paper presents a new approach which uses the information about the texts in which words and collocations occur, generating clouds and tables of what are called Key Labels. The procedure can be likened to looking at key words (Scott, 1997; and Scott and Tribble, 2006) from the opposite starting point: beginning with a word of interest and exploring the features of texts and the parts of text in which it occurs. The paper explains the background to the procedure, how it is carried out, and how these Key Labels are integrated into The Prime Machine corpus tool for English language learning.
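The idea of starting from a word and asking which text labels it is prominent in can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the abstract does not state which statistic the paper uses, so the sketch borrows the two-corpus log-likelihood score often used for key words, and the names `log_likelihood`, `key_labels`, the `labels`/`tokens` dictionary layout and the toy texts are all hypothetical. It is not the procedure implemented in The Prime Machine.

```python
import math
from collections import defaultdict


def log_likelihood(a, b, c, d):
    """Two-corpus log-likelihood: a hits in c tokens versus b hits in d tokens."""
    total = c + d
    expected_a = c * (a + b) / total
    expected_b = d * (a + b) / total
    score = 0.0
    if a > 0:
        score += a * math.log(a / expected_a)
    if b > 0:
        score += b * math.log(b / expected_b)
    return 2 * score


def key_labels(texts, word):
    """Rank metadata labels by how over-represented `word` is in texts carrying them.

    Assumed input layout (hypothetical): each text is a dict with a list of
    metadata "labels" and a list of "tokens".
    """
    hits = defaultdict(int)   # occurrences of `word` per label
    sizes = defaultdict(int)  # token count per label
    total_hits = 0
    total_size = 0
    for text in texts:
        tokens = text["tokens"]
        n = sum(1 for token in tokens if token == word)
        total_hits += n
        total_size += len(tokens)
        for label in text["labels"]:
            hits[label] += n
            sizes[label] += len(tokens)

    ranked = []
    for label, c in sizes.items():
        a = hits[label]
        b = total_hits - a   # occurrences outside this label
        d = total_size - c   # tokens outside this label
        if d == 0 or a + b == 0:
            continue  # nothing to compare against
        if a / c > b / d:  # keep only labels where the word is more frequent
            ranked.append((label, log_likelihood(a, b, c, d)))
    return sorted(ranked, key=lambda item: item[1], reverse=True)


# Toy data, invented purely for illustration.
texts = [
    {"labels": ["section:sport"], "tokens": "the match ended in a goal".split()},
    {"labels": ["section:sport"], "tokens": "a late goal won the match".split()},
    {"labels": ["section:finance"], "tokens": "the bank raised its rates".split()},
    {"labels": ["section:finance"], "tokens": "markets fell as rates rose".split()},
]
ranked = key_labels(texts, "goal")
```

In this toy run only the sport label is returned for "goal", because the word never occurs in the finance texts. A real corpus would need frequency thresholds and a significance cut-off before labels were shown in a cloud or table.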

Original language	English
Pages (from-to)	169-182
Number of pages	14
Journal	Corpora
Volume	15
Issue number	2
DOIs	https://doi.org/10.3366/COR.2020.0193
Publication status	Published - Aug 2020

Keywords

Dispersion
Keyness
Metadata
Semantic associations

Cite this

@article{e296185a02204f1094bc83326095d097,
  title = "Calculating and displaying key labels: The texts, sections, authors and neighbourhoods where words and collocations are likely to be prominent",
  abstract = "Corpora are usually not only made up of words, sentences and plain texts; they usually also have metadata, background information and structural features which can be used to filter searches or provide additional information about the context of specific concordance lines. This paper presents a new approach which uses the information about the texts in which words and collocations occur, generating clouds and tables of what are called Key Labels. The procedure can be likened to looking at key words (Scott, 1997; and Scott and Tribble, 2006) from the opposite starting point: beginning with a word of interest and exploring the features of texts and the parts of text in which it occurs. The paper explains the background to the procedure, how it is carried out, and how these Key Labels are integrated into The Prime Machine corpus tool for English language learning.",
  keywords = "Dispersion, Keyness, Metadata, Semantic associations",
  author = "Stephen Jeaco",
  note = "Publisher Copyright: {\textcopyright} Edinburgh University Press",
  year = "2020",
  month = aug,
  doi = "10.3366/COR.2020.0193",
  language = "English",
  volume = "15",
  pages = "169--182",
  journal = "Corpora",
  issn = "1749-5032",
  number = "2",
}

TY  - JOUR
T1  - Calculating and displaying key labels
T2  - The texts, sections, authors and neighbourhoods where words and collocations are likely to be prominent
AU  - Jeaco, Stephen
N1  - Publisher Copyright: © Edinburgh University Press
PY  - 2020/8
Y1  - 2020/8
AB  - Corpora are usually not only made up of words, sentences and plain texts; they usually also have metadata, background information and structural features which can be used to filter searches or provide additional information about the context of specific concordance lines. This paper presents a new approach which uses the information about the texts in which words and collocations occur, generating clouds and tables of what are called Key Labels. The procedure can be likened to looking at key words (Scott, 1997; and Scott and Tribble, 2006) from the opposite starting point: beginning with a word of interest and exploring the features of texts and the parts of text in which it occurs. The paper explains the background to the procedure, how it is carried out, and how these Key Labels are integrated into The Prime Machine corpus tool for English language learning.
KW  - Dispersion
KW  - Keyness
KW  - Metadata
KW  - Semantic associations
UR  - http://www.scopus.com/inward/record.url?scp=85090890525&partnerID=8YFLogxK
U2  - 10.3366/COR.2020.0193
DO  - 10.3366/COR.2020.0193
M3  - Article
AN  - SCOPUS:85090890525
SN  - 1749-5032
VL  - 15
SP  - 169
EP  - 182
JO  - Corpora
JF  - Corpora
IS  - 2
ER  -
