Key words when text forms the unit of study: Sizing up the effects of different measures

Stephen Jeaco*

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

7 Citations (Scopus)


Throughout the social sciences, there has been growing pressure to present effect sizes when publishing empirical data (see American Psychological Association, 2001; Parsons & Nelson, 2004). While it seems indisputable that for the majority of quantitative research foci, effect size is an essential element of statistical analysis, this paper argues that specifically for key word analysis in corpus linguistics, the means of reporting effect size must depend on the level of the unit of study of each investigation (single text, collection or large corpus). After exploring some main criticisms of the log-likelihood measure, this paper unpacks the parameters of different measures for keyness and how they might address underlying concerns. It maintains that for the exploration of foregrounded/deviant/salient/marked features in text, the use of log-likelihood scores to rank the results is still fit for purpose and, coupled with Bayes Factors, is a solid approach for key word analyses.
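For readers unfamiliar with the measures named in the abstract, the following is a minimal illustrative sketch, not the paper's own implementation. It assumes the common two-cell form of the log-likelihood keyness score, a BIC-style approximation to the Bayes Factor (log-likelihood minus the natural log of the combined corpus size, with one degree of freedom), and the binary log of the ratio of relative frequencies as an example effect size. The function names and sample counts are hypothetical.

```python
import math


def log_likelihood(a, b, c, d):
    """Two-cell log-likelihood keyness score.

    a: frequency of the word in the study text or corpus
    b: frequency of the word in the reference corpus
    c: total tokens in the study text or corpus
    d: total tokens in the reference corpus
    """
    # Expected frequencies if the word were equally common in both.
    e1 = c * (a + b) / (c + d)
    e2 = d * (a + b) / (c + d)
    total = 0.0
    for observed, expected in ((a, e1), (b, e2)):
        if observed > 0:  # a zero count contributes nothing to the sum
            total += observed * math.log(observed / expected)
    return 2 * total


def bic_approx(ll, total_tokens, df=1):
    """BIC-style approximation to a Bayes Factor (assumed form)."""
    return ll - df * math.log(total_tokens)


def log_ratio(a, b, c, d):
    """Binary log of the ratio of relative frequencies (an effect size).

    Assumes b > 0; zero reference frequencies need smoothing, which
    this sketch does not attempt.
    """
    return math.log2((a / c) / (b / d))


if __name__ == "__main__":
    # Hypothetical counts: 20 hits in 1,000 tokens vs 10 hits in 1,000.
    a, b, c, d = 20, 10, 1000, 1000
    ll = log_likelihood(a, b, c, d)
    print(ll, bic_approx(ll, c + d), log_ratio(a, b, c, d))
```

The sketch keeps the ranking statistic (log-likelihood), the evidence measure (BIC approximation) and the effect size (log ratio) as separate functions, mirroring the distinction the abstract draws between ranking results and reporting effect size.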

Original language: English
Pages (from-to): 125-154
Number of pages: 30
Journal: International Journal of Corpus Linguistics
Issue number: 2
Publication status: Published - 28 Aug 2020


  • Effect size
  • Key word analysis
  • Keyness
  • Log-likelihood
  • Ranking
