TY - JOUR
T1 - A combined approach for automatic identification of multi-word expressions for Latvian and Lithuanian
AU - Mandravičkaite, Justina
AU - Krilavicius, Tomas
AU - Man, Ka Lok
N1 - Funding Information:
Manuscript received October 10, 2017. This research was partly funded by a grant (No. LIP-027/2016) from the Research Council of Lithuania.
PY - 2017/11/1
Y1 - 2017/11/1
N2 - We discuss an experiment on the automatic identification of bi-gram multi-word expressions (MWEs) in parallel Latvian and Lithuanian corpora. Because lexical resources and tools (e.g., POS taggers, parsers) for these languages are scarce and of limited quality, we rely on raw corpora, lexical association measures (LAMs) and supervised machine learning (ML). Combining LAMs with ML works well for other languages, and our experiments show that it also performs well for Lithuanian and Latvian. We analyse and discuss frequency thresholds in terms of potential MWEs and LAM values. Finally, by combining LAMs with ML we achieved 98.8% precision and 57.5% recall for Latvian and 96.9% precision and 61.8% recall for Lithuanian.
AB - We discuss an experiment on the automatic identification of bi-gram multi-word expressions (MWEs) in parallel Latvian and Lithuanian corpora. Because lexical resources and tools (e.g., POS taggers, parsers) for these languages are scarce and of limited quality, we rely on raw corpora, lexical association measures (LAMs) and supervised machine learning (ML). Combining LAMs with ML works well for other languages, and our experiments show that it also performs well for Lithuanian and Latvian. We analyse and discuss frequency thresholds in terms of potential MWEs and LAM values. Finally, by combining LAMs with ML we achieved 98.8% precision and 57.5% recall for Latvian and 96.9% precision and 61.8% recall for Lithuanian.
KW - Lexical association measures
KW - Machine learning
KW - Multi-word expression
KW - Hybrid approach
UR - http://www.scopus.com/inward/record.url?scp=85034451966&partnerID=8YFLogxK
M3 - Article
AN - SCOPUS:85034451966
SN - 1819-656X
VL - 44
SP - 598
EP - 606
JO - IAENG International Journal of Computer Science
JF - IAENG International Journal of Computer Science
IS - 4
ER -