Minoan Linguistic Resources: The Linear A Digital Corpus

Tommaso PETROLITO, Ruggero PETROLITO, Francesco PERONO CACCIAFOCO^*, Grégoire WINTERSTEIN

^*Corresponding author for this work

Department of Applied Linguistics

Research output: Chapter in Book or Report/Conference proceeding › Conference Proceeding › peer-review

3 Citations (Scopus)

Abstract

This paper describes the Linear A/Minoan digital corpus and the approaches we applied to develop it. We aim to set up a suitable study resource for Linear A and Minoan. First, we introduce Linear A and Minoan to make clear why a digital, marked-up corpus of the existing Linear A transcriptions is needed. Second, we list and describe some of the existing resources on Linear A: Linear A documents (seals, statuettes, vessels, etc.), the traditional encoding systems (standard code numbers referring to distinct symbols), a Linear A font, and the newest Unicode Standard character set for Linear A (released on June 16th, 2014). Third, we explain our choices concerning the data format: why we decided to digitize the Linear A resources; why we decided to convert all the transcriptions into standard Unicode characters; why we decided to use an XML format; and why we decided to implement the TEI-EpiDoc DTD. We then describe the development process (from data collection to the issues we faced and the strategies used to solve them) and a new font we developed (synchronized with the Unicode character set) to make the data readable even on systems that have not been updated. Finally, we discuss the corpus from a Cultural Heritage preservation perspective and suggest some directions for future work. © 2015 Association for Computational Linguistics and The Asian Federation of Natural Language Processing.
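To make the data-format choices in the abstract concrete, here is a minimal Python sketch of the general idea: mapping traditional sign code numbers to Unicode Linear A characters and wrapping the result in TEI-namespaced XML. It is an illustration under stated assumptions, not the authors' pipeline. The function names (`sign_to_char`, `encode_word`, `build_fragment`), the arbitrary sign sequence, and the use of bare `<ab>` and `<w>` elements are all hypothetical; the corpus itself may structure its EpiDoc markup differently.

```python
import unicodedata
import xml.etree.ElementTree as ET

# TEI namespace, which EpiDoc documents use.
TEI_NS = "http://www.tei-c.org/ns/1.0"


def sign_to_char(code: str) -> str:
    """Map a traditional sign code such as 'AB001' to its Unicode character.

    Relies on the Unicode character names of the Linear A block
    (e.g. 'LINEAR A SIGN AB001'); raises KeyError for unknown codes.
    """
    return unicodedata.lookup(f"LINEAR A SIGN {code}")


def encode_word(codes):
    """Join the characters for a sequence of sign codes into one string."""
    return "".join(sign_to_char(code) for code in codes)


def build_fragment(words):
    """Wrap each encoded word in a <w> element inside a single <ab> block.

    Hypothetical minimal structure, chosen only for illustration.
    """
    ET.register_namespace("", TEI_NS)
    ab = ET.Element(f"{{{TEI_NS}}}ab")
    for codes in words:
        w = ET.SubElement(ab, f"{{{TEI_NS}}}w")
        w.text = encode_word(codes)
    return ET.tostring(ab, encoding="unicode")


# Arbitrary two-sign sequence, used only to exercise the functions.
fragment = build_fragment([["AB001", "AB002"]])
print(fragment)
```

Looking characters up by name rather than hard-coding code points keeps the mapping tied to the Unicode Standard itself, so an unknown sign code fails loudly instead of silently producing the wrong glyph.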

Original language	English
Title of host publication	LaTeCH 2015 - Proceedings of the 9th SIGHUM Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities
Editors	Kalliopi A. Zervanou, Marieke van Erp, Beatrice Alex
Publisher	Association for Computational Linguistics (ACL)
Pages	95-104
Number of pages	10
ISBN (Electronic)	9781941643631
Publication status	Published - Jul 2015
Event	9th SIGHUM Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities, LaTeCH 2015 - Beijing, China Duration: 30 Jul 2015 → …

Publication series

Name	Proceedings of the Annual Meeting of the Association for Computational Linguistics
Volume	2015-text
ISSN (Print)	0736-587X

Conference

Conference	9th SIGHUM Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities, LaTeCH 2015
Country/Territory	China
City	Beijing
Period	30/07/15 → …

Keywords

Linear A
Language Deciphering
Corpus Linguistics
Digital Humanities
History of Writing

Cite this

PETROLITO, T., PETROLITO, R., PERONO CACCIAFOCO, F., & WINTERSTEIN, G. (2015). Minoan Linguistic Resources: The Linear A Digital Corpus. In K. A. Zervanou, M. van Erp, & B. Alex (Eds.), LaTeCH 2015 - Proceedings of the 9th SIGHUM Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities (pp. 95-104). (Proceedings of the Annual Meeting of the Association for Computational Linguistics; Vol. 2015-text). Association for Computational Linguistics (ACL).

@inproceedings{9ae40295245b43e8825555a25dd8e76b,

title = "Minoan Linguistic Resources: The Linear A Digital Corpus",

keywords = "Linear A, Language Deciphering, Corpus Linguistics, Digital Humanities, History of Writing",

author = "Tommaso PETROLITO and Ruggero PETROLITO and {PERONO CACCIAFOCO}, Francesco and Gr{\'e}goire WINTERSTEIN",

note = "PETROLITO, Tommaso, and Ruggero PETROLITO, Francesco PERONO CACCIAFOCO, Gr{\'e}goire WINTERSTEIN. (2015). Minoan Linguistic Resources: The Linear A Digital Corpus. Proceedings of the 9th SIGHUM Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities (LaTech) / ACL (Association for Computational Linguistics)-IJCNLP, July 26-31, 2015, Beijing, PRC (China National Convention Center - CNCC): 95-104 Publisher Copyright: {\textcopyright} 2015 Proceedings of the Annual Meeting of the Association for Computational Linguistics. ; 9th SIGHUM Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities, LaTeCH 2015 ; Conference date: 30-07-2015",

year = "2015",

month = jul,

language = "English",

series = "Proceedings of the Annual Meeting of the Association for Computational Linguistics",

publisher = "Association for Computational Linguistics (ACL)",

pages = "95--104",

editor = "Zervanou, {Kalliopi A.} and {van Erp}, Marieke and Beatrice Alex",

booktitle = "LaTeCH 2015 - Proceedings of the 9th SIGHUM Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities",

}

TY - GEN

T1 - Minoan Linguistic Resources: The Linear A Digital Corpus

T2 - 9th SIGHUM Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities, LaTeCH 2015

AU - PETROLITO, Tommaso

AU - PETROLITO, Ruggero

AU - PERONO CACCIAFOCO, Francesco

AU - WINTERSTEIN, Grégoire

N1 - PETROLITO, Tommaso, and Ruggero PETROLITO, Francesco PERONO CACCIAFOCO, Grégoire WINTERSTEIN. (2015). Minoan Linguistic Resources: The Linear A Digital Corpus. Proceedings of the 9th SIGHUM Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities (LaTech) / ACL (Association for Computational Linguistics)-IJCNLP, July 26-31, 2015, Beijing, PRC (China National Convention Center - CNCC): 95-104 Publisher Copyright: © 2015 Proceedings of the Annual Meeting of the Association for Computational Linguistics.

PY - 2015/7

Y1 - 2015/7

KW - Linear A

KW - Language Deciphering

KW - Corpus Linguistics

KW - Digital Humanities

KW - History of Writing

UR - http://www.scopus.com/inward/record.url?scp=85122496155&partnerID=8YFLogxK

UR - https://aclanthology.org/W15-3715/

M3 - Conference Proceeding

AN - SCOPUS:85122496155

T3 - Proceedings of the Annual Meeting of the Association for Computational Linguistics

SP - 95

EP - 104

BT - LaTeCH 2015 - Proceedings of the 9th SIGHUM Workshop on Language Technology for Cultural Heritage, Social Sciences, and Humanities

A2 - Zervanou, Kalliopi A.

A2 - van Erp, Marieke

A2 - Alex, Beatrice

PB - Association for Computational Linguistics (ACL)

Y2 - 30 July 2015

ER -

Projects

Giving Voice to the Minoan People: The Decipherment of Linear A

Research output

Minoan and the Machines: Computational Approaches to the Decipherment of Linear A

The Lonely Life of a Glyph-breaker

Activities

Clusters and Syllabo-logographical Classification Models in Linear A/B Tablets: Lexical and Visual Analogies for a More Comprehensive Study of Their Properties and Advanced Bronze Age Trading-diplomatical Correspondences

Giving Back to the Minoan People Their Own Voice: How to Be Awarded with a Research Grant on Language Deciphering and Successfully Manage It

Il mistero delle lingue antiche che nessuno riesce (ancora) a decifrare [The Mystery of Ancient Languages that Nobody Is Able to Decipher (Yet)].
