Abstract
Hikayat Lonthoir, a rare collection of saga manuscripts originating from the Banda Archipelago, Maluku, Indonesia, preserves significant Indigenous oral history amid the Western colonial narrative. This study applies computational methods to the historical manuscript, combining OCR-supervised transcription, corpus-linguistic profiling, semantic clustering (Word2Vec + K-Means), and named-entity network analysis. The dataset is validated by checking 2793 cleaned word tokens against Indonesian and Malay dictionaries: 50.3% of the tokens appear in both dictionaries, with strong cross-dictionary agreement (κ = 0.76). The lexical analysis reveals a prominence of monarchy/governance, kinship, and maritime vocabulary, together with extensive morphological productivity (me-, di-, ter-, pe-/per-, -nya, -an), while the semantic and network analyses identify two narrative cores, which are then mapped onto the Aarne–Thompson–Uther (ATU) and Stith Thompson’s Motif Index of Folk Literature classification systems. These findings demonstrate how computational methods can extract structural, thematic, and relational patterns from historical manuscripts and contribute evidence-based insights to digital philology and historical linguistics.
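The dictionary validation step reported in the abstract (share of tokens found in both dictionaries, plus Cohen's κ between the two lookups) can be illustrated with a minimal sketch. The token list and the two word sets below are hypothetical toy data, not the study's corpus or its dictionaries, and the function names `cohen_kappa` and `dictionary_agreement` are illustrative assumptions rather than the authors' implementation.

```python
def cohen_kappa(a, b):
    """Cohen's kappa for two equal-length lists of binary (True/False) labels."""
    if len(a) != len(b) or not a:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(a)
    # Observed agreement: fraction of items on which both raters agree.
    observed = sum(x == y for x, y in zip(a, b)) / n
    # Chance agreement from each rater's marginal rate of True labels.
    p_a = sum(a) / n
    p_b = sum(b) / n
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)
    if expected == 1.0:
        # Both raters are constant and identical; kappa is conventionally 1.
        return 1.0
    return (observed - expected) / (1 - expected)


def dictionary_agreement(tokens, dict_a, dict_b):
    """Return (share of tokens found in both dictionaries, Cohen's kappa)."""
    in_a = [t in dict_a for t in tokens]
    in_b = [t in dict_b for t in tokens]
    both = sum(x and y for x, y in zip(in_a, in_b)) / len(tokens)
    return both, cohen_kappa(in_a, in_b)


# Hypothetical toy data (assumption): six tokens and two tiny word sets
# standing in for the Indonesian and Malay dictionaries.
tokens = ["raja", "negeri", "perahu", "lonthoir", "anaknya", "orang"]
indonesian = {"raja", "negeri", "perahu", "orang"}
malay = {"raja", "negeri", "orang", "anaknya"}

overlap, kappa = dictionary_agreement(tokens, indonesian, malay)
print(f"overlap with both dictionaries: {overlap:.1%}")
print(f"Cohen's kappa: {kappa:.2f}")
```

Here each dictionary is treated as a rater that labels every token as found or not found, so κ measures how far the two lookups agree beyond what their hit rates alone would produce by chance.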
| Original language | English |
|---|---|
| Article number | 1069 |
| Pages (from-to) | 1-26 |
| Number of pages | 26 |
| Journal | Information (Switzerland) |
| Volume | 16 |
| Issue number | 12 |
| DOIs | |
| Publication status | Published - 4 Dec 2025 |
UN SDGs
This output contributes to the following UN Sustainable Development Goals (SDGs)
- SDG 4 Quality Education
- SDG 15 Life on Land
Keywords
- Banda Archipelago
- Digital Philology
- Linguistic Documentation
- NLP
- Oral History
- Semantic Analysis
Projects
- 2 Finished
- Mapping the Languages of Indigenous Clans in the Eastern Spice Islands
  1/04/24 → 30/11/25
  Project: Governmental Research Project
- Place Names and Cultural Identity: Toponyms and Their Diachronic Evolution among the Kula People from Alor Island
  1/01/24 → 31/12/25
  Project: Internal Research Project
Research output
- 1 Article
- Who Was 'She' in Ancient Tamil Literature?
  Gracia Lourdes, J. & Perono Cacciafoco, F., 11 Dec 2023, In: Analele Universitatii din Craiova - Seria Stiinte Filologice, Lingvistica. 45, 1-2, p. 68-111, 44 p.
  Research output: Contribution to journal › Article › peer-review
  Open Access
- Computers (Journal)
  PERONO CACCIAFOCO, F. (Reviewer)
  29 Jan 2026 → …
  Activity: Peer-review and editorial work of publications › Publication Peer-review
- Mapping the Languages of Indigenous Clans in the Eastern Spice Islands (Research Grant)
  PERONO CACCIAFOCO, F. (Participant)
  1 Apr 2024 → 30 Nov 2025
  Activity: Other
- Place Names and Cultural Identity: Toponyms and Their Diachronic Evolution among the Kula People from Alor Island (Research Grant)
  PERONO CACCIAFOCO, F. (Participant)
  1 Jan 2024 → 31 Dec 2025
  Activity: Other