
Publication

Creating a richly annotated corpus of papyrological Greek: the possibilities of Natural Language Processing approaches to a highly inflected historical language

Journal contribution - e-publication

This article describes a first attempt to annotate the full Greek papyrus corpus automatically for linguistic information. It gives an overview of existing work on Ancient Greek, analyzes the typical problems one encounters when applying natural language processing techniques to (1) a historical corpus of (2) a highly inflectional language (as opposed to the more analytic present-day English), and offers solutions to them, testing several different approaches. The focus is on part-of-speech/morphological tagging and lemmatization; some syntactic parsing experiments are also briefly discussed. The conclusion discusses the strengths and shortcomings of the examined techniques and suggests possible ways to further improve tagging and parsing accuracy.
Journal: Digital Scholarship in the Humanities
ISSN: 2055-7671
Issue: 1
Volume: 35
Pages: 1 - 16
Year of publication: 2020
BOF-keylabel: yes
IOF-keylabel: yes
BOF-publication weight: 1
CSS-citation score: 1
Authors from: Higher Education
Accessibility: Closed