Publication

Transfer learning for digital heritage collections: comparing neural machine translation at the subword-level and character-level

Book contribution - Chapter

Transfer learning via pre-training has become an important strategy for the efficient application of NLP methods in domains where only limited training data is available. This paper reports on a focused case study in which we apply transfer learning in the context of neural machine translation (French-Dutch) for cultural heritage metadata (i.e. titles of artistic works). Nowadays, neural machine translation (NMT) is commonly applied at the subword level using byte-pair encoding (BPE), because word-level models struggle with rare and out-of-vocabulary words. Because unseen vocabulary is a significant issue in domain adaptation, BPE seems a better fit for transfer learning across text varieties. We discuss an experiment in which we compare a subword-level with a character-level NMT approach. We pre-trained models on a large, generic corpus and fine-tuned them in a two-stage process: first on a domain-specific dataset extracted from Wikipedia, and then on our metadata. While our experiments show comparable performance for character-level and BPE-based models on the general dataset, we demonstrate that the character-level approach nevertheless yields major downstream performance gains during the subsequent stages of fine-tuning. We therefore conclude that character-level translation can be beneficial compared to the popular subword-level approach in the cultural heritage domain.
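To make the contrast in the abstract concrete, the following is a minimal sketch of the two segmentation schemes, not the authors' pipeline. It assumes a toy BPE learner written from scratch; the function names (`char_tokens`, `learn_bpe`, `apply_bpe`, `merge_pair`) and the sample words are hypothetical and chosen only for illustration.

```python
from collections import Counter


def char_tokens(text):
    """Character-level segmentation: every character is its own token."""
    return list(text)


def merge_pair(symbols, pair):
    """Replace each adjacent occurrence of `pair` in `symbols` by one merged symbol."""
    out = []
    i = 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            out.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            out.append(symbols[i])
            i += 1
    return tuple(out)


def learn_bpe(words, num_merges):
    """Learn up to `num_merges` merge rules from a list of words (toy version)."""
    # Start from single characters; identical words are counted together.
    vocab = Counter(tuple(word) for word in words)
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for symbols, freq in vocab.items():
            for a, b in zip(symbols, symbols[1:]):
                pairs[(a, b)] += freq
        if not pairs:
            break
        # Most frequent pair wins; ties are broken deterministically by the pair itself.
        best = max(pairs, key=lambda p: (pairs[p], p))
        merges.append(best)
        new_vocab = Counter()
        for symbols, freq in vocab.items():
            new_vocab[merge_pair(symbols, best)] += freq
        vocab = new_vocab
    return merges


def apply_bpe(word, merges):
    """Segment a word by replaying the learned merges in order."""
    symbols = tuple(word)
    for pair in merges:
        symbols = merge_pair(symbols, pair)
    return list(symbols)


if __name__ == "__main__":
    # Hypothetical "generic" training words; the held-out word stands in for domain vocabulary.
    generic = ["portrait", "portraits", "porte", "paysage", "paysages"]
    merges = learn_bpe(generic, num_merges=10)
    print(apply_bpe("autoportrait", merges))
    print(char_tokens("autoportrait"))
```

In this sketch the merge rules are fixed once they have been learned from the generic words, so a word from a new domain is split according to statistics of the old one, whereas the character-level segmentation does not depend on any training corpus.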

Book: Proceedings of the 12th International Conference on Agents and Artificial Intelligence - Volume 1: ARTIDIGH

Pages: 522 - 529

ISBN: 978-989-758-395-7

Year of publication: 2020

Keywords: H1 Book chapter

Handle: https://hdl.handle.net/10067/1672240151162165141
DOI: https://doi.org/10.5220/0009167205220529
WoS Id: 000570767700058

BOF-keylabel: yes

Authors from: Higher Education

Accessibility: Open
