Publication

Improving cross-domain n-gram language modelling with skipgrams

Book contribution - Book chapter, conference contribution

© 2016 Association for Computational Linguistics. In this paper we improve on the hierarchical Pitman-Yor process language model in a cross-domain setting by adding skipgrams as features. We find that adding skipgram features reduces perplexity, and that the reduction is substantial when models are trained on a generic corpus and tested on domain-specific corpora. We also find that within-domain testing and cross-domain testing require different backoff strategies. We observe a 30-40% reduction in perplexity in a cross-domain language modelling task, and a reduction of up to 6% in a within-domain experiment, for both English and Flemish-Dutch.
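
To illustrate what a skipgram feature can look like, the following is a minimal sketch in Python, not the authors' implementation. It assumes that a skipgram is an n-gram in which one or more interior tokens are replaced by a placeholder, with the first and last tokens kept. The function name `skipgrams` and the placeholder string `{*}` are hypothetical choices made for this example.

```python
from itertools import combinations

# Placeholder marking a skipped position (notation assumed for illustration).
SKIP = "{*}"


def skipgrams(ngram):
    """Return every pattern obtained by replacing one or more interior
    tokens of `ngram` with SKIP. The first and last tokens are kept
    (an assumption of this sketch)."""
    interior = range(1, len(ngram) - 1)
    patterns = []
    # k is the number of interior positions skipped at once.
    for k in range(1, len(ngram) - 1):
        for positions in combinations(interior, k):
            patterns.append(
                tuple(SKIP if i in positions else tok
                      for i, tok in enumerate(ngram))
            )
    return patterns


if __name__ == "__main__":
    for pattern in skipgrams(("the", "quick", "brown", "fox")):
        print(" ".join(pattern))
```

Under these assumptions a 4-gram yields three patterns (two with a single skip and one with both interior tokens skipped), while an n-gram of length two or less yields none because it has no interior position.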
Book: Proceedings ACL 2016
Pages: 137 - 142
ISBN: 9781510827592
Year of publication: 2016
Accessibility: Open