Project

Natiolectal variation in Dutch grammar: A data-driven approach

While Belgians and the Dutch are well aware that they use different words and that their pronunciation diverges, they are mostly unaware that Belgian and Netherlandic Dutch also differ grammatically. Few Belgians, for instance, will realize that the preposition voor in Jan maakte (voor) haar een boterham 'John made (for) her a sandwich' is optional for them, whereas it is obligatory for almost all speakers in the Netherlands.

Why are there such pronounced syntactic differences between two varieties, spoken in a comparatively small language area, that did not begin to diverge until the 16th century? And where do these differences come from? To answer these questions, we draw on large subtitle and newspaper corpora, and use machine translation, machine learning, and automated semantic classification technologies to uncover the syntactic motor, or motors, of Dutch.

Date: 1 Nov 2017 → 20 Apr 2023
Keywords: memory-based learning, Dutch, syntax, national variation
Disciplines: Theory and methodology of literary studies
Project type: PhD project