Project

Theory meets Quantity: Corpus-based Comparative Syntax

The availability of digital corpora and of sophisticated tools for extracting information from them provides exciting opportunities to carry out groundbreaking research in linguistics. This project explores the potential of corpus-based research for comparative syntax. On a more general level, it aims to investigate how the data-driven approach of corpus linguistics can best be combined with the knowledge-based approach of theoretical linguistics, as many corpus studies lack theoretical support. For corpus-based syntactic research, syntactically annotated corpora, also known as ‘treebanks’, are of special interest. Such corpora consist of sentences together with their syntactic analyses. There is a fast-growing community of linguists who use treebanks in their research. What is still in its infancy, though, is the use of treebanks for the comparative study of different languages. The corpus study carried out in this project will not be limited to monolingual corpora and treebanks. A major innovation is the exploitation of the recently constructed Europarl parallel treebank for comparative syntax. A number of case studies will show how the new tools and methods can lead to new insights and more accurate descriptions of cross-linguistic variation. The central topic is one of the most notorious phenomena in West Germanic syntax, namely the formation of verb clusters in German, Dutch and Afrikaans.

Date: 1 Oct 2016 → 31 Aug 2020
Keywords: Corpus-based Comparative Syntax, Theory, Quantity
Disciplines: Linguistics, Theory and methodology of linguistics, Other languages and literary studies