Dataset

CLiPS Stylometry Investigation (CSI) Corpus

The CSI corpus is a collection of student texts, expanded yearly, in two genres: essays and reviews. It is intended primarily for stylometric research, but other applications are possible. Extensive metadata is available on both the author (gender, age, sexual orientation, region of origin, personality profile) and the document (timestamp, genre, veracity, sentiment, grade).
Publication year: 2014
Accessibility: closed
Publisher: CLiPS Research Group, University of Antwerp
License: CC-BY-SA-3.0
Format: txt
Keywords: Linguistics