Dataset

SparrKULee: A Speech-evoked Auditory Response Repository of the KU Leuven, containing EEG of 85 participants

The following authors contributed equally to this dataset: Accou, Bernd; Bollens, Lies.

Researchers investigating the neural mechanisms underlying speech perception often employ electroencephalography (EEG) to record brain activity while participants listen to spoken language. The high temporal resolution of EEG enables the study of neural responses to fast and dynamic speech signals. Previous studies have successfully extracted speech characteristics from EEG data and, conversely, predicted EEG activity from speech features. Machine learning techniques are generally employed to construct encoding and decoding models, which necessitate a substantial amount of data.

We present SparrKULee: A Speech-evoked Auditory Repository of EEG, measured at KU Leuven, comprising 64-channel EEG recordings from 85 young individuals with normal hearing, each of whom listened to 90-150 minutes of natural speech. This dataset is more extensive than any currently available dataset in terms of both the number of participants and the amount of data per participant. It is suitable for training larger machine learning models. We evaluate the dataset using linear and state-of-the-art non-linear models in a speech encoding/decoding and match/mismatch paradigm, providing benchmark scores for future research.
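The linear decoding and match/mismatch evaluation mentioned above can be illustrated with a minimal sketch. It is not the authors' benchmark code: it runs on synthetic data, and the function names, the number of lags, the ridge parameter `alpha`, and the window length are all assumptions chosen for illustration. A ridge-regularised backward model reconstructs a speech envelope from time-lagged multichannel EEG, and the reconstruction is then scored by Pearson correlation and by a simplified match/mismatch test.

```python
import numpy as np


def lag_matrix(eeg, n_lags):
    """Stack time-lagged copies of eeg (time x channels).

    Column block k holds the EEG sample at t + k, because the neural
    response follows the stimulus. Rows past the end are zero-padded.
    """
    n_times, n_channels = eeg.shape
    lagged = np.zeros((n_times, n_channels * n_lags))
    for k in range(n_lags):
        lagged[: n_times - k, k * n_channels:(k + 1) * n_channels] = eeg[k:]
    return lagged


def fit_ridge_decoder(eeg, envelope, n_lags, alpha):
    """Solve (X^T X + alpha I) w = X^T y for the decoder weights w."""
    X = lag_matrix(eeg, n_lags)
    XtX = X.T @ X
    return np.linalg.solve(XtX + alpha * np.eye(XtX.shape[0]), X.T @ envelope)


def pearson(a, b):
    """Pearson correlation between two 1-D signals."""
    a = a - a.mean()
    b = b - b.mean()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def match_mismatch_accuracy(decoded, envelope, window):
    """Fraction of windows in which the decoded signal correlates better
    with the matching envelope window than with the next window
    (a simplified version of the match/mismatch idea)."""
    n_windows = len(decoded) // window
    correct = 0
    for i in range(n_windows):
        seg = slice(i * window, (i + 1) * window)
        j = (i + 1) % n_windows
        other = slice(j * window, (j + 1) * window)
        matched = pearson(decoded[seg], envelope[seg])
        mismatched = pearson(decoded[seg], envelope[other])
        correct += matched > mismatched
    return correct / n_windows


def make_synthetic(n_times, mixing, delay, rng):
    """Synthetic 'EEG': a delayed envelope mixed into each channel plus noise."""
    envelope = rng.standard_normal(n_times)
    delayed = np.roll(envelope, delay)  # wrap-around is harmless for a toy example
    eeg = np.outer(delayed, mixing) + rng.standard_normal((n_times, len(mixing)))
    return eeg, envelope


rng = np.random.default_rng(0)
mixing = rng.standard_normal(64)  # 64 channels, as in the dataset
train_eeg, train_env = make_synthetic(5000, mixing, delay=3, rng=rng)
test_eeg, test_env = make_synthetic(2000, mixing, delay=3, rng=rng)

n_lags = 8  # assumed value; must exceed the synthetic delay
weights = fit_ridge_decoder(train_eeg, train_env, n_lags=n_lags, alpha=1.0)
decoded = lag_matrix(test_eeg, n_lags) @ weights

score = pearson(decoded, test_env)
accuracy = match_mismatch_accuracy(decoded, test_env, window=200)
print(f"reconstruction correlation: {score:.2f}")
print(f"match/mismatch accuracy: {accuracy:.2f}")
```

On real recordings the correlations are far lower than on this noise-free toy mixing model, so the numbers printed here say nothing about the dataset itself.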

Our GitHub repository contains the code for the preprocessing steps needed to obtain the files in the derivatives folder, as well as additional code for the technical validation of the dataset and tools to download the dataset more easily.
The whole dataset is also available as a single zip file (> 100 GB), or as pre-zipped files split into smaller chunks.

Due to privacy concerns, some files in the dataset are restricted. Users requesting access should send an email to sparrkulee@kuleuven.be, stating what they want to use the data for. Access will be granted to non-commercial users who comply with the CC-BY-NC-4.0 licence.
Year of publication: 2023
Accessibility: open
Publisher: KU Leuven RDR
Licence: CC-BY-NC-4.0
Format: apr, bdf, json, npz, pkl, tsv, xml
Keywords: Auditory EEG