Projects
NIKAW: exploring networks of ideas and knowledge in the Ancient World using a bilingual language model. KU Leuven
The NIKAW project aims to exploit textual information from the ancient world to reconstruct the transmission of knowledge across multilingual, geographically and chronologically extended communities. The period in focus spans the 8th century BCE to the 4th century CE, covering the advent of Christianity. In the context of this project, we will attempt to create a bilingual, state-of-the-art NLP pipeline for extracting and examining the mentions of ...
A neurolinguistic model for the subcortical organization of grammatical language comprehension: evidence from behavioural and neurophysiological registration. Ghent University
This research project aims to record directly within the thalamus, the globus pallidus internus (GPi) and the subthalamic nucleus (STN) to what extent these nuclei are involved in grammatical comprehension. Thirty right-handed Deep Brain Stimulation (DBS) patients, divided into three groups (DBS thalamus, GPi and STN), will be included in the study. During EEG registration, the subjects will be asked to judge word-class information, subject-verb ...
De novo mass spectrometry peptide sequencing with a transformer large language model. University of Antwerp
Evaluating and incorporating common sense in large language models to improve implicit language understanding. University of Antwerp
Latent Variable Models for Language and Image Understanding in Social Media and E-Commerce Data. KU Leuven
More content has been created in the past few years than in the entire history of humankind. With the exponential growth of user-contributed content, it becomes increasingly important to develop systems capable of intelligently processing both language and images.
While understanding language appears effortless for humans from a young age, it is quite a challenging task for computers. Languages are inherently ambiguous and rich. Many ...
Addressing Limitations of Language Models. KU Leuven
Applications that automatically process language and/or speech are numerous, including, but not limited to: Automatic Speech Recognition (ASR), Machine Translation, Speech Translation, Spelling Correction, Natural Language Understanding and Natural Language Generation. Every application typically needs a special-purpose system and datasets with task-specific labels, but there is one common ...
Identifying drivers of language change using neural agent-based models. Vrije Universiteit Brussel
Agent-based ...
Probing the cross-lingual knowledge of large language models with BLI. KU Leuven
Bilingual lexicon induction (BLI) is the task of translating words between two languages using only monolingual corpora. Older approaches to BLI rely on classical features such as contextual and temporal similarity, while more contemporary ones focus mainly on creating word embeddings in a shared cross-lingual space. However, the recent surge in large language model (LLM) size and quality opens new potential for single-word and especially multiword term ...
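The embedding-based approach mentioned above can be sketched in a few lines: once word vectors from both languages live in a shared cross-lingual space, each source word is translated to its nearest target-space neighbor by cosine similarity. The toy vocabularies and vectors below are purely illustrative assumptions, not data from the project; in practice the vectors would come from aligned monolingual embeddings or an LLM.

```python
import numpy as np

# Toy shared cross-lingual embedding space (vectors are illustrative only).
en_vocab = ["king", "queen", "bread"]
el_vocab = ["basileus", "basilissa", "artos"]  # romanized Greek

en_vecs = np.array([[0.90, 0.10, 0.00],
                    [0.80, 0.30, 0.10],
                    [0.00, 0.10, 0.90]])
el_vecs = np.array([[0.88, 0.12, 0.02],
                    [0.79, 0.31, 0.08],
                    [0.02, 0.08, 0.92]])

def induce_lexicon(src_vecs, tgt_vecs, src_vocab, tgt_vocab):
    """Map each source word to its nearest target word by cosine similarity."""
    src = src_vecs / np.linalg.norm(src_vecs, axis=1, keepdims=True)
    tgt = tgt_vecs / np.linalg.norm(tgt_vecs, axis=1, keepdims=True)
    sims = src @ tgt.T              # pairwise cosine similarities
    best = sims.argmax(axis=1)      # nearest target index per source word
    return {src_vocab[i]: tgt_vocab[j] for i, j in enumerate(best)}

lexicon = induce_lexicon(en_vecs, el_vecs, en_vocab, el_vocab)
print(lexicon)  # {'king': 'basileus', 'queen': 'basilissa', 'bread': 'artos'}
```

Real BLI systems add refinements on top of this retrieval step (e.g. mitigating hubness in the nearest-neighbor search), but the shared-space nearest-neighbor lookup is the core idea.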
Design and analysis of multilingual-learning-based models for advanced natural language understanding applications. Ghent University
Thanks to recent deep learning breakthroughs, Natural Language Processing (NLP) has seen significant progress. Yet, this progress mainly concerns high-resource languages (e.g., English), and many seemingly basic tasks have not been satisfactorily solved, especially for many low-resource languages (e.g., Dutch). We thus observe a performance gap among languages, caused by a discrepancy in the amount of both (i) available training data, and ...