Project
Nonlinear data fusion for arbitrary entity-relation graphs with application to genome interpretation for personalized medicine
In this project, we will develop a scalable data fusion algorithm for nonlinear inference over arbitrary Entity-Relation (ER) graphs, which offer a simple way to describe complex problem domains by specifying how entities (classes of objects) interact. Our data fusion approach is general and could be applied to many problems, but we will focus on the groundbreaking challenge of genome interpretation, whose solution lies at the core of personalized medicine. Specifically, we will address the in silico diagnosis of oligo- and polygenic disorders starting from patient sequencing data, contextual information (such as gene and pathway data), and the predicted deleteriousness of the variants (a topic on which I worked during my PhD). To model the genotype-phenotype relation from these heterogeneous sources of information, we will introduce important extensions to current data fusion methods, such as nonlinear inference over arbitrary ER graphs and interpretable predictions (which are crucial when interacting with geneticists and medical doctors). We will train and test our model on data from major international consortia, which cover tens of thousands of patients, thereby learning the relation between variants and the corresponding phenotypes. Finally, we will apply our model to concrete use cases investigated by our research partners, such as Brugada syndrome (UZ Brussels) and 22q11 deletion (KU Leuven), with the aim of building a tool that assists in diagnosing patients.