Publication

Data-driven Diagnostic Decision Support Systems: a Secondary Use for Early Pregnancy and Kidney Transplant Research Databases

Book - Dissertation

With the advent of information technology in healthcare, data occupy a central place in clinical practice. This digital dependency calls for efficient tools to help physicians manage and process the rapidly growing volume of data. These tools are commonly known as clinical decision support systems (CDSS). They cover a broad range of applications, from alerting physicians to drug allergies to facilitating administrative data encoding. In this thesis, we focused on diagnostic decision support systems (DDSS), a subtype of CDSS devoted to enhancing the diagnostic decision process with relevant information provided at the patient level. DDSS often target a very specific medical condition. Their development therefore depends largely on applied research centered on a predefined medical problem. In this thesis, we present research on DDSS motivated by three different medical challenges. The first challenge focuses on patient similarity. Physicians often solve current cases by referring to previously encountered patients. To exploit valuable clinical databases for similar-patient retrieval, proper metrics of inter-patient similarity must first be defined. In this project, we focused on tree-based ensemble algorithms, which combine individual decision trees, to learn task-specific patient similarity metrics. These algorithms require little optimization, and their flexibility makes them suitable for learning metrics for various medical outcomes (e.g. diagnosis, prognosis or scoring systems). We demonstrated that these approaches, by natively handling missing values, provide powerful task-specific patient similarity metrics for heterogeneous clinical databases. Our second use case targets the problem of viability prediction at the end of the first trimester of pregnancy using early gestation data. 
Our main contributions are the development and internal validation of two calibrated models to predict the first-trimester outcome based on demographic, clinical, sonographic and biomarker predictors. We also extensively assessed the added value of plasma biomarker variables in the predictive models. In addition, we studied the imputation of incomplete data at prediction time using a modified version of fixed chained equation imputation. This online imputation framework allows accurate model predictions on moderately incomplete queries. Finally, we leveraged a post-hoc interpretability framework, along with relevant visualization plots, to help clinical end-users understand the models' predictions. In the third project, we focus on kidney transplant biopsies and explore another aspect of DDSS. While most existing DDSS are based on pre-established diagnostic classifications, we first aimed to improve the existing classification of acute kidney transplant rejection. The current categorization, based on the histological assessment of kidney transplant biopsies, produces categories that are not mutually exclusive and relies on lesions that are not specific to the disease processes. While this complex classification accurately depicts the observed histological reality, its clinical interpretation remains difficult, potentially leading to unstable clinical decisions. In this project, we derived and validated six novel phenotypes of acute rejection by semi-supervised consensus clustering, a special form of clustering in which external guidance is added to enforce the creation of clinically meaningful clusters. Finally, the last part of this thesis, as a synthesis of the use-case research, outlines a generic framework for data-driven DDSS. Additional theoretical considerations regarding DDSS are also presented. 
In particular, we discuss healthcare-related data and their preprocessing, the choice of appropriate clinical predictive models for DDSS and important notions around clinical model interpretability.
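
To illustrate the tree-based patient similarity idea mentioned in the abstract, here is a minimal sketch of the classic forest proximity measure: two patients are considered similar in proportion to the number of trees in which they reach the same leaf. This is an assumption about the general family of methods, not the thesis's exact metric. The function names `forest_proximity` and `most_similar` and the leaf assignments are hypothetical; in practice the leaf indices would come from an already fitted tree ensemble.

```python
from typing import List, Sequence


def forest_proximity(leaves: Sequence[Sequence[int]]) -> List[List[float]]:
    """Pairwise proximity from leaf assignments.

    leaves[i][t] is the index of the leaf reached by patient i in tree t
    (hypothetical input, obtained from a fitted tree ensemble).
    """
    n = len(leaves)
    n_trees = len(leaves[0])
    prox = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            # Count the trees in which both patients land in the same leaf.
            shared = sum(1 for a, b in zip(leaves[i], leaves[j]) if a == b)
            prox[i][j] = prox[j][i] = shared / n_trees
    return prox


def most_similar(prox: List[List[float]], query: int, k: int = 1) -> List[int]:
    """Indices of the k patients most similar to `query` (excluding itself)."""
    others = [j for j in range(len(prox)) if j != query]
    return sorted(others, key=lambda j: prox[query][j], reverse=True)[:k]


# Illustrative (made-up) leaf assignments: 4 patients, 3 trees.
leaves = [
    [1, 4, 2],
    [1, 4, 3],
    [2, 5, 3],
    [2, 5, 2],
]
prox = forest_proximity(leaves)
neighbours = most_similar(prox, query=0, k=2)
```

Because the trees are trained on a specific outcome, the resulting proximity is task-specific: patients are close when the ensemble treats them alike for that outcome, not merely when their raw features are close.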
Year of publication: 2022
Accessibility: Open