
Publication

More meaning than meets the eye. Robust and scalable applications of pre-trained representations for biomedical NLP

Book - Dissertation

Pre-trained distributional representations of words and phrases have become omnipresent in natural language processing (NLP), where they have led to significant improvements in machine learning performance for a wide range of applications. Recent research has investigated to what extent these representations are effective for tackling the challenges of the biomedical text domain. However, it remains difficult to properly disentangle the interplay of model architectures, training objectives, data sources, and downstream biomedical NLP tasks for which the representations are used as input features. As a result, it is still unclear to what extent these representations can be applied to encode specific biomedical semantics for future applications that would require complex domain knowledge. In this thesis, we specifically explore what we consider to be robust and scalable applications of pre-trained representations for biomedical NLP. These applications go against the current dominant paradigm in NLP research, which has achieved many successes by fine-tuning large and complex neural network architectures using vast amounts of data. In contrast, we explicitly try to minimize the complexity of models that use the pre-trained representations, as well as the amount of supervised data necessary for developing the models, while keeping the models transferable across various domains and applicable in unsupervised ways, e.g. using distance metrics such as cosine similarity. While this paradigm can impose a performance ceiling on our proposed models compared to other state-of-the-art approaches, it also offers various benefits. Firstly, it helps to highlight the contribution of various aspects of a method. For instance, it can emphasize the effectiveness of training objectives that work for models with low complexity. Secondly, it minimizes the computational cost of our proposed systems, and as such aims to contribute to more equitable and democratic NLP research. Lastly, the limitations of this paradigm also challenge us to explore novel approaches that are more efficient. For example, we can compensate for less model complexity and training data by finding more effective training objectives.
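The abstract mentions applying pre-trained representations in unsupervised ways via distance metrics such as cosine similarity. The following minimal sketch illustrates that idea for two biomedical terms; the term names and embedding values are purely hypothetical stand-ins for vectors that would, in practice, be looked up from a pre-trained model.

```python
import numpy as np

def cosine_similarity(u: np.ndarray, v: np.ndarray) -> float:
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Hypothetical pre-trained embeddings for two related biomedical terms;
# real vectors would come from a model trained on biomedical text.
emb_myocardial_infarction = np.array([0.12, -0.43, 0.88, 0.05])
emb_heart_attack = np.array([0.10, -0.40, 0.91, 0.02])

# A high score suggests the two phrases are close in the embedding space,
# which can be used without any task-specific supervised training.
print(cosine_similarity(emb_myocardial_infarction, emb_heart_attack))
```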
Number of pages: 123
Publication year: 2021
Keywords: Doctoral thesis
Accessibility: Open