Project

Vector embeddings as database views.

Over the past decade, vector embedding methods have been developed to enable machine learning over structured data such as graphs or, more generally, relational databases. While the empirical effectiveness of vector embeddings for specific learning tasks and application domains is well researched, exactly which information from the structured data is encoded in the embeddings is less well understood. In this project, we postulate that looking at embeddings through the lens of database research gives more insight into what information embeddings contain. Concretely, we propose to design query languages in which vector embeddings can be expressed naturally. In this setting, questions about the kind of information encoded in the embedded vectors can be phrased as instances of the problem of query rewriting using views, which we will study. Furthermore, by taking into account structural properties of embedding queries, we open the door to a transfer of methods from databases to vector embeddings, and back. In particular, database methods for incremental query evaluation and query sampling can be applied to learn embedding parameters efficiently, while, conversely, embeddings can be exploited for database indexing.
Date: 1 Jan 2022 → Today
Keywords: THEORY OF DATABASES, MACHINE LEARNING
Disciplines: Machine learning and decision making, Data models, Database systems and architectures, Database theory, Theoretical computer science not elsewhere classified
Project type: Collaboration project