
Publication

Efficient similarity computation for collaborative filtering in dynamic environments

Book Contribution - Chapter

Computing all pairwise similarities in a large collection of vectors is a well-known and common data mining task. As the number and dimensionality of these vectors keep increasing, however, existing approaches are often unable to meet the strict efficiency requirements imposed by the environments in which they must operate. Real-time neighbourhood-based collaborative filtering (CF) is one example of such an environment, in which performance is critical. In this work, we present a novel algorithm for efficient and exact similarity computation between sparse, high-dimensional vectors. Our approach exploits the sparsity that is inherent to implicit-feedback data streams, yielding significant gains compared to other methods. Furthermore, as our model learns incrementally, it is naturally suited to dynamic real-time CF environments. Alongside our method, we propose a MapReduce-inspired parallelisation procedure and show how it achieves further speed-up. Additionally, in many real-world systems, a large share of items is not recommendable at any given time, due to recency, stock, seasonality, or enforced business rules. We exploit this fact to further improve the computational efficiency of our approach. Experimental evaluation on both real-world and publicly available datasets shows that our approach scales up to millions of processed user-item interactions per second and advances the state of the art.
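To make the idea of incremental, sparsity-aware similarity computation concrete, the following is a minimal sketch and not the chapter's actual algorithm. It assumes binary implicit feedback, cosine similarity between items, and item identifiers that can be ordered with `<`. The class name `IncrementalItemCosine`, its methods, and the `recommendable` filter are hypothetical names introduced only for illustration. Each incoming interaction touches only the items already in that user's history, so work is proportional to the number of non-zero entries rather than to the full item catalogue.

```python
from collections import defaultdict
from math import sqrt


class IncrementalItemCosine:
    """Toy incremental item-item cosine similarity on binary implicit feedback.

    Illustrative sketch only; not the method described in the chapter.
    """

    def __init__(self, recommendable=None):
        self.user_items = defaultdict(set)  # user -> items seen so far
        self.item_count = defaultdict(int)  # item -> number of distinct users
        self.cooc = defaultdict(int)        # (a, b) with a < b -> shared users
        # Optional set of recommendable items (assumption: a pair is only
        # worth tracking if at least one of its items can be recommended).
        self.recommendable = recommendable

    def _tracked(self, a, b):
        if self.recommendable is None:
            return True
        return a in self.recommendable or b in self.recommendable

    def update(self, user, item):
        """Process one (user, item) interaction from the stream."""
        history = self.user_items[user]
        if item in history:
            return  # binary feedback: repeated interactions change nothing
        for other in history:
            if self._tracked(item, other):
                key = (item, other) if item < other else (other, item)
                self.cooc[key] += 1
        history.add(item)
        self.item_count[item] += 1

    def similarity(self, a, b):
        """Exact cosine similarity between the binary user vectors of a and b."""
        key = (a, b) if a < b else (b, a)
        denom = sqrt(self.item_count.get(a, 0) * self.item_count.get(b, 0))
        return self.cooc.get(key, 0) / denom if denom else 0.0


model = IncrementalItemCosine()
stream = [("u1", "a"), ("u1", "b"), ("u2", "a"), ("u2", "b"), ("u3", "a")]
for user, item in stream:
    model.update(user, item)
print(model.similarity("a", "b"))  # 2 shared users / sqrt(3 * 2)
```

Because the co-occurrence counts and item counts are kept exact, the similarity can be read off at any point in the stream without recomputation. Partitioning the stream by user and summing the per-partition counts would be one way to parallelise this sketch, though that is a design assumption here rather than the procedure proposed in the chapter.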
Book: Proceedings of the 13th ACM Conference on Recommender Systems (RecSys '19), September 16-20, 2019, Copenhagen, Denmark
Pages: 251 - 259
ISBN: 978-1-4503-6243-6
Publication year: 2019
Keywords: H1 Book chapter
BOF-keylabel: yes
Authors from: Higher Education
Accessibility: Open