Publication

Distributed mining of convoys in large scale datasets

Journal contribution - Journal article

The tremendous increase in the use of mobile devices equipped with GPS and other location sensors has generated a huge amount of movement data. In recent years, mining this data to understand the collective mobility behavior of humans, animals and other objects has become popular. Numerous mobility patterns, and algorithms for mining them, have been proposed, each representing a specific movement behavior. The convoy pattern is one such pattern; it can be used to find groups of people moving together in public transport or to prevent traffic jams. A convoy is a set of at least m objects moving together for at least k consecutive timestamps, where m and k are user-defined parameters. Existing algorithms for detecting convoy patterns do not scale to real-life dataset sizes. In this paper, we therefore propose a generic distributed convoy pattern mining algorithm called DCM and show how such an algorithm can be implemented using the MapReduce framework. We present a cost model for DCM and a detailed theoretical analysis backed by experimental results. We show the effect of partition size on the performance of DCM. The results of our experiments on different datasets and hardware setups show that our distributed algorithm is scalable in terms of data size and number of nodes, and more efficient than any existing sequential or distributed convoy pattern mining algorithm, with speed-ups of up to 16 times over SPARE, the state-of-the-art distributed co-movement pattern mining framework. DCM can thus process large datasets that SPARE cannot.
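To make the convoy definition concrete, the following is a minimal sequential sketch in Python. It is not DCM and not the authors' implementation. It assumes that the objects have already been clustered at every timestamp (for example by a density-based method, which is an assumption not stated in the abstract), and it treats "moving together" as "sharing a cluster". The function name `find_convoys` and the input layout are hypothetical, and the sketch does not filter out non-maximal convoys.

```python
from typing import Dict, FrozenSet, Iterable, List, Tuple

# (objects, first timestamp index, last timestamp index)
Convoy = Tuple[FrozenSet[str], int, int]


def find_convoys(
    clusters_per_t: List[Iterable[Iterable[str]]], m: int, k: int
) -> List[Convoy]:
    """Report groups of >= m objects sharing a cluster for >= k consecutive timestamps.

    clusters_per_t[t] holds the (disjoint) clusters of object ids at timestamp t.
    This input layout is an assumption made for illustration only.
    """
    results: List[Convoy] = []
    active: Dict[FrozenSet[str], int] = {}  # candidate group -> start index

    for t, clusters in enumerate(clusters_per_t):
        nxt: Dict[FrozenSet[str], int] = {}
        for raw in clusters:
            cluster = frozenset(raw)
            if len(cluster) < m:
                continue
            # Extend existing candidates by intersecting them with this cluster.
            for objs, start in active.items():
                common = objs & cluster
                if len(common) >= m:
                    nxt[common] = min(start, nxt.get(common, start))
            # The cluster itself may start a new candidate at time t.
            nxt.setdefault(cluster, t)
        # Candidates that did not survive intact end at t - 1.
        for objs, start in active.items():
            if objs not in nxt and t - start >= k:
                results.append((objs, start, t - 1))
        active = nxt

    # Close the candidates that are still alive at the end of the data.
    end = len(clusters_per_t)
    for objs, start in active.items():
        if end - start >= k:
            results.append((objs, start, end - 1))
    return results


snapshots = [
    [{"a", "b", "c"}, {"d", "e"}],
    [{"a", "b", "c", "d"}, {"e"}],
    [{"a", "b"}, {"c", "d", "e"}],
    [{"a", "b", "e"}],
]
print(find_convoys(snapshots, m=2, k=3))
```

With m = 2 and k = 3, only objects a and b stay in a common cluster for at least three consecutive timestamps, so they form the single reported convoy. A distributed variant would additionally have to partition the timestamps and merge convoys that span partition boundaries, which this sketch does not attempt.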
Journal: Geoinformatica
ISSN: 1384-6175
Volume: 25
Pages: 353 - 396
Year of publication: 2021
Keywords: A1 Journal article
BOF-keylabel: yes
BOF-publication weight: 1
CSS-citation score: 1
Authors: International
Authors from: Higher Education
Accessibility: Open