
Data Analytics for Algorithm Design

The development of algorithms for computationally hard optimisation problems has a long history, with several important contributions tracing back to the middle of the 20th century. Today, state-of-the-art optimisation algorithms are highly complex, combining different components built around a variety of conceptual approaches. The number of concepts, as well as of alternative implementations of those concepts, makes algorithm design an intricate task, and any automated support is welcome. The rapid growth of computing power makes it possible to collect vast amounts of data on algorithm performance. Processing this data with advanced analytics yields a better understanding of both algorithms and problems, and this understanding supports further engineering efforts towards even higher-performing methods. The rise of "data science" (a combination of machine learning, statistics and data analysis) has also found applications in optimisation: automatic algorithm configuration, algorithm selection and hyper-heuristics with learning schemes, to name a few, are among the most prominent topics in this marriage of the two fields.

This dissertation centres around the theme "Data Science for Optimisation". We focus on two key aspects: automatic algorithm configuration and the analysis of how algorithm parameters affect algorithm performance.

Algorithm configuration is the task of selecting the best design of an algorithm, given a parametrised algorithm with a number of design choices and a set of problem instances. A number of automated tools have recently been proposed for this task. In chapter 3, we improve the efficiency of algorithm configuration for a local-search-based metaheuristic with multiple neighbourhoods, where an algorithm parameter determines the probability that a specific neighbourhood is selected at each iteration. Our method recognises neighbourhoods that behave similarly in the course of the solving procedure on a given set of problem instances; similarity is defined over a number of pre-defined observables. The neighbourhoods are then grouped together using clustering analysis, and the number of parameters is reduced by assigning a single parameter per cluster.
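The parameter-reduction idea can be sketched as follows. This is a minimal illustration, not the thesis's actual procedure: the observables (here improvement rate and acceptance rate), the neighbourhood names, the distance threshold and the greedy clustering scheme are all illustrative assumptions.

```python
# Sketch: group similarly-behaving neighbourhoods so that each cluster
# shares a single probability parameter instead of one parameter each.

def euclidean(a, b):
    """Distance between two observable vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def cluster_neighbourhoods(observables, threshold):
    """Greedy clustering: assign a neighbourhood to the first cluster whose
    representative (first member) lies within `threshold`, else open a new
    cluster. A stand-in for a proper clustering analysis."""
    clusters = []  # each cluster is a list of neighbourhood names
    reps = []      # representative observable vector per cluster
    for name, vec in observables.items():
        for i, rep in enumerate(reps):
            if euclidean(vec, rep) <= threshold:
                clusters[i].append(name)
                break
        else:
            clusters.append([name])
            reps.append(vec)
    return clusters

# Hypothetical observables per neighbourhood: (improvement rate, acceptance rate)
obs = {
    "swap":   (0.30, 0.60),
    "shift":  (0.32, 0.58),   # behaves like "swap"
    "2-opt":  (0.70, 0.20),
    "or-opt": (0.68, 0.22),   # behaves like "2-opt"
}
groups = cluster_neighbourhoods(obs, threshold=0.1)
# Two clusters -> two probability parameters instead of four.
```

With four neighbourhoods collapsing into two clusters, the configurator only has to tune one selection probability per cluster, shrinking the configuration space.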

In chapter 4, we consider improving the performance of irace, a state-of-the-art automatic algorithm configurator, by meta-tuning: using an automatic configurator to optimise the parameters of a configurator on a set of algorithm configuration benchmarks. The benchmark algorithms are modelled by surrogates to keep the procedure computationally feasible. Surrogate modelling uses regression to build a predictor for the algorithm performance on every problem instance.
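The surrogate idea can be illustrated with a toy example: fit a cheap regression model on observed (parameter value, performance) pairs, then query the model instead of running the real algorithm. The data and the simple linear model below are assumptions for illustration; actual surrogate models for algorithm performance are far richer.

```python
# Sketch of surrogate modelling: least-squares fit y ≈ a*x + b over observed
# configurations, then predict performance of an unseen configuration.

def fit_linear_surrogate(xs, ys):
    """Ordinary least-squares fit of a one-parameter linear surrogate."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    b = my - a * mx
    return lambda x: a * x + b

# Hypothetical observations: parameter value -> measured runtime (seconds)
params   = [0.1, 0.2, 0.4, 0.8]
runtimes = [9.8, 9.1, 7.4, 4.2]

surrogate = fit_linear_surrogate(params, runtimes)
predicted = surrogate(0.6)  # cheap prediction, no algorithm run needed
```

Each surrogate evaluation costs microseconds rather than an actual algorithm run, which is what makes meta-tuning over many configurator parameters computationally feasible.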

During algorithm configuration, many algorithm configurations are evaluated and a performance dataset is built as a by-product. This dataset can provide insights into how different algorithm parameters influence algorithm performance: the idea is to build a predictor on the performance dataset and then use it to analyse parameter importance. In chapter 5, the combined use of three techniques based on this idea, namely forward selection, fANOVA and ablation analysis, is demonstrated in three case studies. We illustrate how to interpret the analysis results in each case and discuss the advantages of combining different analysis methods.
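Of the three techniques, forward selection is the simplest to sketch: greedily add the parameter whose inclusion most reduces prediction error, yielding an importance ordering. The toy dataset, parameter names and the mean-per-group predictor below are illustrative assumptions, not the thesis's case studies.

```python
# Sketch of forward selection for parameter importance on a performance dataset.

def mse_with(features, rows):
    """Predict performance as the mean over rows sharing the selected
    parameter values; return the mean squared error of that predictor."""
    groups = {}
    for row in rows:
        key = tuple(row[f] for f in features)
        groups.setdefault(key, []).append(row["perf"])
    err = 0.0
    for row in rows:
        vals = groups[tuple(row[f] for f in features)]
        err += (row["perf"] - sum(vals) / len(vals)) ** 2
    return err / len(rows)

def forward_selection(params, rows):
    """Greedily pick, at each step, the parameter that most reduces error."""
    selected, remaining = [], list(params)
    while remaining:
        best = min(remaining, key=lambda p: mse_with(selected + [p], rows))
        selected.append(best)
        remaining.remove(best)
    return selected  # parameters ordered by (greedy) importance

# Toy data: "tabu_length" drives performance, "restart" barely matters
rows = [
    {"tabu_length": 5,  "restart": 0, "perf": 10.0},
    {"tabu_length": 5,  "restart": 1, "perf": 10.2},
    {"tabu_length": 20, "restart": 0, "perf": 3.0},
    {"tabu_length": 20, "restart": 1, "perf": 3.1},
]
order = forward_selection(["tabu_length", "restart"], rows)
```

Because the first parameter selected explains most of the performance variation, the resulting ordering serves as a simple importance ranking; fANOVA and ablation analysis answer related questions from different angles, which is why combining them is informative.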

Date: 4 Feb 2014 → 16 May 2018
Keywords: automated parameter tuning/configuration
Disciplines: Applied mathematics in specific fields, Computer architecture and networks, Distributed computing, Information sciences, Information systems, Programming languages, Scientific computing, Theoretical computer science, Visual computing, Other information and computing sciences
Project type: PhD project