Project

Statistical Methodology for Immense Lively Experiments (SMILE)

Technological advances have dramatically increased the size of data sets. Modern data may have a large number of observations, a large number of dimensions, or both. In some situations the data set is not static but arrives as an ongoing stream. Such large-scale data sets are often called big data, a general term that covers large sample sizes as well as (ultra)high-dimensional data sets, which can be static or dynamic. Statistical methodology needs to be adapted and extended to cope with these data structures. A first goal of this project is to develop reliable screening techniques for ultrahigh-dimensional data, whose dimension can run into the hundreds of thousands. A second goal is to develop reliable statistical methods for the cellwise outlier setting, with accompanying computationally efficient algorithms. A third goal is to develop ultrafast methods and algorithms for detecting heterogeneity in high-dimensional dynamic data streams and for identifying its root cause.

Date: 1 Oct 2015 → 30 Sep 2021

Keywords: Dimension reduction, Sparsity, Cellwise contamination, Dynamic data, Efficient algorithms, Big data, Outliers, Robust statistics

Disciplines: Applied mathematics in specific fields, Statistics and numerical methods
