Publication
Artificial intelligence in astronomy: Unraveling variable stars with machine learning and the NASA Kepler and TESS space missions
Book - Dissertation
Astronomy is entering the era of big data: ground- and space-based observatories together deliver millions of observations each day. These massive data sets provide a unique opportunity to probe the Universe on an unprecedented scale. The brightness observations of stars delivered by space missions contain a wealth of information for studies of stellar variability, exoplanets and stellar astrophysics. In particular, the large sets of observations of pulsating stars will enable asteroseismology to close the gap between theoretical models of stellar structure and evolution and modern observations. These massive data sets also pose a challenge, however, because astronomical data analysis methods were originally developed to process modest samples of stars. In this thesis, we therefore develop a new state-of-the-art machine learning framework to automatically analyze the millions of observations from space. This work forms a fundamental basis for bringing astronomy into an era of automated scientific discovery with machine learning.
We develop a machine learning framework to perform a large and detailed classification of variable stars observed from space. We specifically construct the framework to process the uninterrupted high-cadence light curve data (sampling intervals from one minute to half an hour) from the NASA Kepler and NASA TESS space missions, and in preparation for the upcoming ESA PLATO space mission. The framework consists first of a supervised classification module that classifies light curves according to their high-level stellar variability types, and then of an unsupervised learning module that performs a more detailed classification. The supervised classifier uses stacked generalization to combine the predictions from four individual classifiers, each specialized in identifying different variability classes.
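To make the idea of stacked generalization concrete, the following is a minimal sketch and not the classifier developed in the thesis. It assumes nearest-centroid base learners, each restricted to a different feature subset so that it is "specialized" in separating one class, and a nearest-centroid meta-learner trained on their out-of-fold class scores. The function names, the synthetic four-feature data and the class labels A, B and C are all hypothetical and chosen for illustration only.

```python
import math
import random

def fit_centroids(X, y):
    """Nearest-centroid model: the mean feature vector of each class."""
    groups = {}
    for row, label in zip(X, y):
        groups.setdefault(label, []).append(row)
    return {c: [sum(col) / len(rows) for col in zip(*rows)]
            for c, rows in groups.items()}

def class_scores(model, row, classes):
    """Soft class scores: softmax of negative squared centroid distance."""
    d2 = [sum((a - b) ** 2 for a, b in zip(row, model[c]))
          if c in model else math.inf for c in classes]
    nearest = min(d2)  # subtract the minimum for numerical stability
    weights = [math.exp(nearest - d) for d in d2]
    total = sum(weights)
    return [w / total for w in weights]

def select(X, cols):
    """Keep only the feature columns that one base learner is allowed to see."""
    return [[row[c] for c in cols] for row in X]

def fit_stack(X, y, feature_sets, n_folds=5):
    """Train the base learners and a meta-learner on out-of-fold base scores."""
    classes = sorted(set(y))
    n = len(X)
    meta_X = [[] for _ in range(n)]
    for cols in feature_sets:
        Xs = select(X, cols)
        for fold in range(n_folds):
            train = [i for i in range(n) if i % n_folds != fold]
            held_out = [i for i in range(n) if i % n_folds == fold]
            model = fit_centroids([Xs[i] for i in train], [y[i] for i in train])
            # Scores for held-out samples come from a model that never saw them.
            for i in held_out:
                meta_X[i].extend(class_scores(model, Xs[i], classes))
    # Final base learners are refit on all of the training data.
    base = [fit_centroids(select(X, cols), y) for cols in feature_sets]
    meta = fit_centroids(meta_X, y)
    return classes, base, meta

def predict_stack(stack, feature_sets, row):
    """Concatenate the base-learner scores and let the meta-learner decide."""
    classes, base, meta = stack
    z = []
    for cols, model in zip(feature_sets, base):
        z.extend(class_scores(model, [row[c] for c in cols], classes))
    final = class_scores(meta, z, classes)
    return classes[final.index(max(final))]

# Synthetic stand-in data (an assumption): three classes, four features.
random.seed(0)
MEANS = {"A": [0, 0, 0, 0], "B": [4, 4, 0, 0], "C": [0, 0, 4, 4]}

def sample(n):
    labels = ["ABC"[i % 3] for i in range(n)]
    rows = [[random.gauss(m, 0.5) for m in MEANS[c]] for c in labels]
    return rows, labels

# Features 0-1 separate class B from the rest; features 2-3 separate class C.
feature_sets = [[0, 1], [2, 3]]
X_train, y_train = sample(150)
X_test, y_test = sample(60)
stack = fit_stack(X_train, y_train, feature_sets)
predicted = [predict_stack(stack, feature_sets, row) for row in X_test]
accuracy = sum(p == t for p, t in zip(predicted, y_test)) / len(y_test)
print(f"held-out accuracy: {accuracy:.2f}")
```

In this toy setup neither base learner can identify all three classes on its own, because each sees only the features relevant to one of them. The meta-learner recovers the full classification from their combined scores, which is the role stacking plays when the individual classifiers are specialized in different classes.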
We validate the supervised classifier on data from the Kepler mission and demonstrate that it can successfully classify stars outside of our training sample. We then transfer the supervised methodology to TESS by updating the training set to account for the different systematic characteristics of the two surveys. We analyze the performance by classifying the stars in TESS sectors 14, 15 and 26 that were previously observed by Kepler, and demonstrate the readiness of our framework to classify the full TESS primary mission. Finally, we develop an unsupervised clustering algorithm to identify the pulsating stars for which both the near-core region and the outer envelopes can be probed. We construct the clustering algorithm by adapting techniques from the biomedical domain to work on asteroseismic light curves. We find that our methodology can cluster different types of pulsating stars and that the structure of the clusters in the data space reveals a substantial amount of information about the interior properties of pulsating stars. Hence, this framework also lays the foundations for unraveling the interior structure of stars with automated machine learning methods. Overall, this thesis results in a new state-of-the-art, large-scale classification methodology to identify variable stars among the vast numbers of light curves from space missions. The results are a treasure trove for stellar and planetary astrophysicists alike, as the identified variable stars can be used (i) to constrain stellar structure and evolution models, and (ii) to start unraveling the interior physics of stars with machine learning.
Publication year: 2023
Accessibility: Open