The dataset contains the data for the hierarchical cluster analysis as explained in the article "A panorama of inchoative constructions in Spanish: Cluster analysis as an answer to the near-synonymy puzzle". In total, the dataset contains 3955 observations, which are tokens of the inchoative construction for the following auxiliaries: comenzar, empezar, meter, poner, echar(se), liar, ...