A generalised clusterwise regression for distributional data
Rosanna Verde;Antonio Balzanella
2021
Abstract
This paper presents a cluster-wise regression (CR) method for distributional data. The objects to be clustered are observed on a dependent variable and on a set of explanatory variables. A dependence relation between them is assumed, and its fit can be improved by taking into account local structures in the data. The proposed algorithm is based on the K-means clustering algorithm: the centroids of the clusters are linear regression models, and each object is assigned to the cluster whose model gives the minimum sum of squared errors. The generalised CR algorithm builds on a linear regression model for distributional variables and on a K-means algorithm developed for such data; both methods use the L2 Wasserstein distance.
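The alternating scheme described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes each distribution is represented by its quantile function evaluated on a common grid of probability levels, and it relies on the fact that, for one-dimensional distributions, the squared L2 Wasserstein distance equals the integral of the squared difference between quantile functions. The per-cluster model here is a plain least-squares fit on the pooled quantile values, a simplification that ignores the constraints needed to guarantee that predictions are valid (non-decreasing) quantile functions. All function names, array shapes, and parameters (`quantile_function`, `w2_squared`, `fit_cluster`, `predict`, `clusterwise_regression`, `n_iter`, `seed`) are hypothetical.

```python
import numpy as np


def quantile_function(sample, grid):
    """Empirical quantile function of a sample, evaluated on a grid in (0, 1)."""
    return np.quantile(sample, grid)


def w2_squared(q1, q2):
    """Approximate squared L2 Wasserstein distance between two 1-D distributions,
    given their quantile functions on the same uniform grid."""
    return float(np.mean((q1 - q2) ** 2))


def fit_cluster(X, Y):
    """Least-squares fit of Y_i(t) ~ b0 + sum_j b_j * X_ij(t), pooled over objects
    and grid points. X has shape (n, p, m), Y has shape (n, m).
    Simplification: coefficients are unconstrained."""
    n, p, m = X.shape
    design = np.column_stack(
        [np.ones(n * m), X.transpose(0, 2, 1).reshape(n * m, p)]
    )
    beta, *_ = np.linalg.lstsq(design, Y.reshape(n * m), rcond=None)
    return beta


def predict(X, beta):
    """Predicted quantile functions, shape (n, m)."""
    return beta[0] + np.tensordot(X, beta[1:], axes=([1], [0]))


def clusterwise_regression(X, Y, k, n_iter=50, seed=0):
    """K-means-like loop: fit one regression per cluster, then reassign each
    object to the cluster whose model gives the smallest squared error."""
    rng = np.random.default_rng(seed)
    n = Y.shape[0]
    labels = rng.integers(k, size=n)  # random initial partition
    betas = []
    for _ in range(n_iter):
        betas = []
        for c in range(k):
            idx = labels == c
            if idx.sum() < 2:
                # Assumption: a (nearly) empty cluster falls back to the global fit.
                idx = np.ones(n, dtype=bool)
            betas.append(fit_cluster(X[idx], Y[idx]))
        # errors[i, c]: approximate squared Wasserstein error of object i under model c
        errors = np.stack(
            [np.mean((Y - predict(X, b)) ** 2, axis=1) for b in betas], axis=1
        )
        new_labels = errors.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # partition is stable
        labels = new_labels
    return labels, betas
```

As with K-means, the result depends on the initial partition, so in practice one would likely run the loop from several random starts and keep the partition with the smallest total error.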