Mining multiple time sequences through co-clustering algorithms for distributional data
Balzanella Antonio; Irpino Antonio
2021
Abstract
This paper deals with the co-clustering of distributional data applied to multiple time sequences. The aims are: to obtain a double partition of the data into clusters of units and clusters of variables; to summarize the main concepts in the data through histogram prototypes; and to give an overview of the evolution over time of the monitored phenomenon. We extend the double k-means algorithm to handle distributional data by using the L2 Wasserstein distance to compare distributions. Moreover, we adapt the double k-means algorithm to compute optimal relevance weights associated with the variables.
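As an illustration of the distance named in the abstract, the squared L2 Wasserstein distance between two one-dimensional distributions can be written as the integral over (0, 1) of the squared difference of their quantile functions. The sketch below approximates that integral on a uniform grid for two empirical samples. It is a minimal sketch and not the authors' implementation: the function name `wasserstein2_squared` and the parameter `n_grid` are hypothetical, and the data are simulated.

```python
import numpy as np


def wasserstein2_squared(x, y, n_grid=1000):
    """Approximate the squared L2 Wasserstein distance between two 1-D samples.

    Hypothetical helper for illustration: it averages the squared difference
    of the empirical quantile functions over a uniform grid on (0, 1).
    """
    # Midpoint grid on (0, 1), so the extreme levels 0 and 1 are not used.
    t = (np.arange(n_grid) + 0.5) / n_grid
    qx = np.quantile(x, t)  # empirical quantile function of x
    qy = np.quantile(y, t)  # empirical quantile function of y
    return float(np.mean((qx - qy) ** 2))


# Simulated data (an assumption for this example): b is a shifted by 2,
# so every quantile differs by 2 and the squared distance is close to 4.
rng = np.random.default_rng(0)
a = rng.normal(0.0, 1.0, size=5000)
b = a + 2.0
print(wasserstein2_squared(a, b))
```

For histogram-valued data such as that described in the abstract, the quantile functions would come from the histograms rather than from raw samples; the grid-based average is only one simple way to approximate the integral.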