Mining multiple time sequences through co-clustering algorithms for distributional data

Balzanella Antonio; Irpino Antonio
2021

Abstract

This paper deals with the co-clustering of distributional data applied to multiple time sequences. The aims are: to obtain a double partition of the data into clusters of units and clusters of variables; to summarize the main concepts in the data through histogram prototypes; and to provide an overview of how the monitored phenomenon evolves over time. We extend the double k-means algorithm to handle distributional data by using the L2 Wasserstein distance to compare distributions. Moreover, we adapt the double k-means algorithm to compute optimal relevance weights associated with the variables.
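
As a rough illustration of the distance named in the abstract: the squared L2 Wasserstein distance between two one-dimensional distributions is the integral over (0, 1) of the squared difference between their quantile functions. The sketch below is not the authors' implementation. It assumes each distribution is given as a raw sample of equal size rather than as a histogram, in which case the integral reduces to the mean squared difference of the sorted values. The function name `wasserstein2_squared` is hypothetical.

```python
def wasserstein2_squared(xs, ys):
    """Squared L2 Wasserstein distance between two equal-size empirical samples.

    Assumption (not from the paper): both distributions are represented by
    raw samples of the same size, so their empirical quantile functions are
    step functions with identical breakpoints.
    """
    if len(xs) != len(ys) or len(xs) == 0:
        raise ValueError("samples must be non-empty and of equal size")
    # Sorting yields the empirical quantile function of each sample.
    a, b = sorted(xs), sorted(ys)
    # Mean squared difference between matching quantiles.
    return sum((p - q) ** 2 for p, q in zip(a, b)) / len(a)


# Shifting a sample by a constant c gives a squared distance of c**2.
shifted = wasserstein2_squared([0.0, 1.0, 2.0], [1.0, 2.0, 3.0])
# The order of observations is irrelevant: only the distribution matters.
same = wasserstein2_squared([3.0, 1.0, 2.0], [1.0, 2.0, 3.0])
```

Because the distance depends only on the quantile functions, it reacts to differences in both location and shape, which is one reason it suits comparisons between histogram-valued data.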
ISBN: 978-88-5518-340-6

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11591/463746