Clustering of histogram data; a topological learning approach

Rosanna Verde; Antonio Irpino
2017

Abstract

Histogram data are data in which each observation is described by a set of distributions. In this paper, we propose a clustering approach based on an adaptation of the Self-Organizing Map (SOM) algorithm. The idea is to combine the dimension reduction obtained with a SOM with a clustering of the data in the reduced space. The L2 Wasserstein distance is used both to measure the dissimilarity between distributions and to estimate local data densities in the original space. The main advantage of the proposed algorithm is that the number of clusters is determined automatically. Applications to synthetic and real data sets demonstrate the validity of the proposed approach.
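As an illustration of the dissimilarity mentioned in the abstract, the following is a minimal sketch of the L2 Wasserstein distance between two one-dimensional histograms, computed from their quantile functions. It is not the authors' implementation: the function names `histogram_quantile` and `l2_wasserstein`, the `n_grid` parameter, the assumption that mass is uniform within each bin, and the assumption that all bin weights are strictly positive are introduced here for illustration only.

```python
import numpy as np


def histogram_quantile(edges, weights, t):
    """Quantile function of a histogram, assuming uniform mass inside each bin.

    edges   : bin edges, length k + 1, strictly increasing (assumption)
    weights : bin masses, length k, strictly positive (assumption)
    t       : probability levels in [0, 1]
    """
    edges = np.asarray(edges, dtype=float)
    weights = np.asarray(weights, dtype=float)
    # Cumulative distribution evaluated at the bin edges, starting at 0.
    cdf = np.concatenate(([0.0], np.cumsum(weights / weights.sum())))
    # Under the uniform-within-bin assumption the quantile function is
    # piecewise linear, so it can be obtained by inverting the CDF linearly.
    return np.interp(t, cdf, edges)


def l2_wasserstein(edges_a, weights_a, edges_b, weights_b, n_grid=1000):
    """Approximate L2 Wasserstein distance between two 1-D histograms.

    Uses the one-dimensional identity W_2^2 = integral over [0, 1] of the
    squared difference of the quantile functions, approximated here with a
    midpoint rule on n_grid probability levels.
    """
    t = (np.arange(n_grid) + 0.5) / n_grid
    qa = histogram_quantile(edges_a, weights_a, t)
    qb = histogram_quantile(edges_b, weights_b, t)
    return float(np.sqrt(np.mean((qa - qb) ** 2)))


# Two single-bin histograms: uniform on [0, 1] and uniform on [1, 2].
d_shift = l2_wasserstein([0.0, 1.0], [1.0], [1.0, 2.0], [1.0])
# A histogram compared with itself.
d_same = l2_wasserstein([0.0, 1.0, 3.0], [0.25, 0.75], [0.0, 1.0, 3.0], [0.25, 0.75])
```

In this sketch a pure shift of one unit between two otherwise identical histograms gives a distance of about 1, and a histogram compared with itself gives 0. In a SOM-style procedure, a distance of this kind would play the role that the Euclidean distance plays for ordinary numeric vectors.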
ISBN: 978-88-6453-521-0

Use this identifier to cite or link to this document: https://hdl.handle.net/11591/388694