Data stream summarization by histograms clustering
BALZANELLA, Antonio; VERDE, Rosanna
2013
Abstract
In this paper we introduce a new strategy for summarizing a fast-changing data stream. Evolving data streams are generated by non-stationary processes, which require the knowledge discovery process to adapt to newly emerging concepts. To deal with this challenge, we propose a clustering algorithm in which each cluster is summarized by a histogram and data are allocated to clusters through a Wasserstein-derived distance. Histograms are a well-known graphical tool for representing the frequency distribution of data and are widely used in data stream mining; however, unlike existing methods, we discover a set of histograms, each of which represents a main concept in the data. To evaluate the performance of the method, we have performed extensive tests on simulated data.