Clustering and change detection of multiple streaming time series
BALZANELLA, Antonio; VERDE, Rosanna
2013
Abstract
In recent years, Data Stream Mining (DSM) has received considerable attention owing to the growing number of application contexts that generate temporally ordered, fast-changing, and potentially infinite data. To deal with such data, learning techniques must satisfy several computational and storage constraints, so new, dedicated methods have to be developed. In this paper we introduce a new strategy for clustering streaming time series. The method detects a partition of the streams over a user-chosen time period and discovers how proximity relations evolve. We show that these aims can be reached by clustering temporally non-overlapping data batches as they arrive on-line and then running a suitable clustering algorithm on a dissimilarity matrix that is updated using the outputs of the local clusterings. Through an application to real and simulated data, we show that this method provides results comparable to those of algorithms for stored data.
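The two-stage idea described in the abstract can be illustrated with a minimal sketch. The abstract does not name the local clustering algorithm, the matrix update rule, or the final algorithm, so everything below is an assumption made for illustration: plain k-means on each batch, a dissimilarity matrix that counts how often two streams fall into different local clusters, and average-linkage agglomerative clustering on that matrix. The function names (`kmeans_labels`, `update_dissimilarity`, `average_linkage`, `cluster_streams`) are hypothetical and not taken from the paper.

```python
import random

def kmeans_labels(rows, k, iters=20, seed=0):
    """Plain Lloyd's k-means (assumed local clustering); one label per row."""
    rng = random.Random(seed)
    centers = [list(r) for r in rng.sample(rows, k)]
    labels = [0] * len(rows)
    for _ in range(iters):
        # Assign each row to its nearest center (squared Euclidean distance).
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(r, centers[c])))
            for r in rows
        ]
        # Recompute each non-empty cluster's center as the mean of its members.
        for c in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def update_dissimilarity(D, labels):
    """Assumed update rule: add 1 for every pair split by the local clustering."""
    n = len(labels)
    for i in range(n):
        for j in range(i + 1, n):
            if labels[i] != labels[j]:
                D[i][j] += 1.0
                D[j][i] += 1.0

def average_linkage(D, n_clusters):
    """Agglomerative clustering on a dissimilarity matrix (assumed final step)."""
    clusters = [[i] for i in range(len(D))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                total = sum(D[i][j] for i in clusters[a] for j in clusters[b])
                d = total / (len(clusters[a]) * len(clusters[b]))
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        merged = clusters.pop(b)  # b > a, so index a stays valid
        clusters[a].extend(merged)
    labels = [0] * len(D)
    for c, members in enumerate(clusters):
        for i in members:
            labels[i] = c
    return labels

def cluster_streams(streams, batch_size, k_local, k_final):
    """Process non-overlapping batches, then cluster the accumulated matrix."""
    n = len(streams)
    D = [[0.0] * n for _ in range(n)]
    length = min(len(s) for s in streams)
    for start in range(0, length - batch_size + 1, batch_size):
        # Only the current batch is held; earlier observations are not revisited.
        batch = [s[start:start + batch_size] for s in streams]
        update_dissimilarity(D, kmeans_labels(batch, k_local))
    return average_linkage(D, k_final), D
```

In this sketch only the current batch and the n-by-n matrix are kept in memory, which is one way to respect the storage constraints mentioned in the abstract. Comparing the matrix at two points in time would give a rough view of how proximity relations between streams change, although the paper's own change-detection procedure may differ.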