Data stream reduction via functional time series analysis
Fabrizio Maturo
2018
Abstract
The analysis of big data streams is of central interest in several fields, as streams are an important source of information that may be useful for forecasting tasks. However, dealing with this type of data involves a series of challenges concerning software, format, and dimensionality. A stream is an unbounded, ordered sequence of objects that can be read only once or a small number of times. Its main characteristics are that data flow continuously and that their size is extremely large and potentially infinite. In this context, extracting relevant and reliable information from big data becomes crucial. To this end, we analyze data streams in a functional framework, focusing on the forecasting problem in time series with nonparametric techniques. Specifically, we investigate data streams over user-defined time periods, identifying a suitable probability distribution function able to describe the main characteristics of the data. In particular, we study the distribution function (repartition function) of the stream random variable to obtain a memory of the process over time. In this framework, the high-dimensional data are reduced to a matrix of functional observations, whose units are the user-defined time periods.
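To make the reduction step concrete, the following is a minimal sketch of how a stream could be collapsed into a matrix with one distribution function per user-defined time period. It is an illustration under stated assumptions, not the estimator used in the paper: the abstract does not say how the distribution function is estimated, so the empirical distribution function is assumed here as a simple nonparametric choice. The function name `ecdf_matrix`, its parameters, and the toy data are hypothetical, and the sketch works on a finite batch rather than a true unbounded stream.

```python
import numpy as np

def ecdf_matrix(timestamps, values, period, grid):
    """Return period labels and one empirical CDF per period on a common grid.

    Assumption: the empirical CDF stands in for the distribution function
    mentioned in the abstract; the paper may use a different estimator.
    """
    timestamps = np.asarray(timestamps, dtype=float)
    values = np.asarray(values, dtype=float)
    grid = np.asarray(grid, dtype=float)

    # Assign each observation to a period of fixed, user-defined length.
    period_ids = np.floor((timestamps - timestamps.min()) / period).astype(int)
    labels = np.unique(period_ids)

    rows = []
    for p in labels:
        sample = np.sort(values[period_ids == p])
        # ECDF at x is the share of observations in the period that are <= x.
        rows.append(np.searchsorted(sample, grid, side="right") / sample.size)

    # Rows are periods (the statistical units); columns are grid points.
    return labels, np.vstack(rows)

# Toy data: six observations split into two periods of length 3.
labels, F = ecdf_matrix(
    timestamps=[0, 1, 2, 3, 4, 5],
    values=[1, 2, 3, 2, 4, 6],
    period=3,
    grid=[0, 2, 4, 6],
)
print(F.shape)
```

Each row of `F` is a discretized curve, so the whole stream is summarized by as many rows as there are periods, regardless of how many raw observations each period contains. The rows could then be treated as functional observations in a time series of curves.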