Exploiting mean field analysis to model performances of big data architectures
IACONO, Mauro
2014
Abstract
Big data processing systems are characterized by a large number of components that run multiple instances of the same tasks in parallel to achieve the performance levels required by applications handling huge amounts of data. The number of components depends on the size of the data involved, so new resources (e.g., processing or storage servers) are usually added as the working database grows. Reliable performance evaluation of these systems is both crucial, since it enables administrators and developers to keep pace with data growth, and extremely difficult, due to the intrinsic complexity of these architectures. Nevertheless, the available literature does not yet offer sufficient experience or significant methodologies in this direction. This paper presents a novel modeling approach for the performance evaluation of big data systems based on mean field analysis, a set of methods, derived from statistical physics, for approximate inference in probabilistic models. By containing the excessive state space growth that characterizes more traditional modeling methodologies, this approach also requires significantly less effort than simulation-based ones. © 2013 Elsevier B.V. All rights reserved.