Towards configurable composite data quality assessment
Bellini E.
2019
Abstract
The growing availability of data over the last decades has given rise to a number of successful technologies, ranging from data collection and storage infrastructures to hardware and software tools for efficient computation of analytics. This context, in principle, places a great demand on data quality. As a matter of fact, experience has shown that the open Web and other platforms hosting user-generated content or real-time data can provide little quality control at content production time. To address these challenges, our aim is to provide a general and configurable model for assessing data quality supporting task composition. In particular, we introduce a model characterized along the notion of matching, illustrating the issues that can be addressed by this approach with a concrete case study. We also identify and discuss challenges to be addressed in future research to strengthen this idea.