A Big Data Pipeline and Machine Learning for Uniform Semantic Representation of Data and Documents From IT Systems of the Italian Ministry of Justice

Beniamino Di Martino (Supervision)
Luigi Colucci Cante (Writing – Original Draft Preparation)
Salvatore D'Angelo; Mariangela Graziano (Writing – Original Draft Preparation)
Fiammetta Marulli (Writing – Original Draft Preparation)
2022

Abstract

This paper presents a big data pipeline that handles both the structured and unstructured data made available by the Italian Ministry of Justice regarding its telematic civil process. The complexity and volume of the data provided by the ministry require big data analysis techniques, in concert with machine and deep learning frameworks, so that the data can be analysed correctly and yield meaningful information that could support the ministry in better managing civil processes. The pipeline has two main objectives: to provide a consistent workflow of activities to be applied to the incoming data, aimed at extracting useful information for the ministry's decision-making tasks, and to homogenize the incoming data so that they can be stored in a centralized and coherent data lake that serves as a reference for further analysis.

Use this identifier to cite or link to this document: https://hdl.handle.net/11591/518368