Recent years have seen the widespread adoption of Artificial Intelligence techniques in several domains, including healthcare, justice, assisted driving and Natural Language Processing (NLP) based applications (e.g., the Fake News detection). Those mentioned are just a few examples of some domains that are particularly critical and sensitive to the reliability of the adopted machine learning systems. Therefore, several Artificial Intelligence approaches were adopted as support to realize easy and reliable solutions aimed at improving the early diagnosis, personalized treatment, remote patient monitoring and better decision-making with a consequent reduction of healthcare costs. Recent studies have shown that these techniques are venerable to attacks by adversaries at phases of artificial intelligence. Poisoned data set are the most common attack to the reliability of Artificial Intelligence approaches. Noise, for example, can have a significant impact on the overall performance of a machine learning model. This study discusses the strength of impact of noise on classification algorithms. In detail, the reliability of several machine learning techniques to distinguish correctly pathological and healthy voices by analysing poisoning data was evaluated. Voice samples selected by available database, widely used in research sector, the Saarbruecken Voice Database, were processed and analysed to evaluate the resilience and classification accuracy of these techniques. All analyses are evaluated in terms of accuracy, specificity, sensitivity, F1-score and ROC area.
Exploring the impact of data poisoning attacks on machine learning model reliability
Verde L.
Methodology
;Marulli F.
Conceptualization
;Marrone S.Writing – Original Draft Preparation
2021
Abstract
Recent years have seen the widespread adoption of Artificial Intelligence techniques in several domains, including healthcare, justice, assisted driving and Natural Language Processing (NLP) based applications (e.g., the Fake News detection). Those mentioned are just a few examples of some domains that are particularly critical and sensitive to the reliability of the adopted machine learning systems. Therefore, several Artificial Intelligence approaches were adopted as support to realize easy and reliable solutions aimed at improving the early diagnosis, personalized treatment, remote patient monitoring and better decision-making with a consequent reduction of healthcare costs. Recent studies have shown that these techniques are venerable to attacks by adversaries at phases of artificial intelligence. Poisoned data set are the most common attack to the reliability of Artificial Intelligence approaches. Noise, for example, can have a significant impact on the overall performance of a machine learning model. This study discusses the strength of impact of noise on classification algorithms. In detail, the reliability of several machine learning techniques to distinguish correctly pathological and healthy voices by analysing poisoning data was evaluated. Voice samples selected by available database, widely used in research sector, the Saarbruecken Voice Database, were processed and analysed to evaluate the resilience and classification accuracy of these techniques. All analyses are evaluated in terms of accuracy, specificity, sensitivity, F1-score and ROC area.I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.