Improving Voice Pathology Classification Using Artificial Data Generation

Verde, Laura; Marulli, Fiammetta; Marrone, Stefano
2024

Abstract

Human Digital Twin is an emerging technology that could revolutionize the current healthcare system by enabling the delivery of Personalized Health Services through tools such as artificial intelligence. However, the considerable complexity of the human body, which undergoes continuous molecular and physiological changes, makes it extremely difficult to process medical data with artificial intelligence techniques. These techniques require large amounts of data to perform reliably, and such data are often difficult to obtain because of limited quality and availability. In this paper, we propose a methodology for generating artificial medical data, focusing on artificial voice signals. The analysis of voice recordings is fundamental to diagnosing specific diseases of the pneumo-articulatory apparatus, such as dysphonia. The generative neural network is based on the WaveNet model, chosen for its autoregressive sampling, which makes it possible to generate recordings of variable length. We propose a setup that generates artificial samples of a required sex and pathology, so that a single generative network can be used to balance and augment the dataset. The quality of the generative network is assessed by balancing the training dataset with generated data and training a convolutional classifier, which is then tested on a dataset that the generative network did not see during training. We achieved reasonable improvements in classification accuracy, particularly for the under-represented sex, suggesting that this approach is worthy of future research.
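The abstract does not say how many samples are generated per group. As a minimal sketch, assuming the goal is to bring every (sex, pathology) group up to the size of the largest one, the number of synthetic recordings to request from the generative network could be computed as below. The function name, group labels, and counts are hypothetical and are not taken from the paper.

```python
from collections import Counter


def synthetic_counts_needed(labels):
    """Return, per (sex, pathology) group, how many generated samples
    are needed to match the size of the largest group.

    Assumption: "balancing" means equalizing all group sizes; the paper's
    actual balancing rule is not stated in the abstract.
    """
    counts = Counter(labels)
    target = max(counts.values())
    return {group: target - n for group, n in counts.items()}


# Hypothetical training labels, one (sex, pathology) pair per recording.
train_labels = (
    [("female", "healthy")] * 40
    + [("female", "dysphonia")] * 90
    + [("male", "healthy")] * 25
    + [("male", "dysphonia")] * 45
)

needed = synthetic_counts_needed(train_labels)
for group, n in sorted(needed.items()):
    print(group, n)
```

Each resulting count would then be passed, together with the requested sex and pathology, to the single conditional generator described in the abstract.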
Use this identifier to cite or link to this document: https://hdl.handle.net/11591/574009