Unsupervised Machine Learning to Identify Convalescent COVID-19 Phenotypes

Donisi, L;
2022

Abstract

After the acute phase of the disease, COVID-19 patients may present multiple persistent symptoms, a condition known as "post-acute COVID-19 syndrome". Multidisciplinary rehabilitation has been proposed to address it, but its effectiveness has yet to be assessed. In this study, convalescent COVID-19 patients undergoing pulmonary rehabilitation (PR) after reporting long-term symptoms were consecutively enrolled and then grouped by their laboratory parameters at admission through an unsupervised Machine Learning (ML) approach. We aimed to identify potential indicators that could discriminate among phenotypes with different responsiveness to the rehabilitation program. K-means clustering was performed, and statistical analysis was then used to compare the clinical and hematochemical parameters of the resulting clusters. The dataset consisted of 78 patients (84.8% males, mean age 60.72 years). The optimal number of clusters was k=2, with a silhouette coefficient of 0.85, and D-dimer was the most discriminating parameter, confirming its role as a marker of inflammation. The two phenotypes differed significantly in age (p=0.007), packs of cigarettes per year (p=0.003), uricemia (p=0.010), PCR (p=0.026), D-dimer (p<0.001), red blood cells (p=0.005), hemoglobin (p=0.039), hematocrit (p=0.026), PaO2 (p=0.006), and SpO2 (p=0.011). Overall, our findings suggest that ML can be effective in identifying personalized prevention, intervention, and rehabilitation strategies.
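
As an illustration of the model-selection step described in the abstract (running k-means for several values of k and keeping the k with the highest silhouette coefficient), the following is a minimal sketch in plain Python. It is not the authors' implementation: the helper names `kmeans` and `silhouette`, the candidate range k=2..5, the random seeds, and the two-feature synthetic data are all assumptions made for this example, and no patient data are used.

```python
import math
import random


def kmeans(points, k, iters=100, seed=0):
    """Lloyd's k-means on a list of tuples; returns one label per point."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialise from k data points
    labels = None
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels


def silhouette(points, labels):
    """Mean silhouette coefficient over all points (range -1 to 1)."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two non-empty clusters")
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster.
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: mean distance to the members of the nearest other cluster.
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for other_lab, other in clusters.items()
            if other_lab != lab
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


rng = random.Random(42)
# Synthetic stand-in data: two hypothetical, well-separated groups described
# by two made-up features. These are NOT the study's patients or lab values.
group_a = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(20)]
group_b = [(rng.gauss(10, 1), rng.gauss(10, 1)) for _ in range(20)]
data = group_a + group_b

# Score each candidate k and keep the one with the highest silhouette.
scores = {k: silhouette(data, kmeans(data, k)) for k in range(2, 6)}
best_k = max(scores, key=scores.get)
print(best_k, round(scores[best_k], 2))
```

A silhouette coefficient close to 1 indicates compact, well-separated clusters. In practice, laboratory parameters measured on different scales would typically be standardized before clustering; the abstract does not state which preprocessing was applied, so that step is left out here.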
ISBN: 978-1-6654-8299-8
Use this identifier to cite or link to this document: https://hdl.handle.net/11591/497310