Language or paralanguage, This is the problem: Comparing depressed and non-depressed speakers through the analysis of gated multimodal units
Esposito A.;
2021
Abstract
Speech-based depression detection has attracted significant attention in recent years. A debated problem is whether it is better to use language (what people say), paralanguage (how they say it) or a combination of the two. This article addresses the question through the analysis of a Gated Multimodal Unit trained to weight modalities according to how effectively they account for the condition of a speaker (depressed or non-depressed). The experiments involved 29 individuals diagnosed with depression and 30 non-depressed participants. The approach reaches an accuracy of 83.0% (F1 score 80.0%), and the results show that the Gated Multimodal Unit tends to give more weight to paralanguage. However, the relative contribution of language tends to be higher, to a statistically significant extent, for non-depressed speakers.
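
The gating idea mentioned in the abstract can be illustrated with a minimal sketch of a bimodal Gated Multimodal Unit in its commonly used form: each modality is projected through a tanh layer, and a sigmoid gate computed from both inputs mixes the two projections. This is not the authors' implementation. The function names, feature dimensions, random weights, and the convention that the gate value applies to the language branch are all assumptions made for illustration; the abstract does not specify the architecture details.

```python
import math
import random


def _matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def gmu_forward(x_lang, x_para, W_lang, W_para, W_gate):
    """Forward pass of a bimodal Gated Multimodal Unit (illustrative sketch).

    Returns the fused vector h and the gate z. With the convention assumed
    here, z[i] close to 1 means hidden unit i relies mostly on language,
    and z[i] close to 0 means it relies mostly on paralanguage.
    """
    h_lang = [math.tanh(a) for a in _matvec(W_lang, x_lang)]
    h_para = [math.tanh(a) for a in _matvec(W_para, x_para)]
    # The gate sees the concatenation of both modality inputs.
    z = [1.0 / (1.0 + math.exp(-a)) for a in _matvec(W_gate, x_lang + x_para)]
    # Convex combination of the two modality representations.
    h = [zi * hl + (1.0 - zi) * hp for zi, hl, hp in zip(z, h_lang, h_para)]
    return h, z


def random_matrix(rows, cols, rng):
    """Hypothetical helper: small random weights standing in for trained ones."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
d_lang, d_para, d_hidden = 4, 3, 2  # arbitrary illustrative sizes

W_lang = random_matrix(d_hidden, d_lang, rng)
W_para = random_matrix(d_hidden, d_para, rng)
W_gate = random_matrix(d_hidden, d_lang + d_para, rng)

x_lang = [rng.uniform(-1.0, 1.0) for _ in range(d_lang)]
x_para = [rng.uniform(-1.0, 1.0) for _ in range(d_para)]

h, z = gmu_forward(x_lang, x_para, W_lang, W_para, W_gate)

# Average gate value: a rough per-sample indicator of how much weight
# the unit assigns to language relative to paralanguage.
mean_language_weight = sum(z) / len(z)
```

In a trained model, averaging gate values such as `mean_language_weight` over the samples of each group is one plausible way to compare how much the unit relies on each modality for depressed and non-depressed speakers.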