Emotion Recognition from Unimodal to Bimodal: Exploring the Effects of Communicative Modes and Gender of Stimuli on the Decoding Accuracy of Dynamic Emotional Expressions
Milo R.;Amorese T.;Cuciniello M.;Cordasco G.;Esposito A.
2024
Abstract
This work investigates the recognition of dynamic emotional expressions. Its aims are to examine which of five emotions (happiness, fear, anger, surprise, and sadness) is best recognized, and whether and how emotion recognition is affected by the single or joint activation of the auditory and visual communicative channels, and by the encoder's gender. Forty-one participants labeled a set of dynamic stimuli, balanced by gender, depicting the five basic emotions, portrayed through three different modalities: audio, mute-video, and combined audio-video. Outcomes show that happiness and anger were the most accurately decoded emotions. The decoding accuracy of fear, sadness, and anger was higher when these emotions were portrayed through the combined audio-video modality. Moreover, emotions decoded through the audio channel received the lowest accuracy scores. Regarding the stimuli's gender, female expressions of fear and sadness were better categorized than male ones. A significant interaction between the type and gender of stimuli revealed that male stimuli expressing happiness, fear, and sadness were better recognized through the combined audio-video modality, while female stimuli expressing the same emotions were better decoded through the audio or video channel alone. Finally, male mute-video stimuli portraying surprise were better decoded than female ones; moreover, female expressions of surprise were better recognized in the audio modality than in the mute-video one. These findings have important implications for the Ambient Assisted Living field, suggesting that different features should be considered when developing assistive technologies employing unimodal or multimodal communicative channels. The findings also highlight the importance of guaranteeing users' freedom to choose their assistants' features, for natural and pleasant experiences.