This is an unedited manuscript accepted for publication and provided as an Article in Press for early access at the author’s request. The article will undergo copyediting, typesetting, and galley proof review before final publication. Please be aware that errors may be identified during production that could affect the content. All legal disclaimers of the journal apply.
Gowtami Annapurna Dinavahi, Chandini R., Swetha Akshaya P., Kavya D., Vandana M.
- Gowtami Annapurna Dinavahi: Assistant Professor, Department of Computer Science Engineering, GVP College of Engineering for Women, Visakhapatnam, Andhra Pradesh, India
- Chandini R., Swetha Akshaya P., Kavya D., Vandana M.: Students, Department of Computer Science Engineering, GVP College of Engineering for Women, Visakhapatnam, Andhra Pradesh, India
Abstract
Speech Emotion Recognition (SER) is a speech processing task and a computer-based approach for identifying and classifying the emotions conveyed in audio signals. The aim of such a system is to assess a speaker's emotional state, such as happiness, anger, sadness, or frustration, by analyzing speech patterns, including prosodic features like pitch, frequency, and rhythm. Speech Emotion Recognition is used in various real-life scenarios, including customer service, healthcare, education, human-computer interaction, and market research. This study presents a comparative analysis of Mixed Convolutional Neural Networks (MCNN) and Residual Convolutional Neural Networks (RCNN) for speech emotion recognition, specifically considering the incorporation of gender information. The MCNN and RCNN models are trained and evaluated on datasets consisting of speech samples labelled with different emotions, along with gender information. Both models are evaluated on the SAVEE, RAVDESS, and EMO-DB datasets. This comparative analysis helps in understanding the strengths and limitations of each model and guides researchers in selecting the most suitable model for speech emotion recognition when gender information is taken into account.
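The abstract describes the pipeline only at a high level: spectral/prosodic features extracted from speech, a CNN classifier, and gender supplied as an auxiliary input. The sketch below is a minimal illustration of that idea, not the authors' MCNN or RCNN; the choice of MFCC features via librosa, the Keras residual block, and the concatenation of a gender flag before the classifier head are all assumptions made purely for illustration.

```python
# Illustrative sketch only -- NOT the authors' MCNN/RCNN implementation.
# Assumes librosa for feature extraction and Keras for a small residual CNN
# that also consumes a scalar gender flag as an auxiliary input.
import numpy as np
import librosa
import tensorflow as tf
from tensorflow.keras import layers, Model

def extract_features(wav_path, n_mfcc=40, max_frames=200):
    """Load a clip and return a fixed-size (n_mfcc, max_frames, 1) MFCC patch."""
    y, sr = librosa.load(wav_path, sr=16000)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
    mfcc = mfcc[:, :max_frames]                                   # truncate long clips
    mfcc = np.pad(mfcc, ((0, 0), (0, max_frames - mfcc.shape[1])))  # zero-pad short ones
    return mfcc[..., np.newaxis]

def residual_block(x, filters):
    """Two 3x3 convolutions with a projection shortcut (ResNet-style block)."""
    shortcut = layers.Conv2D(filters, 1, padding="same")(x)
    x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    x = layers.Conv2D(filters, 3, padding="same")(x)
    x = layers.Add()([shortcut, x])
    return layers.Activation("relu")(x)

def build_rcnn(input_shape=(40, 200, 1), n_emotions=7):
    """Spectral input plus a gender flag, fused before the classifier head.
    n_emotions=7 is an assumption (e.g. SAVEE/EMO-DB class counts)."""
    spec_in = layers.Input(shape=input_shape, name="mfcc")
    gender_in = layers.Input(shape=(1,), name="gender")   # 0 = male, 1 = female
    x = residual_block(spec_in, 32)
    x = layers.MaxPooling2D(2)(x)
    x = residual_block(x, 64)
    x = layers.GlobalAveragePooling2D()(x)
    x = layers.Concatenate()([x, gender_in])               # inject gender information
    x = layers.Dense(64, activation="relu")(x)
    out = layers.Dense(n_emotions, activation="softmax")(x)
    model = Model([spec_in, gender_in], out)
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```

A "mixed" variant of the same sketch could, for instance, run separate convolutional branches over different feature views before fusion; the residual version above is shown only because its skip connection is the simplest to illustrate.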
Keywords: Speech Emotion Recognition, Mixed Convolutional Neural Network (MCNN), Residual Convolutional Neural Network (RCNN), Gender Information, Deep Learning
Gowtami Annapurna Dinavahi, Chandini R., Swetha Akshaya P., Kavya D., Vandana M. Comparative Analysis of MCNN and RCNN for Speech Emotion Recognition using Gender Information. Journal of Communication Engineering & Systems. 2024; 15(01):-. Available from: https://journals.stmjournals.com/joces/article=2024/view=191780
Journal of Communication Engineering & Systems
Volume: 15
Issue: 01
Received: 25/10/2024
Accepted: 07/11/2024
Published: 31/12/2024