Volume 21 No 1 (2023)
An Efficient Speech Emotion Recognition Using an LSTM Model
Pagidirayi Anil Kumar, Dr. B. Anuradha
Abstract
This paper proposes a deep learning method based on Long Short-Term Memory (LSTM) networks to recognize emotions in speech taken from audio files, using a two-stage approach built on Mel Frequency Cepstral Coefficients (MFCC) and feature-extraction models. The audio signals are first pre-processed and MFCC features are extracted from the datasets; the proposed feature-extraction technique is then used to build a network that is trained on these attributes to recognize the emotions angry, happy, sad, disgust, fear, and neutral. Training and testing are carried out on the Python Colab platform to achieve better outcomes than other deep learning algorithms. On the CREMA-D and SAVEE datasets, the accuracy, true-positive rates for each speech emotion, and spectrograms of the human emotions are reported.
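The paper does not include implementation details in the abstract, but the pipeline it describes (a sequence of MFCC frames processed by an LSTM, then classified into six emotions) can be sketched generically. The following is a minimal NumPy illustration, not the authors' model: the class and function names, hidden size, and random weights are all assumptions for demonstration; a real speech emotion recognition system would learn these weights from MFCC features extracted from CREMA-D or SAVEE audio.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class LSTMCell:
    """Single LSTM cell with randomly initialized weights (illustrative only;
    a trained SER model would learn these from labeled speech data)."""
    def __init__(self, n_in, n_hidden, seed=0):
        rng = np.random.default_rng(seed)
        # One stacked weight matrix covering the input, forget, cell, and
        # output gates; each gate sees the concatenated [input, hidden] vector.
        self.W = rng.standard_normal((4 * n_hidden, n_in + n_hidden)) * 0.1
        self.b = np.zeros(4 * n_hidden)
        self.n_hidden = n_hidden

    def step(self, x, h, c):
        z = self.W @ np.concatenate([x, h]) + self.b
        i, f, g, o = np.split(z, 4)
        i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)
        c = f * c + i * np.tanh(g)   # updated cell state
        h = o * np.tanh(c)           # new hidden state
        return h, c

def classify_emotion(mfcc_seq, n_emotions=6, n_hidden=32):
    """Run an LSTM over a (frames, n_mfcc) MFCC sequence and return softmax
    scores over six emotions (angry, happy, sad, disgust, fear, neutral)."""
    cell = LSTMCell(mfcc_seq.shape[1], n_hidden)
    h = c = np.zeros(n_hidden)
    for x in mfcc_seq:              # consume the sequence frame by frame
        h, c = cell.step(x, h, c)
    rng = np.random.default_rng(1)
    W_out = rng.standard_normal((n_emotions, n_hidden)) * 0.1  # untrained head
    logits = W_out @ h
    e = np.exp(logits - logits.max())
    return e / e.sum()

# Example: 100 frames of 13 MFCCs (placeholder random features standing in
# for coefficients that librosa or a similar tool would extract from audio).
probs = classify_emotion(np.random.default_rng(2).standard_normal((100, 13)))
```

In practice the MFCC frames would come from a feature extractor applied to the dataset audio, and the weights would be fit by backpropagation; the sketch only shows the data flow of the two-stage approach.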
Keywords
SER, LSTM, MFCC, CREMA-D, SAVEE.
Copyright
Copyright © Neuroquantology
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Articles published in Neuroquantology are available under the Creative Commons Attribution-NonCommercial-NoDerivatives License (CC BY-NC-ND 4.0). Authors retain copyright in their work and grant IJECSE the right of first publication under CC BY-NC-ND 4.0. Users have the right to read, download, copy, distribute, print, search, or link to the full texts of articles in this journal, and to use them for any other lawful purpose.