Article

Hybrid LSTM–Attention and CNN Model for Enhanced Speech Emotion Recognition

Fazliddin Makhmudov, Department of Computer Engineering, Gachon University, Seongnam 1342, Republic of Korea
Alpamis Kutlimuratov, Department of Econometrics, Tashkent State University of Economics, Tashkent 100066, Uzbekistan
Young Im Cho, Department of Computer Engineering, Gachon University, Seongnam 1342, Republic of Korea
Applied Sciences (journal), 2024, English

Abstract

Emotion recognition is crucial for improving human–machine interaction: it lays a foundation for AI systems that integrate cognitive and emotional understanding, bridging the gap between machine functions and human emotions. Although deep learning algorithms are actively used in this field, sequence modeling that accounts for shifts in emotion over time has not been thoroughly explored. In this research, we present a speech emotion-recognition framework that combines zero-crossing rate (ZCR), root mean square (RMS) energy, and Mel-frequency cepstral coefficient (MFCC) feature sets. Our approach employs both CNN and LSTM networks, complemented by an attention model, for emotion prediction. Specifically, the LSTM addresses long-term dependencies, enabling the system to factor in historical emotional experiences alongside current ones. We also incorporate the psychological “peak–end rule”, which suggests that preceding emotional states significantly influence the present emotion. The CNN restructures the input dimensions, facilitating nuanced feature processing. We evaluated the proposed model on two distinct datasets, TESS and RAVDESS. The model reached accuracies of 99.8% on TESS and 95.7% on RAVDESS, a notable advance in emotion recognition.
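
To make two of the feature sets named in the abstract concrete, the following is a minimal sketch of frame-wise ZCR and RMS computation in plain Python. It is not the authors' pipeline: the function names, the frame length of 2048 samples, and the hop of 512 samples are assumptions chosen for illustration, since the abstract does not state the framing parameters. MFCC extraction is omitted because it needs a filterbank and a discrete cosine transform, which are better taken from an audio library.

```python
import math

def frames(signal, frame_len=2048, hop=512):
    """Split a 1-D sample sequence into overlapping frames.

    frame_len and hop are illustrative assumptions, not values from the paper.
    A signal shorter than one frame is returned as a single short frame.
    """
    if len(signal) < frame_len:
        return [list(signal)]
    return [list(signal[i:i + frame_len])
            for i in range(0, len(signal) - frame_len + 1, hop)]

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def rms(frame):
    """Root mean square amplitude of one frame."""
    if not frame:
        return 0.0
    return math.sqrt(sum(x * x for x in frame) / len(frame))

def zcr_rms_features(signal, frame_len=2048, hop=512):
    """Return one (ZCR, RMS) pair per frame, in time order."""
    return [(zero_crossing_rate(f), rms(f))
            for f in frames(signal, frame_len, hop)]
```

Calling `zcr_rms_features(samples)` on a list of audio samples yields one `(zcr, rms)` pair per frame. In a framework like the one described, such per-frame values would be stacked with MFCCs into a time-ordered feature matrix before being passed to the CNN and LSTM stages.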
