Emotion Recognition Using a Hybrid Model of Transformer and CNN-LSTM for Improved Performance
Abstract
Emotion recognition is a crucial component of AI technology that helps bridge the gap in human-machine interaction, allowing systems to respond appropriately to human emotions. Traditional models often struggle with accuracy and robustness when faced with noisy data and the inherent complexity of human emotions. This paper proposes a hybrid architecture that combines a Transformer with a CNN-LSTM to improve emotion recognition performance. The Transformer module captures long-range dependencies in the input data, whereas the CNN-LSTM module extracts spatial and temporal features from voice signals, with potential extension to multimodal inputs such as facial expressions and physiological data. This work demonstrates the potential of advanced hybrid architectures for building emotionally intelligent systems in healthcare, education, customer service, and other applications. The findings pave the way for future work on multimodal data integration and generalization in complex environments.
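As an illustration of how such a hybrid could be wired together, the following is a minimal PyTorch sketch rather than the implementation used in this work. The abstract does not specify how the two modules are combined, so the serial arrangement (CNN, then LSTM, then Transformer encoder), the class name `HybridEmotionNet`, the mel-spectrogram input layout, and all hyperparameters (40 mel bins, 64 hidden units, 4 attention heads, 7 emotion classes) are assumptions chosen only for the example.

```python
import torch
from torch import nn


class HybridEmotionNet(nn.Module):
    """Hypothetical CNN-LSTM + Transformer classifier for speech features.

    All layer sizes and the serial ordering are illustrative assumptions.
    """

    def __init__(self, n_mels=40, hidden=64, n_heads=4, n_emotions=7):
        super().__init__()
        # CNN: local spectral-temporal patterns; mel bins act as input channels.
        self.cnn = nn.Sequential(
            nn.Conv1d(n_mels, hidden, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.MaxPool1d(kernel_size=2),  # halves the number of time steps
        )
        # LSTM: sequential dynamics over the CNN feature frames.
        self.lstm = nn.LSTM(hidden, hidden, batch_first=True)
        # Transformer encoder: self-attention across all frames for
        # long-range dependencies. No positional encoding is added here;
        # the sketch relies on the LSTM to carry order information.
        layer = nn.TransformerEncoderLayer(
            d_model=hidden,
            nhead=n_heads,
            dim_feedforward=2 * hidden,
            batch_first=True,
        )
        self.transformer = nn.TransformerEncoder(layer, num_layers=2)
        self.classifier = nn.Linear(hidden, n_emotions)

    def forward(self, x):
        # x: (batch, n_mels, time)
        feats = self.cnn(x)              # (batch, hidden, time // 2)
        feats = feats.transpose(1, 2)    # (batch, time // 2, hidden)
        feats, _ = self.lstm(feats)      # (batch, time // 2, hidden)
        feats = self.transformer(feats)  # (batch, time // 2, hidden)
        pooled = feats.mean(dim=1)       # average over time -> (batch, hidden)
        return self.classifier(pooled)   # (batch, n_emotions) logits


model = HybridEmotionNet()
model.eval()
with torch.no_grad():
    # Random stand-in for a batch of 8 utterances, 40 mel bins, 100 frames.
    logits = model(torch.randn(8, 40, 100))
print(tuple(logits.shape))  # one row of class logits per utterance
```

A parallel design, in which the Transformer and CNN-LSTM branches process the input separately and their outputs are concatenated before classification, would be an equally plausible reading of the description; the serial form is used here only because it is shorter.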