Article

GENERAL DEEP LEARNING ARCHITECTURES FOR MULTIMODAL EMOTION DETECTION

Kurbanov Abdurahmon
Department of Computer Science and Programming, Jizzakh branch of the National University of Uzbekistan named after Mirzo Ulugbek, Gulistan, Uzbekistan

Advanced Computational Intelligence: An International Journal (ACII), 2025

Abstract

Multimodal emotion recognition is an area of artificial intelligence that analyzes human emotional states by combining several data sources, such as facial expressions, body movements, speech tone, and physiological signals. This paper studies the application of deep learning architectures to multimodal emotion recognition, with a focus on the effectiveness of the late fusion strategy. The ST-GCN (Spatio-Temporal Graph Convolutional Network) model extracts motion features from body movements, and the DeepFaceEmocNet25 model, trained on the FaceEmocDS dataset, extracts emotion features from facial expressions. The two models are integrated through late fusion to detect seven emotion classes (happy, angry, sad, surprised, disgusted, fearful, neutral) with high accuracy. Late fusion keeps the features of each modality independent until the final stage, where they are concatenated and passed to a fully connected classifier. The paper presents mathematical formulas, practical code examples, and experimental setups, and analyzes the technical details of the system. Multimodal approaches are widely used in healthcare, education, security, and gaming, but they face challenges such as data heterogeneity, limited datasets, and computational cost. Future research will focus on small-data training, real-time analysis, and cultural adaptability. Overall, the work presents a deep learning solution for multimodal emotion recognition based on late fusion of a body-movement model and a facial-expression model.
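The late fusion step described in the abstract (concatenation of per-modality features followed by a fully connected classifier) can be illustrated with a minimal NumPy sketch. This is not the paper's implementation. The feature sizes (256 for the body branch, 128 for the face branch), the hidden size of 64, the class name `LateFusionClassifier`, and the untrained random weights are all assumptions made for illustration, and random arrays stand in for the outputs of ST-GCN and DeepFaceEmocNet25.

```python
import numpy as np

# The seven emotion classes named in the abstract.
EMOTIONS = ["happy", "angry", "sad", "surprised", "disgusted", "fearful", "neutral"]


def softmax(z):
    """Row-wise softmax with the usual max-subtraction for numerical stability."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


class LateFusionClassifier:
    """Hypothetical late fusion head: concatenate, then a two-layer classifier."""

    def __init__(self, body_dim, face_dim, hidden_dim=64, num_classes=7, seed=0):
        rng = np.random.default_rng(seed)
        fused_dim = body_dim + face_dim
        # Untrained weights (assumption): a real system would learn these.
        self.W1 = rng.normal(0.0, 0.1, (fused_dim, hidden_dim))
        self.b1 = np.zeros(hidden_dim)
        self.W2 = rng.normal(0.0, 0.1, (hidden_dim, num_classes))
        self.b2 = np.zeros(num_classes)

    def forward(self, body_feat, face_feat):
        # Each modality is encoded separately upstream; fusion happens only here.
        fused = np.concatenate([body_feat, face_feat], axis=-1)
        hidden = np.maximum(0.0, fused @ self.W1 + self.b1)  # ReLU
        return softmax(hidden @ self.W2 + self.b2)


# Random stand-ins for the two backbone outputs, batch of 4 samples.
rng = np.random.default_rng(1)
body_feat = rng.normal(size=(4, 256))  # placeholder for ST-GCN features
face_feat = rng.normal(size=(4, 128))  # placeholder for DeepFaceEmocNet25 features

model = LateFusionClassifier(body_dim=256, face_dim=128)
probs = model.forward(body_feat, face_feat)
predicted = [EMOTIONS[i] for i in probs.argmax(axis=1)]
print(probs.shape, predicted)
```

Because the weights are random, the predicted labels carry no meaning. The sketch only shows the data flow: two independent feature vectors, one concatenation, and a classifier that outputs a probability distribution over the seven classes.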

Topics

Emotion and Mood Recognition; Face Recognition and Analysis; Face and Expression Recognition

Identifiers

DOI: 10.5121/acii.2025.12401
