SKELETON-BASED HUMAN ACTION RECOGNITION USING SPATIO-TEMPORAL LATENT FEATURES WITH A GCN MODEL
Abstract
Owing to its resilience to visual noise and viewpoint variations, skeleton-based analysis has become a cornerstone of human action recognition research. Existing methods, however, often rely on single-stream skeletal representations, which fail to capture the full complexity of action features. This study introduces Latent Features for Human Action Recognition (LFHAR), a novel architecture designed to overcome this limitation by using diverse spatio-temporal latent representations for improved feature extraction. The approach applies graph-based transformations to the individual skeletal frames of a temporal sequence, then arranges the derived graph features into spatio-temporal matrices. Evaluation on standard datasets demonstrates the stability and invariance characteristics of the LFHAR architecture. The method achieves accuracy gains of 2.7% on the 60-class NTU-RGB+D dataset and 2.1% on the 120-class NTU-RGB+D dataset, confirming its effectiveness for human action recognition.
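The two-step pipeline described in the abstract (a graph-based transformation per frame, followed by arrangement into a spatio-temporal matrix) can be illustrated with a minimal sketch. The abstract does not specify the transformation, so this example assumes a single symmetric-normalized adjacency propagation step of the kind commonly used in GCNs; the function names, the toy five-joint chain, and the random input are all hypothetical and are not taken from LFHAR itself.

```python
import numpy as np


def normalized_adjacency(edges, num_joints):
    """Build D^{-1/2} (A + I) D^{-1/2} for an undirected skeleton graph.

    Assumption: this standard GCN normalization stands in for the
    unspecified graph-based transformation.
    """
    adj = np.eye(num_joints)  # self-loops keep each joint's own features
    for i, j in edges:
        adj[i, j] = 1.0
        adj[j, i] = 1.0
    deg_inv_sqrt = 1.0 / np.sqrt(adj.sum(axis=1))
    return adj * deg_inv_sqrt[:, None] * deg_inv_sqrt[None, :]


def spatio_temporal_matrix(sequence, edges):
    """Transform each frame on the graph, then stack frames row by row.

    sequence: array of shape (T, V, C) with T frames, V joints, C coordinates.
    Returns an array of shape (T, V * C): one row of graph features per frame.
    """
    num_frames, num_joints, num_channels = sequence.shape
    a_hat = normalized_adjacency(edges, num_joints)
    # Apply the same graph operator to every frame independently.
    per_frame = np.einsum("uv,tvc->tuc", a_hat, sequence)
    return per_frame.reshape(num_frames, num_joints * num_channels)


# Hypothetical toy skeleton: 5 joints connected in a chain, 8 frames, 3D joints.
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
rng = np.random.default_rng(0)
sequence = rng.standard_normal((8, 5, 3))
features = spatio_temporal_matrix(sequence, edges)
```

Each row of `features` holds the graph-transformed joints of one frame and the row order follows time, so the matrix can be passed to a downstream temporal model. The flattening layout (joints by coordinates within a row) is one possible arrangement chosen for illustration.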