Article

Human Pose Estimation and Skeleton-Based Action Recognition: A Systematic Review of 2D/3D Deep Learning Approaches

Kudratjon Zohirov, Karshi State Technical University, Department of Software and Hardware Support of Computer Systems, Karshi, Uzbekistan
Feruz Ruziboev, Tashkent University of Information Technologies, Department of Convergence of Digital Technologies, Tashkent, Uzbekistan
Sardor Boykobilov, Karshi State Technical University, Department of Software and Hardware Support of Computer Systems, Karshi, Uzbekistan
Mirjakhon Temirov, Tashkent University of Information Technologies, Department of Convergence of Digital Technologies, Tashkent, Uzbekistan
Kamoliddin Ablakulov, Karshi State Technical University, Department of Software and Hardware Support of Computer Systems, Karshi, Uzbekistan
Elbek Jabborov, Karshi State Technical University, Department of Information Systems and Technologies, Karshi, Uzbekistan

2026

ABI

Abstract

This article presents a systematic review of modern 2D and 3D deep learning approaches to Human Pose Estimation (HPE) and Skeleton-Based Action Recognition (SBAR). The review follows the SALSA methodology and analyzes works published in leading scientific databases. It compares approaches based on detecting human joint points in images and video sequences, reconstructing the skeletal structure, and modeling actions in the spatial and spatiotemporal (ST) domains. The article reviews 2D and 3D HPE model architectures, joint occlusion in multi-person scenes, real-time requirements, and the feasibility of deployment on embedded devices. It also analyzes the spatial, spatiotemporal, and graph-based features used in SBAR systems and their impact on computational complexity and energy efficiency. Model performance is compared using evaluation criteria such as MPJPE (Mean Per Joint Position Error), AP (Average Precision), RMSE (Root Mean Square Error), and Pearson correlation. The results show that skeleton-based approaches are effective solutions for real-time and resource-constrained systems. However, choosing the optimal model and features requires a trade-off between accuracy, computational complexity, and energy efficiency. This paper identifies promising architectures, features, and hardware adaptation strategies for practical applications of HPE and SBAR systems in real-world environments.
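
As a rough illustration of three of the evaluation criteria named in the abstract, the sketch below computes MPJPE, RMSE, and the Pearson correlation with the Python standard library only. It is not taken from the article: the function names, the input layout (one coordinate tuple per joint), and the sample values are assumptions made for illustration. AP is left out because it depends on a matching rule and threshold that the abstract does not specify.

```python
import math


def mpjpe(pred, target):
    """Mean Per Joint Position Error: the average Euclidean distance
    between predicted and reference joints (same unit as the inputs).
    Assumed layout: a list of joints, each a tuple of coordinates."""
    if not pred or len(pred) != len(target):
        raise ValueError("pred and target must be non-empty and equally long")
    return sum(math.dist(p, t) for p, t in zip(pred, target)) / len(pred)


def rmse(pred, target):
    """Root Mean Square Error over two equally long scalar sequences."""
    if not pred or len(pred) != len(target):
        raise ValueError("pred and target must be non-empty and equally long")
    squared = sum((p - t) ** 2 for p, t in zip(pred, target))
    return math.sqrt(squared / len(pred))


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    if len(x) < 2 or len(x) != len(y):
        raise ValueError("x and y must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    dev_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if dev_x == 0 or dev_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (dev_x * dev_y)


# Hypothetical two-joint 3D pose: the first joint is off by 5 units,
# the second matches exactly, so the mean error is 2.5.
predicted = [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)]
reference = [(3.0, 4.0, 0.0), (1.0, 1.0, 1.0)]
print(mpjpe(predicted, reference))
```

Lower MPJPE and RMSE indicate better agreement with the reference, while a Pearson value close to 1 indicates that predicted and reference signals vary together.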

Topics

Human Pose and Action Recognition; Human Motion and Animation; Hand Gesture Recognition Systems

Identifiers

DOI: 10.1109/wccct69960.2026.11549659

Citations and references

0 citations, 43 references used
