Article

An Efficient Uzbek Speaker Recognition System for Resource-Constrained Devices Using Compact Acoustic Features and Lightweight Deep Models

Parakhat Nurimov, Tashkent Institute of Irrigation and Agricultural Mechanization Engineers, National Research University, Tashkent, Uzbekistan

Narzillo Mamatov, Tashkent Institute of Irrigation and Agricultural Mechanization Engineers, National Research University, Tashkent, Uzbekistan

Engineering Technology & Applied Science Research · journal · 2026


Abstract

Speaker recognition systems have achieved strong performance, but many high-performing approaches remain computationally expensive and are therefore poorly suited to resource-constrained devices. This limitation matters particularly in low-resource settings, including Uzbek speech applications, where few practical lightweight solutions exist. This study presents an efficient closed-set, text-independent Uzbek speaker identification framework based on compact acoustic features and lightweight deep models. Two acoustic representations, MFCC-13 and Log-Mel-40, were evaluated with two lightweight convolutional architectures, Small CNN and Compact CNN. The systems were assessed on recognition accuracy, F1-score, parameter count, model size, and inference latency. The Log-Mel-40 + Compact CNN configuration achieved the best overall performance, with 96.44% accuracy and an F1-score of 0.8957, while maintaining a compact model size of 0.4606 MB and low inference latency. The findings indicate that practical Uzbek speaker recognition can be achieved on resource-constrained platforms through an appropriate combination of compact acoustic features and lightweight deep models.
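Parameter count and model size, two of the reported efficiency metrics, are closely linked, and the relationship can be sketched in a few lines. The example below is a minimal illustration only: the layer shapes, the 50-speaker output layer, the 32-bit weight assumption, and the 1 MB = 2^20 bytes convention are all hypothetical and are not taken from the study, whose Small CNN and Compact CNN architectures are not described in the abstract.

```python
def conv2d_params(c_in, c_out, k_h, k_w, bias=True):
    """Trainable parameters in a standard 2-D convolution layer."""
    return c_in * c_out * k_h * k_w + (c_out if bias else 0)


def dense_params(n_in, n_out, bias=True):
    """Trainable parameters in a fully connected layer."""
    return n_in * n_out + (n_out if bias else 0)


def size_mb(n_params, bytes_per_param=4, bytes_per_mb=1024 * 1024):
    """Approximate weight storage in MB.

    Assumes every parameter uses bytes_per_param bytes (4 for float32)
    and ignores file-format overhead.
    """
    return n_params * bytes_per_param / bytes_per_mb


# Hypothetical lightweight CNN: three 3x3 conv layers on a 1-channel
# feature map, global average pooling, then a 50-way classifier.
layer_params = [
    conv2d_params(1, 16, 3, 3),
    conv2d_params(16, 32, 3, 3),
    conv2d_params(32, 64, 3, 3),
    dense_params(64, 50),
]

total_params = sum(layer_params)
total_size_mb = size_mb(total_params)

print(f"parameters: {total_params}")
print(f"approx. size: {total_size_mb:.4f} MB")
```

Under these assumptions the storage cost scales linearly with the parameter count, so halving the bytes per weight (for example, 16-bit storage) would halve the size estimate without changing the architecture.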

Topics

Speech Recognition and Synthesis · Speech and Audio Processing · Music and Audio Processing

Identifiers

DOI: 10.48084/etasr.19226

Citations and references

Cited by 0 · 14 references
