Article

Speech-to-Text Models in Uzbek Language: Achievements and Limitations

Fayzullo Nazarov, Samarkand State University, Department of Artificial Intelligence and Information Systems, Samarkand, Uzbekistan
Akbar Soliev, Samarkand State University, Department of Artificial Intelligence and Information Systems, Samarkand, Uzbekistan
B. Eshtemirov, Samarkand State University, Department of Artificial Intelligence and Information Systems, Samarkand, Uzbekistan

2026

ABI

Abstract

In recent years, artificial intelligence and natural language processing (NLP) have substantially changed how people interact with computers. Nonetheless, building robust Automatic Speech Recognition (ASR) systems for low-resource, agglutinative languages such as Uzbek remains a major challenge. This paper surveys the current state of Uzbek ASR technologies, based on research published from 2020 to 2025. It identifies key constraints arising from the language's complex morphology, considerable dialectal variation, and the lack of large annotated datasets. We examine different architectures, from traditional DNN-HMM hybrids to newer End-to-End (E2E) models such as Transformers and Conformers. Comparative results show that E2E-Conformer architectures combined with specialised language models (UzLM) perform better, reaching a Word Error Rate (WER) of 13.9%. The results indicate that building larger open-source corpora, adopting Self-Supervised Learning, and applying multilingual transfer learning are the most promising directions for future progress in Uzbek speech technology.
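For readers unfamiliar with the metric, Word Error Rate is conventionally the word-level edit distance between a reference transcript and the recogniser output, divided by the number of reference words: WER = (S + D + I) / N. The following is a minimal sketch of that standard definition, not the evaluation code used in the paper. The function name `word_error_rate` and the whitespace tokenisation are assumptions made for illustration; the abstract does not state how text was normalised before scoring.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: words are separated by whitespace and compared exactly,
    with no case folding or punctuation stripping.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return prev[-1] / len(ref)
```

Under this definition, a WER of 13.9% corresponds to roughly 14 word-level errors per 100 reference words. Because Uzbek is agglutinative, a single wrong suffix counts as a full word error, which is one reason word-level scores for such languages can look harsher than character-level ones.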

Topics

Economic and Industrial Development; Advanced Computational Techniques in Science and Engineering; Language Acquisition and Education

Identifiers

DOI: 10.1109/smartindustrycon68821.2026.11492975

Citations and references

0 citations, 14 references
