
Speech-to-Text Models in Uzbek Language: Achievements and Limitations

Fayzullo Nazarov, Samarkand State University, Department of Artificial Intelligence and Information Systems, Samarkand, Uzbekistan
Akbar Soliev, Samarkand State University, Department of Artificial Intelligence and Information Systems, Samarkand, Uzbekistan
B. Eshtemirov, Samarkand State University, Department of Artificial Intelligence and Information Systems, Samarkand, Uzbekistan
2026
ABI

Abstract

Over the past few years, artificial intelligence and natural language processing (NLP) have substantially changed how people interact with computers. Nonetheless, building robust Automatic Speech Recognition (ASR) systems for low-resource, agglutinative languages such as Uzbek remains a major challenge. This paper surveys the current state of Uzbek ASR technology, based on research published from 2020 to 2025. It identifies key constraints arising from the language's complex morphology, considerable dialectal variation, and the lack of large annotated datasets. We examine a range of architectures, from traditional DNN-HMM hybrids to newer End-to-End (E2E) models such as Transformers and Conformers. Comparative results show that E2E Conformer architectures combined with specialised language models (UzLM) perform better, reaching a Word Error Rate (WER) of 13.9%. The results indicate that building larger open-source corpora, adopting Self-Supervised Learning, and applying multilingual transfer learning are the most promising directions for future progress in Uzbek speech technology.
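
For readers unfamiliar with the metric, WER is conventionally computed as the word-level edit distance (substitutions, deletions, and insertions) between a reference transcript and the recogniser's output, divided by the number of reference words. The sketch below illustrates that standard definition only; the function name `word_error_rate` and the sample sentences are made up for illustration and are not taken from the paper or its evaluation setup.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Illustrative sentences only: a dropped case suffix counts as a
    # full word error, which is one reason agglutinative languages
    # tend to be penalised heavily by WER.
    print(word_error_rate("men kitobni o'qidim", "men kitob o'qidim"))
```

Because every mismatched word form counts as a whole error, a single wrong suffix in a morphologically rich language costs as much as a completely wrong word.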
