Article

Computational Model of Morphology and Stemming of Uzbek Words on Complete Set of Endings

Ualsher Tukeyev, Al-Farabi Kazakh National University, Information systems department, Almaty, Kazakhstan
Nargiza Gabdullina, Al-Farabi Kazakh National University, Information systems department, Almaty, Kazakhstan
Nazerke Karipbayeva, Al-Farabi Kazakh National University, Information systems department, Almaty, Kazakhstan
Nilufar Abdurakhmonova, National University of Uzbekistan, Department of Computational and Applied Linguistics, Tashkent, Uzbekistan
Tolganay Balabekova, Al-Farabi Kazakh National University, Information systems department, Almaty, Kazakhstan
Aidana Karibayeva, Al-Farabi Kazakh National University, Information systems department, Almaty, Kazakhstan
2024 · en
ABI

Abstract

Uzbek belongs to the Turkic language family and is a low-resource language, so expanding the linguistic and electronic resources available for it is essential. Many natural language processing (NLP) tasks, such as stemming, segmentation, and morphological analysis, require a set of endings together with dictionaries of stems and stop words. This article presents a complete set of Uzbek endings and a dictionary of stems and stop words. The endings were collected for the two main parts of speech, the noun and the verb. The dictionary of verb endings includes all possible combinations of tenses, voices, moods, and participles. Using the collected linguistic resources, stemming programs for Uzbek texts were tested, problems were identified from the experimental results, and the program was revised accordingly. Experiments with the developed Uzbek linguistic resources showed an average accuracy of 94.18%.
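The idea of stemming against a complete set of endings can be illustrated with a minimal Python sketch. It is not the authors' program or data: the affix lists are a small illustrative subset of Uzbek noun suffixes (plural, possessive, case), the stem and stop-word sets are hypothetical stand-ins for the dictionaries described in the abstract, and vowel-final allomorphs and verb endings are ignored. Every suffix combination is enumerated in advance as a single ending, and the longest ending that leaves a known stem is stripped.

```python
from itertools import product

# Toy affix inventories: an illustrative subset, NOT the paper's resource.
PLURAL = ["", "lar"]
POSSESSIVE = ["", "im", "ing", "i", "imiz", "ingiz"]
CASE = ["", "ni", "ning", "ga", "da", "dan"]

# Hypothetical mini-dictionaries standing in for the stem and stop-word lists.
STEMS = {"kitob", "maktab", "bola"}
STOP_WORDS = {"va", "bilan", "uchun"}


def build_endings():
    """Enumerate each plural + possessive + case chain as one ending string."""
    endings = {p + s + c for p, s, c in product(PLURAL, POSSESSIVE, CASE)}
    endings.discard("")  # the empty chain is not an ending
    # Longest first, so the whole suffix chain is tried before its tail.
    return sorted(endings, key=len, reverse=True)


ENDINGS = build_endings()


def stem(word):
    """Return the dictionary stem of `word`, or the word itself if none is found."""
    w = word.lower()
    if w in STOP_WORDS or w in STEMS:
        return w
    for ending in ENDINGS:
        if w.endswith(ending):
            candidate = w[: -len(ending)]
            if candidate in STEMS:  # accept only stems the dictionary knows
                return candidate
    return w


if __name__ == "__main__":
    for token in ["kitoblarimizdan", "maktabga", "bolalarni", "va"]:
        print(token, "->", stem(token))
```

Checking each candidate against the stem dictionary keeps the greedy match from over-stripping words whose final letters merely resemble an ending; a word with no dictionary match is returned unchanged.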
