Computational Model of Morphology and Stemming of Uzbek Words on Complete Set of Endings
Abstract
Uzbek belongs to the Turkic language family and is a low-resource language, so expanding the linguistic and electronic resources available for it is essential. Many natural language processing (NLP) tasks, such as stemming, segmentation, and morphological analysis, require a set of endings together with dictionaries of stems and stop words. This article presents a complete set of Uzbek endings and a dictionary of stems and stop words. The endings were collected for the two main parts of speech, the noun and the verb. The dictionary of verb endings includes all possible combinations of tenses, voices, moods, and participles. Using the collected linguistic resources, stemming programs for Uzbek texts were tested, problems were identified from the experimental results, and the program was revised accordingly. In the experiments, stemming with the developed Uzbek linguistic resources achieved an average accuracy of 94.18%.
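To illustrate how a set of endings, a stem dictionary, and a stop-word list can work together in stemming, here is a minimal sketch. It is not the article's program: the names `ENDINGS`, `STEMS`, `STOP_WORDS`, `stem`, and `stem_text` are hypothetical, the three word lists are tiny toy samples rather than the collected resources, and the longest-ending-first stripping with a dictionary check is only one plausible strategy.

```python
# Toy resources (assumption): a few common Uzbek noun endings, stems, and stop words.
ENDINGS = ["lar", "ni", "ning", "ga", "da", "dan"]
STEMS = {"kitob", "maktab"}
STOP_WORDS = {"va", "bilan", "uchun"}


def stem(word, endings=ENDINGS, stems=STEMS):
    """Strip endings from the right until a known stem is reached.

    Returns the word unchanged (lowercased) if no known stem is found.
    """
    word = word.lower()
    if word in stems:
        return word
    # Try longer endings first so that "ning" is preferred over shorter ones.
    for ending in sorted(endings, key=len, reverse=True):
        if word.endswith(ending) and len(word) > len(ending):
            candidate = stem(word[: -len(ending)], endings, stems)
            # Accept the result only if it is in the stem dictionary,
            # which guards against over-stemming unknown words.
            if candidate in stems:
                return candidate
    return word


def stem_text(text):
    """Split on whitespace, drop stop words, and stem the remaining tokens."""
    return [stem(token) for token in text.lower().split() if token not in STOP_WORDS]


if __name__ == "__main__":
    print(stem_text("kitoblarni va maktabda"))
```

The dictionary check means a word such as "banda" is left intact in this sketch: stripping "da" yields "ban", which is not a listed stem, so the ending is not removed.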