UzSentiment: Sentiment Analysis Methods for Uzbek Texts
Abstract
This article examines modern methods of sentiment analysis for Uzbek texts and the practical experience of applying them. The study compares two main approaches, Naive Bayes and Support Vector Machine (SVM), both theoretically and practically, using the UzSentiment corpus of 100,000 texts labelled as positive or negative. The article begins with a literature review of existing international and Uzbek-language research on sentiment analysis, highlighting the key challenges faced by low-resource and morphologically rich languages. The selected models are then described in detail with their mathematical formulations (Bayesian probability for Naive Bayes, margin optimisation for SVM). Both models are trained on the UzSentiment corpus and evaluated using Accuracy, Precision, Recall, and F1-score, and the results are compared in tables. The findings show that SVM achieves higher accuracy than Naive Bayes, while both approaches remain computationally efficient. The strengths and weaknesses of each method, optimisation opportunities for the Uzbek language, and practical application scenarios are also examined. The study is limited to classical machine learning approaches; deep learning models are beyond its scope.
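For readers unfamiliar with the four evaluation measures named above, the following minimal sketch shows how they can be computed for a two-class task. It is an illustration under stated assumptions, not the evaluation code used in the study: the function name `binary_metrics`, the label strings `"positive"` and `"negative"`, and the eight toy labels are all hypothetical and are not taken from the UzSentiment corpus.

```python
from typing import Dict, Sequence


def binary_metrics(
    y_true: Sequence[str],
    y_pred: Sequence[str],
    positive: str = "positive",  # assumed label string, for illustration only
) -> Dict[str, float]:
    """Compute Accuracy, Precision, Recall and F1 for a two-class task."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one example is required")

    pairs = list(zip(y_true, y_pred))
    # Confusion-matrix counts with respect to the positive class.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Fall back to 0.0 when a denominator is zero (a common convention).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Toy gold labels and predictions (hypothetical, not corpus data).
gold = ["positive", "positive", "positive", "negative",
        "negative", "negative", "negative", "positive"]
pred = ["positive", "positive", "negative", "negative",
        "negative", "positive", "negative", "positive"]

scores = binary_metrics(gold, pred)
print(scores)
```

In this toy case there are three true positives, one false positive, one false negative and three true negatives, so all four measures equal 0.75. On a real corpus the measures usually differ, which is why they are reported side by side.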