Article

Oghuz Dialect Analysis of the Uzbek Language: Methodological Approach and Experimental Study

Saidbek P. Babayazov (Cyber University, Nurafshon, Uzbekistan)
Sh. Kh. Ismoilov (Urgench Innovation University, Urgench, Uzbekistan)
Obidjan Madaminov (Urgench State University, Urgench, Uzbekistan)
Umidbek P. Babayazov (Urgench State University, Urgench, Uzbekistan)
Nafisa Ruzimova
Dilfuza Xajiyeva (Urgench State Pedagogical Institute, Urgench, Uzbekistan)
2025
ABI

Abstract

This study presents a detector for the Oghuz dialect of Uzbek that is simple to implement yet effective. The method relies only on standard natural language preprocessing: normalization, tokenization, and spelling-aware regular expressions. A carefully selected inventory of diagnostic features (euklama enhancers, connectors, and auxiliary particles) drives the text analysis. Each text is scored by dividing the total number of pattern matches by the number of tokens, and a single adjustable threshold on that score separates dialectal from standard Uzbek. On stratified development data, the rule-based system separates the two classes well at a practical operating point, with high precision and recall. Adding a TF-IDF + logistic regression layer gives a small boost in edge cases while preserving interpretability and low computational cost. A detailed error analysis identifies the main error types (interscript variation, colloquial overlaps, and the handling of multi-word/clitic fragments) and motivates targeted corrections to normalization, matching constraints, and multi-word unit (MWU) rules. Beyond classification, the inventory supports corpus formation and training by providing pattern-based diagnostics, which allows gradual refinement and, where needed, major updates to context coders.
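
The scoring rule described in the abstract (pattern matches divided by token count, compared against one threshold) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two regular expressions in `PATTERNS`, the apostrophe-folding table, the tokenizer, and the default threshold of 0.1 are hypothetical placeholders, not the authors' feature inventory or tuned operating point.

```python
import re
import unicodedata

# Hypothetical placeholder patterns for illustration only.
# They are NOT the diagnostic inventory used in the paper.
PATTERNS = [
    re.compile(r"\bgal\w*"),
    re.compile(r"\bdil\w*"),
]

# Fold common apostrophe variants used in Latin-script Uzbek into a plain "'"
# (an assumed normalization step; the paper's exact rules are not given).
APOSTROPHES = str.maketrans({"ʻ": "'", "ʼ": "'", "’": "'", "‘": "'", "`": "'"})

# A token is a run of letters, optionally continued across apostrophes.
TOKEN_RE = re.compile(r"[^\W\d_]+(?:'[^\W\d_]+)*")


def normalize(text):
    """Apply Unicode NFC, lowercase, and fold apostrophe variants."""
    return unicodedata.normalize("NFC", text).lower().translate(APOSTROPHES)


def tokenize(text):
    """Split already-normalized text into word tokens."""
    return TOKEN_RE.findall(text)


def dialect_score(text):
    """Return total pattern matches divided by the number of tokens."""
    norm = normalize(text)
    tokens = tokenize(norm)
    if not tokens:
        return 0.0  # avoid division by zero on empty input
    matches = sum(len(p.findall(norm)) for p in PATTERNS)
    return matches / len(tokens)


def is_dialectal(text, threshold=0.1):
    """Flag a text as dialectal when its score reaches the threshold."""
    return dialect_score(text) >= threshold
```

Because the score is a ratio rather than a raw count, long and short texts are compared on the same scale, and the single `threshold` argument is the only value that would need tuning on development data.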
