Article

Rule-Based Syntactic Analysis for Uzbek Language: An Alternative Approach to Overcome Data Scarcity and Enhance Interpretability

Davlatyor Mengliev, Novosibirsk State University, IT Department, Novosibirsk, Russia
Vladimir Barakhnin, Novosibirsk State University, IT Department, Novosibirsk, Russia
Bahodir Ibragimov, Novosibirsk State University, Department of Mathematics and Mechanics, Novosibirsk, Russia

2023 · en

ABI

Abstract

This paper introduces a rule-based syntactic analysis algorithm tailored to the Uzbek language, designed to overcome the shortage of data that machine learning approaches commonly face. Drawing on the characteristics of Uzbek grammar, the study establishes a rule set for parsing sentences in this low-resource language. It covers tokenization and the construction of dependency trees, followed by optimization and testing on diverse Uzbek texts. Although it does not use machine learning, the approach remains relevant because it sidesteps data scarcity and offers a transparent, interpretable system with faster development, lower computational resource requirements, and greater resilience to noise and errors in data. The paper examines Uzbek grammar and syntactic features and presents a set of parsing rules. It also reviews related work, outlines the development of the proposed algorithm, and discusses the future potential of Natural Language Processing techniques for low-resource languages such as Uzbek.

Topics

Natural Language Processing Techniques; Translation Studies and Practices

Identifiers

DOI: 10.1109/edm58354.2023.10225235

Citations and references

Citations: 12 · References used: 9
