Article

Semantic Tagging of Uzbek and Kirgiz Language Sentences in UMR Format

Eşref Adalı, Istanbul Technical University, Computer Eng. and Informatics Faculty, Istanbul, Türkiye
Khamroeva Shahlo Mirdjonovna, Tashkent State University of Uzbek Language and Literature Named Alisher Navo’i, Dept. of Computational Linguistics and Digital Technologies, Tashkent, Uzbekistan
Abuzalova Mehriniso Kadirovna, Bukhara State University, Department of Uzbek Linguistics and Journalism, Bukhara, Uzbekistan
Bermet Chontaeva, University of Tübingen, Department of General and Computational Linguistics, Tübingen, Germany

2025


Abstract

This paper explores the semantic tagging of Uzbek and Kirgiz language units using the Universal Meaning Representation (UMR) framework. UMR is a cross-linguistic semantic representation scheme designed to capture the meaning of utterances in a language-independent manner. The study focuses on adapting this framework to the specific features of the Uzbek language, including its agglutinative morphology, relatively free word order, and complex syntactic structures. The paper outlines the principles of annotating lexical and syntactic elements of Uzbek and Kirgiz, the development of a tagset tailored to their linguistic characteristics, and the process of semantic annotation. Particular attention is given to challenges such as disambiguation, syntax-semantics alignment, and semi-automated annotation workflows. The findings contribute to the creation of computational resources for the Uzbek and Kirgiz languages and have potential applications in machine translation, information extraction, and the development of semantic ontologies.

Topics

Natural Language Processing Techniques

Identifiers

DOI: 10.1109/ubmk67458.2025.11206912

Citations and references

Cited by 04 references
