Article

Stages of Creating an Uzbek-English Parallel Corpus and Principles of Selecting a Linguistic Base

Elov Botir Boltayevich, Ruhillo Alaev Habibovich (National University of Uzbekistan Named After Mirzo Ulugbek, Tashkent, Uzbekistan); Marufjon Amirkulov Alikulovich, Sabohat Kenjayeva Eshmamatovna (Karshi State University, Karshi, Uzbekistan); Jamshid Elov Bekmurodovich (Tashkent Information Technology University, Tashkent, Uzbekistan)
2025
ABI

Abstract

This paper is a conceptual study that explores the fundamental stages of creating an Uzbek-English parallel corpus, with special emphasis on the linguistic and methodological principles of selecting the base texts. The study identifies and reviews criteria for the inclusion of texts, such as genre diversity, representativeness, alignment accuracy, and linguistic relevance. Particular attention is given to balancing modern and classical texts, as well as to the role of technological tools in achieving consistent sentence-level alignment. The pipeline and recommendations presented in this paper are based on a synthesis of existing research and are proposed as a guideline for corpus developers aiming to construct a reliable and research-oriented bilingual resource.
