Development of a Modern Corpus of Computational Linguistics
Abstract
This article addresses theoretical and practical issues that arose during the creation of a corpus for the Uzbek language. Foreign experience and existing software were examined in the course of creating the corpus. The modern corpus of the Uzbek language will be designed and developed as a balanced, large-scale, and universal corpus. Theoretical and practical methods were studied before work on the corpus began. Throughout the process, different software tools were used to solve specific problems, and the resulting corpus will be an open resource for non-commercial use. The article describes the initial stages of structuring the corpus and the requirements for creating a modern corpus.