Article

Development of a Modern Corpus of Computational Linguistics

Mukhammadsolikh Tursunov, Samarkand Branch of Tashkent University of Information Technologies, Samarkand, Uzbekistan; A. B. Karshiev, Samarkand Branch of Tashkent University of Information Technologies, Samarkand, Uzbekistan; Suyun Karimov, Samarkand State University, University blv. 15, Samarkand, Uzbekistan

2020 International Conference on Information Science and Communications Technologies (ICISCT) · conference · 2020 · en

Abstract

This article addresses theoretical and practical issues that arose during the creation of a corpus for the Uzbek language. Foreign experience and software were investigated while creating the corpus. The modern corpus of the Uzbek language will be designed and developed as a balanced, large-scale, and universal corpus. Theoretical and practical methods were studied before the corpus was created. Throughout the process, different software tools were used to solve specific problems, and the created corpus will be open source for non-commercial use. The article describes the initial stages of the corpus structure and the requirements for creating a modern corpus.

Topics

Natural Language Processing Techniques; Translation Studies and Practices; Lexicography and Language Studies

Identifiers

DOI: 10.1109/icisct50599.2020.9351376

Citations and References

Citations: 3 · References used: 17