Tagging the Corpus of Fitrat by Lexical Layer
Abstract
As computer technology develops, the demand for information grows with it. Corpus linguistics offers tools for collecting, processing, and using information, including authors' works and literary heritage, in electronic form. In linguistics worldwide, the corpus-based study of individual authors' works is gaining momentum. An author corpus of Abdurauf Fitrat likewise lets users retrieve information quickly and use it easily and effectively. This article examines the lexical features of the language of Fitrat's works, identifies the vocabulary belonging to the native (own) and borrowed (acquired) layers, and describes how that vocabulary was tagged. It also highlights the difficulties that arose while tagging the corpus texts and the linguistic peculiarities that were identified. Using the corpus, the number of lemmas in the author's scientific and literary works and the layer to which each lemma belongs were determined, and the corresponding statistics were calculated. The shares of native-layer and borrowed-layer lemmas in the scientific and literary works were then compared. Descriptive, historical-comparative, statistical, and automatic analysis methods were used in the research. The results are presented in tables and diagrams.
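The layer statistics described above can be illustrated with a minimal sketch. It assumes each corpus token has already been tagged with a lemma, a lexical layer (own, i.e. native, or acquired, i.e. borrowed), and a genre. The tuple layout, the function name `layer_shares`, and the placeholder lemmas are all hypothetical and do not come from the Fitrat corpus or its tagging tools.

```python
from collections import defaultdict

# Hypothetical tagged tokens as (lemma, layer, genre) tuples.
# The lemmas are placeholders, not real corpus data.
TAGGED_TOKENS = [
    ("lemma_a", "own", "scientific"),
    ("lemma_a", "own", "scientific"),       # repeated token, same lemma
    ("lemma_b", "acquired", "scientific"),
    ("lemma_c", "own", "artistic"),
    ("lemma_d", "acquired", "artistic"),
    ("lemma_e", "acquired", "artistic"),
    ("lemma_e", "acquired", "artistic"),    # repeated token, same lemma
]


def layer_shares(tokens):
    """Return {genre: {layer: share of distinct lemmas in that genre}}."""
    # Collect the set of distinct lemmas for every (genre, layer) pair.
    lemmas = defaultdict(lambda: defaultdict(set))
    for lemma, layer, genre in tokens:
        lemmas[genre][layer].add(lemma)

    shares = {}
    for genre, by_layer in lemmas.items():
        total = sum(len(found) for found in by_layer.values())
        shares[genre] = {
            layer: len(found) / total for layer, found in by_layer.items()
        }
    return shares


SHARES = layer_shares(TAGGED_TOKENS)

if __name__ == "__main__":
    for genre, by_layer in SHARES.items():
        for layer, share in sorted(by_layer.items()):
            print(f"{genre:<12} {layer:<9} {share:.1%}")
```

The sketch counts distinct lemmas rather than token occurrences, so a word repeated many times in one genre contributes once to that genre's share. Counting occurrences instead would be an equally plausible reading of the statistics and would only require replacing the sets with counters.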