Article

Extraction and Data Analysis Basis Words: Case Study on School Corpus

Khabibulla Madatov, Urgench State University Named After Abu Rayhan Biruni, The Departments of Computer Science, Urgench, Khorezm, Uzbekistan
Surayyo Khajibaeva, Urgench State University Named After Abu Rayhan Biruni, The Departments of Computer Science, Urgench, Khorezm, Uzbekistan
2025
ABI

Abstract

This study analyses a large text corpus built from 142 textbooks written for school education in Uzbekistan. The proposed method, Basis Word Extraction Using Synonym Thesaurus Support, is developed and applied to each grade separately. Its main idea is to extract basis words from the lemma set of each grade using a synonym database. The corpus was studied in three blocks: the Primary School Corpus (grades 1–4), the Basic Secondary School Corpus (grades 5–9), and the Secondary School Corpus (grades 10–11). For each grade, the method extracted basis words that differ from the general corpus, as well as new basis words that do not occur in earlier grades and are specific to that grade. With this method, 17,599 basis words were extracted from the Uzbek Primary School Corpus, 47,203 from the Uzbek Basic Secondary School Corpus, and 20,491 from the Uzbek Secondary School Corpus. The method enables analysis of the lexical complexity and grade-specific vocabulary of texts intended for schoolchildren.
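The abstract does not spell out the extraction procedure, so the following is only a minimal sketch of one plausible reading: each lemma is mapped to a single representative of its synonym group, and a grade's "new" basis words are those not already seen in earlier grades. The function names, the choice of the first group member as representative, and the toy Uzbek word lists are all assumptions made for illustration, not the authors' implementation or data.

```python
from typing import Dict, Iterable, Set


def build_canonical_map(synonym_groups: Iterable[Iterable[str]]) -> Dict[str, str]:
    """Map every word in a synonym group to one representative.

    Assumption: the first word listed in a group acts as the basis word.
    """
    canonical: Dict[str, str] = {}
    for group in synonym_groups:
        members = list(group)
        if not members:
            continue
        head = members[0]
        for word in members:
            # Keep the first mapping if a word appears in several groups.
            canonical.setdefault(word, head)
    return canonical


def extract_basis_words(lemmas: Iterable[str], canonical: Dict[str, str]) -> Set[str]:
    """Collapse synonyms; lemmas missing from the thesaurus stand for themselves."""
    return {canonical.get(lemma, lemma) for lemma in lemmas}


def new_basis_words_by_grade(
    grade_lemmas: Dict[int, Set[str]], canonical: Dict[str, str]
) -> Dict[int, Set[str]]:
    """For each grade in ascending order, keep basis words unseen in earlier grades."""
    seen: Set[str] = set()
    result: Dict[int, Set[str]] = {}
    for grade in sorted(grade_lemmas):
        basis = extract_basis_words(grade_lemmas[grade], canonical)
        result[grade] = basis - seen
        seen |= basis
    return result


# Toy data (hypothetical): one synonym group and two grades.
canonical = build_canonical_map([["katta", "ulkan", "yirik"]])
grade_lemmas = {
    1: {"katta", "uy"},
    2: {"ulkan", "kitob", "uy"},
}
new_words = new_basis_words_by_grade(grade_lemmas, canonical)
print(new_words[2])
```

In this toy run, "ulkan" in grade 2 collapses to the representative "katta", which grade 1 already introduced, so only "kitob" counts as new for grade 2.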
