Article

Identification of Named Entities from Uzbek Historical Texts: A Multilingual BERT Approach

Nodirbek Boltayev, Urgench State University, Urgench, Uzbekistan
Munisa Nayimova, Bukhara State University, Bukhara, Uzbekistan
Н. П. Абубакирова
Alisher Abidjanov, Cyber University, Nurafshon, Uzbekistan
Kamolova Madina, Samarkand State Institute of Foreign Languages, Samarkand, Uzbekistan
Sevara Berdimurotova, Termez University of Economics and Service, Termez, Uzbekistan
2025
ABI

Abstract

This paper presents an algorithm for recognizing named entities in Uzbek historical texts dating from 1928 to 1940. To accomplish the task, we trained the Multilingual BERT deep learning model on a custom dataset of 5,500 sentences, each annotated using the BIOES scheme. We chose this scheme because it is one of the most widely used annotation schemes for named entity recognition. Organizations, persons, and locations were selected as the categories of named entities. The model was trained with early stopping, which allowed us to select the weights with the best metric value, obtained at the 11th training epoch. For an objective assessment, we tested the model on historical texts covering various topics and on modern Uzbek texts. The results confirmed the model's high effectiveness on historical data and revealed a significant decrease in accuracy on modern texts.
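As an illustration of the BIOES scheme mentioned in the abstract, the following minimal sketch converts entity spans into per-token tags (B = begin, I = inside, E = end, S = single-token entity, O = outside). The function name `spans_to_bioes`, the label strings `PER`, `ORG`, and `LOC`, and the example sentence are assumptions made for illustration; they are not taken from the paper's dataset or code.

```python
from typing import List, Tuple

# A span is (start token index, end token index exclusive, label).
# The label names PER / ORG / LOC are hypothetical stand-ins for the
# paper's person, organization, and location categories.
Span = Tuple[int, int, str]


def spans_to_bioes(n_tokens: int, spans: List[Span]) -> List[str]:
    """Return one BIOES tag per token for non-overlapping entity spans."""
    tags = ["O"] * n_tokens
    for start, end, label in spans:
        if not (0 <= start < end <= n_tokens):
            raise ValueError(f"span out of range: {(start, end, label)}")
        if end - start == 1:
            # A single-token entity gets the S- prefix.
            tags[start] = f"S-{label}"
        else:
            # Multi-token entity: B- on the first token, E- on the last,
            # I- on everything in between.
            tags[start] = f"B-{label}"
            for i in range(start + 1, end - 1):
                tags[i] = f"I-{label}"
            tags[end - 1] = f"E-{label}"
    return tags


# Hypothetical example sentence, not drawn from the paper's corpus.
tokens = ["Fayzulla", "Xojayev", "Buxoroda", "ishlagan"]
spans = [(0, 2, "PER"), (2, 3, "LOC")]
for token, tag in zip(tokens, spans_to_bioes(len(tokens), spans)):
    print(token, tag)
```

Compared with plain BIO tagging, the explicit E- and S- tags mark entity boundaries on both sides, which is one common reason the scheme is used for sequence labeling.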
