Article

UzRoberta: A pre-trained language model for Uzbek

Rifkat Davronov, V.I. Romanovskiy Institute of Mathematics, Uzbekistan Academy of Sciences, 9, University str., Tashkent 100174, Uzbekistan

Fatima Adilova, V.I. Romanovskiy Institute of Mathematics, Uzbekistan Academy of Sciences, 9, University str., Tashkent 100174, Uzbekistan

AIP Conference Proceedings · journal · 2024 · en

Abstract

We introduce UzLUE, the Uzbek Language Understanding Evaluation benchmark. UzLUE is a benchmark for Uzbek natural language understanding (NLU) that includes message classification. We build the tasks from scratch using a diverse source corpus while respecting copyright, so that the benchmark is fully accessible to everyone. We use UzLUE-RoBERTa, a pre-trained language model (PLM), to support replication of the baseline models on UzLUE and to facilitate future research. We found that UzLUE-RoBERTa-base [1] outperforms other baselines, including multilingual PLMs.

Topics

Natural Language Processing Techniques; Topic Modeling; Text and Document Classification Technologies

Identifiers

DOI: 10.1063/5.0199871

Citations and references

Cited by 2 · 5 references
