Article

Creating a tokenization algorithm based on the knowledge base for the Uzbek language

Ilkhom Bakaev, Bukhara State University, Department of “Information Technologies”, Bukhara, Uzbekistan

2022 · en


Abstract

Correctly selecting tokens from incoming text is an important problem in areas such as machine translation, information retrieval, information extraction, and information security. The process of extracting tokens from text is called tokenization. This study develops a tokenization algorithm that uses a knowledge base to extract lexemes from text.
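
The abstract does not describe the algorithm itself, so the following is only a minimal sketch of what knowledge-base-driven tokenization can look like, not the author's method. It assumes a hypothetical knowledge base consisting of a small set of Uzbek stems (`STEMS`) and suffixes (`SUFFIXES`) invented for illustration; the function names `split_words`, `lexeme`, and `tokenize` are likewise assumptions.

```python
import re

# Hypothetical knowledge base (illustrative only, not taken from the paper):
# a few Uzbek stems and a few common suffixes.
STEMS = {"kitob", "maktab", "bola"}
SUFFIXES = ["ning", "lar", "ni", "ga", "da"]


def split_words(text):
    """Split text into lowercase word tokens.

    The apostrophe variants are kept inside words because Uzbek Latin
    script uses them in letters such as o' and g'.
    """
    return re.findall(r"[a-zʻʼ']+", text.lower())


def lexeme(word):
    """Strip known suffixes from the right until a known stem remains.

    Returns the stem, or None if the knowledge base cannot explain the word.
    """
    if word in STEMS:
        return word
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) > len(suffix):
            found = lexeme(word[: -len(suffix)])
            if found is not None:
                return found
    return None


def tokenize(text):
    """Pair each surface token with the lexeme found in the knowledge base."""
    return [(word, lexeme(word)) for word in split_words(text)]


if __name__ == "__main__":
    print(tokenize("Kitoblarni maktabga"))
```

In this sketch a word the knowledge base cannot explain is paired with `None` rather than guessed at, so gaps in the knowledge base stay visible to the caller.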

Topics

Natural Language Processing Techniques; Translation Studies and Practices; Topic Modeling

Identifiers

DOI: 10.1109/icisct55600.2022.10146893

Citations and references

Citations: 1 · References used: 9
