Eurasian Journal of Mathematical Theory and Computer Sciences
2023, en
ABI
Abstract
This article examines the main methods of text data processing: lemmatization, tokenization, and stemming. These methods are used to normalize and prepare text for analysis and machine learning. The algorithms and approaches for implementing each method are described, and their advantages and disadvantages are analyzed. The research results guide the selection of an appropriate method based on the task and the characteristics of the text being processed.
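To make the three operations concrete, the following is a minimal Python sketch, not the article's own implementation. The regular expression, the suffix list, and the tiny lemma dictionary are illustrative assumptions; a practical system would typically use an established stemming rule set and a full morphological dictionary instead.

```python
import re

def tokenize(text):
    """Split text into lowercase word tokens.

    Assumption: a simple regex is enough here; real tokenizers handle
    punctuation, hyphenation, and language-specific rules more carefully.
    """
    return re.findall(r"[a-z0-9]+(?:'[a-z]+)?", text.lower())

# Hypothetical suffix list for illustration only (not the Porter rule set).
SUFFIXES = ("ing", "ed", "es", "s")

def stem(token):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token

# Hypothetical lookup table standing in for a real lemma dictionary.
LEMMAS = {"was": "be", "mice": "mouse", "running": "run", "studies": "study"}

def lemmatize(token):
    """Map a token to its dictionary form, or return it unchanged."""
    return LEMMAS.get(token, token)

if __name__ == "__main__":
    for tok in tokenize("The mice were running."):
        print(tok, stem(tok), lemmatize(tok))
```

The sketch shows the trade-off the abstract refers to: the stemmer is fast and needs no vocabulary but can produce non-words (for example, "running" becomes "runn"), while the lemmatizer returns valid dictionary forms only for words its table covers.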
Citations and references
Cited by 40 references