On a Method for Text Data Classification
Abstract
This article examines the problem of converting text data into numerical vectors using the bag-of-words (BoW), TF-IDF, and Word2Vec methods and then classifying the resulting vectors. Several mathematical methods can be used for the classification step, including k-nearest neighbors, support vector machines (SVM), Apollonius spheres, and the angle between vectors. In this work, text data were converted into numerical vectors using Word2Vec, and classification was performed by computing the angle between vectors. The results obtained with these methods were compared to identify the most appropriate approach to text data processing, and the advantages and disadvantages of each method for computer-based text processing were examined.
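As an illustration of classification by the angle between vectors, the following is a minimal sketch and not the authors' implementation. It assumes that a document vector is the mean of its word vectors and that each class is represented by a single reference vector; the toy three-dimensional vectors stand in for real Word2Vec embeddings, and the names `angle_between`, `mean_vector`, and `classify` are hypothetical.

```python
import math


def angle_between(u, v):
    """Return the angle in radians between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("angle is undefined for a zero vector")
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cos_theta = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.acos(cos_theta)


def mean_vector(vectors):
    """Average a list of equal-length vectors component by component."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def classify(doc_vector, class_vectors):
    """Return the label whose reference vector forms the smallest angle."""
    return min(
        class_vectors,
        key=lambda label: angle_between(doc_vector, class_vectors[label]),
    )


# Toy embeddings (assumption): stand-ins for vectors learned by Word2Vec.
word_vectors = {
    "goal": [0.9, 0.1, 0.0],
    "match": [0.8, 0.2, 0.1],
    "vote": [0.1, 0.9, 0.0],
    "party": [0.2, 0.8, 0.1],
}

# One reference vector per class (assumption): the mean of typical words.
class_vectors = {
    "sports": mean_vector([word_vectors["goal"], word_vectors["match"]]),
    "politics": mean_vector([word_vectors["vote"], word_vectors["party"]]),
}

# A document is represented by the mean of its word vectors (assumption).
doc_vector = mean_vector([word_vectors[w] for w in ["goal", "match", "party"]])

print(classify(doc_vector, class_vectors))  # prints: sports
```

Because the angle depends only on direction, documents of different lengths can be compared without additional normalization, which is one reason this measure is commonly paired with word embeddings.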