The Algorithm of Uzbek Text Summarizer
Abstract
Researchers extend their knowledge by finding information relevant to their field in daily news. To analyze such text data, a researcher first has to summarize it. Automatic text summarization is one of the main problems of NLP (Natural Language Processing). Two types of approaches to automatic summarization are recognized: the first (abstractive) expresses the content of the text in new, equivalent wording; the second (extractive) identifies the important sentences within the text and assembles the summary from them. This article presents a text summarization model and an approach based on the TF-IDF (Term Frequency-Inverse Document Frequency) algorithm, which belongs to the second approach, for automatically summarizing texts in Uzbek. The significant part of a given Uzbek text is identified through its unique words, and a normalized weight is calculated for each sentence from this part of the text. Sentences are then selected by weight according to the introduced criterion, and the n-gram model is used.
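The extractive TF-IDF idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the exact formulas, so the sketch assumes that each sentence is treated as a "document" for the IDF term, that a sentence's weight is the sum of its TF-IDF scores divided by its number of unique words, and that the top k sentences are kept in their original order. The function names (split_sentences, tokenize, sentence_weights, summarize) and the parameter k are hypothetical, and the n-gram step is omitted because the abstract does not specify how it is applied.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    # Naive splitter (assumption): break after ., ! or ? followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    # Lowercased word tokens; apostrophe variants stay inside words (o'zbek).
    return re.findall(r"[^\W\d_]+(?:['ʻʼ’][^\W\d_]+)*", sentence.lower())


def sentence_weights(sentences):
    # Assumption: each sentence plays the role of a document for IDF.
    docs = [tokenize(s) for s in sentences]
    n = len(docs)
    # Document frequency: in how many sentences each unique word occurs.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        if not doc:
            weights.append(0.0)
            continue
        tf = Counter(doc)
        total = sum(
            (count / len(doc)) * math.log(n / df[term])
            for term, count in tf.items()
        )
        # Assumed normalization: divide by the number of unique words.
        weights.append(total / len(tf))
    return weights


def summarize(text, k=2):
    # Keep the k highest-weighted sentences, restored to document order.
    sentences = split_sentences(text)
    weights = sentence_weights(sentences)
    ranked = sorted(range(len(sentences)), key=lambda i: weights[i], reverse=True)
    return " ".join(sentences[i] for i in sorted(ranked[:k]))
```

For example, summarize("Olma shirin. Olma qizil. Kitob stol ustida turibdi.", k=2) keeps the two sentences with the highest normalized weight. A real system for Uzbek would likely also need stemming or morphological analysis, since Uzbek is agglutinative, and a stop-word list; both are left out of this sketch.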