Article

Generating Stopword List for Sanskrit Language

Jaideepsinh K. Raulji (Ahmedabad University, Ahmedabad, Gujarat, India); Jatinderkumar R. Saini
2017 · en
ABI

Abstract

In an era of information explosion, optimizing the processes behind Information Retrieval, Text Summarization, and Text and Data Analytics systems is of utmost importance. To achieve accuracy, redundant words with little or no semantic meaning must be filtered out. Such words are known as stopwords. Stopword lists have been developed for languages such as English, Chinese, Arabic, and Hindi, but a standard stopword list is still missing for the Sanskrit language. Identifying stopwords manually from Sanskrit text is a herculean task, so this paper presents an automated, word-frequency-based stopword generator algorithm and its implementation to ease the task. Manual intervention by a language expert is still required to fine-tune the generated list, making this a hybrid approach. The paper presents a first-of-its-kind list of seventy-five generic Sanskrit stopwords extracted from a corpus of nearly seventy-six thousand words.
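The frequency-based idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm, whose details the abstract does not give. The function name `candidate_stopwords`, the whitespace tokenization, the punctuation set, and the sample sentence are all assumptions made for illustration. The sketch only ranks tokens by raw frequency; the resulting candidates would still need review by a language expert, as the abstract notes.

```python
from collections import Counter

# Assumed punctuation to strip from token edges; includes the
# Devanagari danda (U+0964) and double danda (U+0965).
PUNCTUATION = ".,;:!?\"'()[]{}-\u0964\u0965"


def candidate_stopwords(text, top_n=75):
    """Return the top_n most frequent tokens as (word, count) pairs.

    Hypothetical helper: splits on whitespace, strips edge punctuation,
    and ranks the remaining tokens by raw frequency.
    """
    tokens = (token.strip(PUNCTUATION) for token in text.split())
    counts = Counter(token for token in tokens if token)
    return counts.most_common(top_n)


# Illustrative sample text only, not data from the paper.
sample = "रामः च सीता च वनं गच्छतः । लक्ष्मणः च तत्र गच्छति ।"
print(candidate_stopwords(sample, top_n=1))  # [('च', 3)]
```

The default `top_n=75` simply mirrors the size of the list reported in the abstract; a real pipeline might instead use a frequency threshold, and Sanskrit sandhi and compounding would likely require more careful tokenization than a whitespace split.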

Citations and references

2 citations, 0 references used