Article

A Benchmark Dataset for Cricket Sentiment Analysis in Bangla Social Media Text

Tanjim Mahmud, Department of Computer Science and Engineering, Rangamati Science and Technology University, Bangladesh
Rezaul Karim, Department of Computer Science and Engineering, Rangamati Science and Technology University, Bangladesh
Rishita Chakma, Department of Computer Science and Engineering, Rangamati Science and Technology University, Bangladesh
Tanjia Chowdhury, Department of Computer Science and Engineering, Southern University Bangladesh, Chittagong, Bangladesh
Mohammad Shahadat Hossain, Department of Computer Science and Engineering, University of Chittagong, Chittagong, Bangladesh
Karl Andersson, Luleå University of Technology, Skelleftea, Sweden

2024, en

ABI

Abstract

This study introduces a novel benchmark dataset for cricket sentiment analysis of Bangla social media posts, with an emphasis on the low-resource setting. The dataset was curated through manual collection across diverse social media platforms to ensure broad representation of user sentiments. Annotation validation confirmed dataset quality, with a Cohen's kappa score of 0.97. Experiments with machine learning (ML) models revealed challenges: traditional approaches performed modestly, with an RNN accuracy of 0.5239. Deep learning (DL) models, however, performed substantially better: the LSTM model achieved an accuracy of 0.897, and the BiLSTM model reached 0.952. These findings highlight the efficacy of DL in capturing nuanced sentiments in Bangla cricket-related social media posts. The study contributes a high-quality benchmark dataset and insights into the suitability of DL for sentiment analysis in low-resource linguistic contexts.
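For readers unfamiliar with the agreement statistic cited in the abstract, the following is a minimal sketch of how Cohen's kappa can be computed for two annotators. The function name `cohen_kappa`, the three-class label set, and the six toy labels are illustrative assumptions; they are not taken from the paper's dataset or code.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement
    and p_e is the agreement expected by chance from each annotator's
    label frequencies.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")

    n = len(labels_a)
    # Observed agreement: fraction of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: sum over classes of the product of marginal rates.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    classes = counts_a.keys() | counts_b.keys()
    expected = sum(counts_a[c] * counts_b[c] for c in classes) / (n * n)

    if expected == 1.0:
        # Both annotators used one identical label throughout.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical toy annotations (not from the paper's dataset).
annotator_1 = ["positive", "negative", "neutral", "positive", "negative", "positive"]
annotator_2 = ["positive", "negative", "neutral", "positive", "positive", "positive"]
kappa = cohen_kappa(annotator_1, annotator_2)
print(f"kappa = {kappa:.4f}")
```

In this toy case the annotators agree on 5 of 6 items, yet kappa is noticeably lower than 5/6 because part of that agreement is expected by chance. A value near 1, such as the 0.97 reported above, indicates near-perfect agreement after this chance correction.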

Identifiers

DOI: 10.1016/j.procs.2024.06.038

Citations and references

Citations: 3
References used: 0
