Sentiment Analysis of Social Media Posts Using LSTM Networks
Abstract
This research uses the Sentiment140 dataset, which contains 1.6 million labeled tweets, to study sentiment classification. Preprocessing included cleaning, tokenization, and padding sequences to 50 tokens. Three models were investigated: the VADER model as a baseline, a 1D Convolutional Neural Network (CNN), and a bidirectional Long Short-Term Memory (BiLSTM) network. The CNN consisted of an embedding layer with 128 dimensions, a convolutional layer with 128 filters followed by global max pooling, and dense layers, while the BiLSTM contained 64 units with dropout for regularization. Both models used a batch size of 128 and were trained for 20 epochs. The BiLSTM substantially outperformed VADER, reaching an accuracy of roughly 85-87% on the test set with low loss and successfully capturing specific and sequential patterns in informal web-based text.
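The preprocessing steps named in the abstract (cleaning, tokenization, and padding to 50 tokens) can be illustrated with a minimal, standard-library sketch. The abstract does not give the exact cleaning rules, vocabulary size, or padding side, so the regular expressions, the `max_words` limit, the reserved ids `PAD_ID` and `UNK_ID`, and the choice of post-padding below are all assumptions made for illustration, not the authors' pipeline.

```python
import re
from collections import Counter

MAX_LEN = 50            # sequence length stated in the abstract
PAD_ID, UNK_ID = 0, 1   # assumed reserved ids for padding and unknown tokens


def clean(text):
    """Lowercase and strip URLs, @mentions, and non-letter characters (assumed rules)."""
    text = text.lower()
    text = re.sub(r"https?://\S+|www\.\S+", " ", text)  # drop URLs
    text = re.sub(r"@\w+", " ", text)                    # drop user mentions
    text = re.sub(r"[^a-z\s']", " ", text)               # keep letters and apostrophes
    return text


def tokenize(text):
    """Split cleaned text on whitespace."""
    return clean(text).split()


def build_vocab(texts, max_words=20000):
    """Map the most frequent tokens to integer ids, starting after the reserved ids."""
    counts = Counter(tok for t in texts for tok in tokenize(t))
    return {w: i + 2 for i, (w, _) in enumerate(counts.most_common(max_words))}


def encode(text, vocab, max_len=MAX_LEN):
    """Convert text to ids, truncate to max_len, and post-pad with PAD_ID (assumed side)."""
    ids = [vocab.get(tok, UNK_ID) for tok in tokenize(text)][:max_len]
    return ids + [PAD_ID] * (max_len - len(ids))


tweets = ["@bob I love this!! http://t.co/x", "I hate mondays"]
vocab = build_vocab(tweets)
encoded = [encode(t, vocab) for t in tweets]
```

Every encoded tweet has exactly 50 ids, so a batch of them forms a fixed-shape integer array that an embedding layer could consume. Many deep learning toolkits pad at the front by default, so the padding side is worth checking against whatever framework is actually used.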