Article

Generative approach to Aspect Based Sentiment Analysis with GPT Language Models

S.N. Chumakov, National Center for Cognitive Research, ITMO University, St. Petersburg 197101, Russia
Anton Kovantsev, National Center for Cognitive Research, ITMO University, St. Petersburg 197101, Russia
Anatoliy Surikov, National Center for Cognitive Research, ITMO University, St. Petersburg 197101, Russia
2023, en
ABI

Abstract

Aspect Sentiment Triplet Extraction (ASTE) is a modern and effective form of sentiment analysis that extracts highly representative features from source text. Recent solutions rely on models built upon Bidirectional Encoder Representations from Transformers (BERT) embeddings and large manually tagged datasets. Such methods therefore require large amounts of gold-tagged, domain-specific data and are vulnerable to data drift, and they cannot recognize segmented terms or summarize more complex ones. We propose an open-domain generative method for ASTE based on a Generative Pre-trained Transformer (GPT) with few-shot and fine-tuning strategies. The method proved applicable to the task: the models structure the output triplets consistently, simplify terms without losing meaningful information, and successfully analyze data from unseen domains. The resulting models were tested on mixed-domain Russian-language data that was tagged automatically by a large language model (LLM) using a few-shot approach and then thoroughly edited by hand, and on English data that was only automatically tagged. The developed models benefit from the ability to learn in a few-shot manner, which allows knowledge distillation from larger models to much smaller ones. The models were also tested on summarizing large numbers of reviews and showed results comparable to enterprise-grade solutions.
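
To make the few-shot, generative framing concrete, the sketch below shows one way a prompt for triplet extraction could be assembled and how a generated string could be parsed back into (aspect, opinion, sentiment) triplets. It is an illustration only: the prompt wording, the `(aspect | opinion | sentiment)` output format, the example review, and the names `build_prompt` and `parse_triplets` are assumptions and are not taken from the paper. No model is called; the generated text would come from whichever GPT model is used.

```python
from typing import List, Tuple

Triplet = Tuple[str, str, str]  # (aspect, opinion, sentiment)

# Hypothetical demonstration; not taken from the paper's data.
EXAMPLES: List[Tuple[str, List[Triplet]]] = [
    (
        "The battery lasts long but the screen is dim.",
        [("battery", "lasts long", "positive"), ("screen", "dim", "negative")],
    ),
]

SENTIMENTS = {"positive", "negative", "neutral"}


def format_triplets(triplets: List[Triplet]) -> str:
    """Serialize triplets in the assumed '(a | o | s); (a | o | s)' format."""
    return "; ".join(f"({a} | {o} | {s})" for a, o, s in triplets)


def build_prompt(examples: List[Tuple[str, List[Triplet]]], text: str) -> str:
    """Assemble a few-shot prompt: instruction, demonstrations, then the query."""
    parts = ["Extract (aspect | opinion | sentiment) triplets from the review."]
    for review, triplets in examples:
        parts.append(f"Review: {review}\nTriplets: {format_triplets(triplets)}")
    parts.append(f"Review: {text}\nTriplets:")
    return "\n\n".join(parts)


def parse_triplets(output: str) -> List[Triplet]:
    """Parse generated text into triplets, skipping malformed chunks."""
    triplets: List[Triplet] = []
    for chunk in output.split(";"):
        chunk = chunk.strip()
        if not (chunk.startswith("(") and chunk.endswith(")")):
            continue
        fields = [f.strip() for f in chunk[1:-1].split("|")]
        if len(fields) == 3 and fields[2] in SENTIMENTS:
            triplets.append((fields[0], fields[1], fields[2]))
    return triplets
```

A prompt built with `build_prompt(EXAMPLES, review_text)` would be sent to the model, and the returned completion passed to `parse_triplets`. Because the output is free text, the parser skips any chunk that does not match the assumed format, so one malformed triplet does not break the rest.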

Citations and sources

Citations: 2. Sources used: 0