Abstract: The Internet has become a significant source for sharing information and for expressing users' opinions about products and their interests, together with the associated aspects. It is essential to learn about product reviews; however, to react to such reviews, extracting the aspects of the entity to which these reviews belong is equally important. Aspect-based Sentiment Analysis (ABSA) concerns the aspects extracted from an opinionated text. The literature proposes different approaches for ABSA; however, most research focuses on supervised approaches, which require labeled datasets with manual sentiment polarity labeling and aspect tagging. This study proposes a semi-supervised approach with minimal human supervision to extract aspect terms by detecting the aspect categories. Hence, the study deals with two main sub-tasks in ABSA: Aspect Category Detection (ACD) and Aspect Term Extraction (ATE). In the first sub-task, aspect categories are extracted using topic modeling and further filtered by an oracle; they are then fed to zero-shot learning as the prompts, together with the augmented text. In the second sub-task, the predicted categories are used to find similar phrases among curated meaningful phrases (e.g., nouns, proper nouns, and Named Entity Recognition (NER) entities) in order to detect the aspect terms. The study sets a baseline accuracy for the two main sub-tasks in ABSA on the Multi-Aspect Multi-Sentiment (MAMS) dataset, along with SemEval-2014 Task 4 subtask 1, to show that the proposed approach helps detect aspect terms via aspect categories.
Funding: Supported by the Deanship of Scientific Research at Shaqra University.
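The aspect term extraction step in the ABSA abstract (matching candidate phrases against predicted categories) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy three-dimensional vectors, the `extract_aspect_terms` helper, and the 0.5 threshold are all hypothetical, and cosine similarity over word vectors is only one plausible way to "find similar phrases". A real pipeline would obtain candidates from a POS tagger or NER model and vectors from a pretrained embedding model.

```python
import math

# Toy 3-dimensional "embeddings", invented purely for illustration.
# A real system would use a pretrained embedding model instead.
TOY_VECTORS = {
    "food": (1.0, 0.0, 0.0),
    "service": (0.0, 1.0, 0.0),
    "pizza": (0.9, 0.1, 0.0),
    "waiter": (0.1, 0.9, 0.1),
    "monday": (0.0, 0.1, 1.0),
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def extract_aspect_terms(candidates, categories, threshold=0.5):
    """Map each candidate phrase to its most similar category.

    Candidates whose best similarity is below the (assumed) threshold
    are discarded as non-aspect terms.
    """
    terms = {}
    for phrase in candidates:
        best_cat, best_sim = None, threshold
        for cat in categories:
            sim = cosine(TOY_VECTORS[phrase], TOY_VECTORS[cat])
            if sim >= best_sim:
                best_cat, best_sim = cat, sim
        if best_cat is not None:
            terms[phrase] = best_cat
    return terms


# Candidates stand in for nouns extracted from a review sentence;
# categories stand in for the output of the category detection step.
result = extract_aspect_terms(["pizza", "waiter", "monday"], ["food", "service"])
print(result)
```

In this toy setup "pizza" and "waiter" are kept as aspect terms because each is close to one category, while "monday" falls below the threshold for both and is dropped.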
Abstract: The Internet revolution has resulted in abundant data from various sources, including social media and traditional media. Although the availability of data is no longer an issue, labelling data so that it can be exploited in supervised machine learning is still an expensive process that involves tedious human effort. The overall purpose of this study is to propose a strategy to automatically label unlabeled textual data with the support of active learning in combination with deep learning. More specifically, this study assesses the performance of different active learning strategies in the automatic labelling of textual datasets at the sentence and document levels. To achieve this objective, different experiments were performed on publicly available datasets. In the first set of experiments, we randomly choose a subset of instances from the training dataset and train a deep neural network to assess its performance on the test set. In the second set of experiments, we replace the random selection with different active learning strategies to choose a subset of the training dataset, train the same model, and reassess its performance on the test set. The experimental results suggest that different active learning strategies yield a performance improvement of 7% on document-level datasets and 3% on sentence-level datasets for auto-labelling.
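The contrast between the two sets of experiments (random subset selection versus an active learning strategy) can be sketched as follows. The abstract does not name the strategies it evaluates, so least-confidence uncertainty sampling is used here purely as an assumed example; the function names and the `pool_probs` values are hypothetical, and the probabilities stand in for the predictions of a model on an unlabeled pool.

```python
import random


def random_selection(pool_size, k, seed=0):
    """Baseline: pick k pool indices uniformly at random."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(pool_size), k))


def least_confidence_selection(probabilities, k):
    """Pick the k pool indices whose top predicted class probability is lowest.

    These are the instances the current model is least sure about, so
    labelling them is expected to be the most informative.
    """
    confidence = [max(p) for p in probabilities]
    ranked = sorted(range(len(probabilities)), key=lambda i: confidence[i])
    return sorted(ranked[:k])


# Hypothetical class probabilities predicted for five unlabeled instances.
pool_probs = [
    [0.98, 0.02],
    [0.55, 0.45],
    [0.10, 0.90],
    [0.51, 0.49],
    [0.70, 0.30],
]

uncertain = least_confidence_selection(pool_probs, k=2)
baseline = random_selection(len(pool_probs), k=2)
print(uncertain)
```

Here the strategy selects instances 1 and 3, whose predictions are closest to a coin flip, whereas the random baseline ignores the model's confidence entirely.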