Targeted multimodal sentiment classification (TMSC) aims to identify the sentiment polarity of a target mentioned in a multimodal post. Most current studies on this task map the image and the text to a high-dimensional space in order to obtain and fuse implicit representations. They ignore the rich semantic information contained in the images and do not take into account the contribution of the visual modality to the multimodal fusion representation, which can influence the results of TMSC tasks. This paper proposes a general model, Improving Targeted Multimodal Sentiment Classification with Semantic Description of Images (ITMSC), to tackle these issues and improve the accuracy of multimodal sentiment analysis. Specifically, the ITMSC model automatically adjusts the contribution of images in the fusion representation by exploiting semantic descriptions of images and text similarity relations. Further, we propose a target-based attention module to capture target-text relevance, an image-based attention module to capture image-text relevance, and a target-image matching module, built on the former two modules, to properly align the target with the image so that fine-grained semantic information can be extracted. Our experimental results demonstrate that our model achieves performance comparable with several state-of-the-art approaches on two multimodal sentiment datasets. Our findings indicate that incorporating semantic descriptions of images can enhance our understanding of multimodal content and lead to improved sentiment analysis performance.
Multimodal Sentiment Analysis (SA) is gaining popularity due to its broad application potential. Existing studies have focused on the SA of single modalities, such as texts or photos, which makes it hard to handle social media data with multiple modalities effectively. Moreover, most multimodal research has concentrated on merely combining the two modalities rather than exploring their complex correlations, leading to unsatisfactory sentiment classification results. Motivated by this, we propose a new visual-textual sentiment classification model named Multi-Model Fusion (MMF), which uses a mixed fusion framework for SA to effectively capture the essential information and the intrinsic relationship between the visual and textual content. The proposed model comprises three deep neural networks. Two different neural networks are proposed to extract the most emotionally relevant aspects of image and text data, so that more discriminative features are gathered for accurate sentiment classification. Then, a multichannel joint fusion model with a self-attention technique is proposed to exploit the intrinsic correlation between visual and textual characteristics and obtain emotionally rich information for joint sentiment classification. Finally, the results of the three classifiers are integrated using a decision fusion scheme to improve the robustness and generalizability of the proposed model. An interpretable visual-textual sentiment classification model is further developed using the Local Interpretable Model-agnostic Explanation model (LIME) to ensure the model's explainability and resilience. The proposed MMF model has been tested on four real-world sentiment datasets, achieving 99.78% accuracy on Binary_Getty (BG), 99.12% on Binary_iStock (BIS), 95.70% on Twitter, and 79.06% on the Multi-View Sentiment Analysis (MVSA) dataset. These results demonstrate the superior performance of our MMF model compared to single-model approaches and current state-of-the-art techniques based on model evaluation criteria.
Textual data streams have been extensively used in practical applications where consumers express their views regarding online products. Due to changes in data distribution, commonly referred to as concept drift, mining such data streams is a challenging problem for researchers. The majority of existing drift detection techniques are based on classification errors, which have higher probabilities of false-positive or missed detections. To improve classification accuracy, there is a need to develop more intuitive detection techniques that can identify a greater number of drifts in the data streams. This paper presents an adaptive unsupervised learning technique, an ensemble classifier based on drift detection for opinion mining and sentiment classification. To improve classification performance, this approach uses four different dissimilarity measures to determine the degree of concept drift in the data stream. Whenever a drift is detected, the proposed method builds and adds a new classifier to the ensemble. To add a new classifier, the total number of classifiers in the ensemble is first checked; if the limit is exceeded, the classifier with the least weight is removed from the ensemble. To this end, a weighting mechanism is used to calculate the weight of each classifier, which decides the contribution of each classifier to the final classification results. Several experiments were conducted on real-world datasets, and the results were evaluated on the false positive rate, miss detection rate, and accuracy measures. The proposed method is also compared with state-of-the-art methods, which include DDM, EDDM, and PageHinkley with support vector machine (SVM) and Naive Bayes classifiers that are frequently used in concept drift detection studies. In all cases, the results show the efficiency of our proposed method.
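The ensemble update described in this abstract (add a classifier on drift, evict the lowest-weight member when the size limit is reached, weight each member's contribution to the final decision) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the class name `WeightedEnsemble`, the `max_size` parameter, and the weighted-vote rule are all assumptions, and the weights are supplied by the caller because the abstract does not say how they are computed.

```python
from dataclasses import dataclass, field


@dataclass
class WeightedEnsemble:
    """Hypothetical bounded ensemble; members are (name, weight) pairs."""
    max_size: int
    members: list = field(default_factory=list)

    def add(self, name, weight):
        # Assumed policy: when the ensemble is already at its limit,
        # evict the member with the smallest weight before adding.
        if len(self.members) >= self.max_size:
            weakest = min(self.members, key=lambda m: m[1])
            self.members.remove(weakest)
        self.members.append((name, weight))

    def vote(self, predictions):
        # predictions maps member name -> predicted label.
        # Each member contributes its weight to the label it predicts.
        totals = {}
        for name, weight in self.members:
            label = predictions[name]
            totals[label] = totals.get(label, 0.0) + weight
        return max(totals, key=totals.get)
```

In use, a drift detector would call `add` each time it signals a drift, and `vote` would combine the current members' predictions for each incoming document.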
Sentiment classification is a useful tool for classifying reviews by the sentiments and attitudes they express towards a product or service. Existing studies rely heavily on sentiment classification methods that require fully annotated inputs. However, only limited labelled text is available, which makes acquiring fully annotated input costly and labour-intensive. Lately, semi-supervised methods have emerged, as they require only partially labelled input but perform comparably to supervised methods. Nevertheless, some works have reported that the performance of the semi-supervised model degraded after unlabelled instances were added to training. The literature also shows that not all unlabelled instances are equally useful; thus, identifying the informative unlabelled instances is beneficial when training a semi-supervised model. To achieve this, an informative score is proposed and incorporated into semi-supervised sentiment classification. The evaluation is performed on a semi-supervised method both without and with the informative score. By using the informative score in the instance selection strategy to identify informative unlabelled instances, semi-supervised models perform better than models that do not incorporate informative scores into their training. Although the semi-supervised models that incorporate an informative score do not surpass the supervised models, the results are still promising: the difference in performance is small, at 2% to 5%, while the proportion of labelled instances used is greatly reduced, from 100% to 40%. The best result of the proposed instance selection strategy is achieved when combining the informative score with a baseline confidence score at a 0.5:0.5 ratio using only 40% labelled data.
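The 0.5:0.5 combination of confidence and informative scores mentioned above can be illustrated with a small selection routine. This is only a sketch under stated assumptions: the abstract does not define how the informative score is computed, so both scores are taken as precomputed values in [0, 1], and the function name `select_informative` and the parameters `k` and `alpha` are hypothetical.

```python
def select_informative(instances, k, alpha=0.5):
    """Pick the k unlabelled instances with the highest combined score.

    instances: list of (instance_id, confidence, informative) tuples.
    alpha: weight on the confidence score; 0.5 gives the 0.5:0.5 ratio.
    """
    scored = [
        (alpha * conf + (1 - alpha) * info, iid)
        for iid, conf, info in instances
    ]
    # Highest combined score first; ties keep their input order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [iid for _, iid in scored[:k]]
```

The selected instances would then be pseudo-labelled and added to the training set in the next semi-supervised round.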
Sentiment Analysis (SA) is a subfield of Natural Language Processing (NLP) that focuses on identifying and extracting the opinions expressed in text from reviews, social media, blogs, news, and so on. SA can handle the drastically increasing volume of unstructured text by transforming it into structured data with the help of NLP and open source tools. The current research work designs a novel Modified Red Deer Algorithm (MRDA) Extreme Learning Machine Sparse Autoencoder (ELMSAE) model for SA and classification. The proposed MRDA-ELMSAE technique initially performs pre-processing to transform the data into a compatible format. A TF-IDF vectorizer is then employed to extract features, while the ELMSAE model is applied to classify sentiments. Furthermore, optimal parameter tuning of the ELMSAE model is done using the MRDA technique. A wide range of simulation analyses was carried out, and the results of the comparative analysis establish the enhanced efficiency of the MRDA-ELMSAE technique over other recent techniques.
Automatic classification of sentiment data (e.g., reviews, blogs) has many applications in enterprise user management systems, and can help us understand people's attitudes about products or services. However, it is difficult to train an accurate sentiment classifier for different domains. One of the major reasons is that people often use different words to express the same sentiment in different domains, and we cannot easily find a direct mapping between them to reduce the differences between domains. As a result, the accuracy of a sentiment classifier declines sharply when a classifier trained in one domain is applied to other domains. In this paper, we propose a novel approach called words alignment based on association rules (WAAR) for cross-domain sentiment classification, which establishes an indirect mapping between domain-specific words in different domains by learning the strong association rules between domain-shared words and domain-specific words in the same domain. In this way, the differences between the source domain and target domain can be reduced to some extent, and a more accurate cross-domain classifier can be trained. Experimental results on Amazon datasets show the effectiveness of our approach in improving the performance of cross-domain sentiment classification.
With the rise and spread of micro-blogs, the sentiment classification of short texts has become a research hotspot. Some methods have been developed in the past decade. However, since Chinese and English differ in syntax, semantics and pragmatics, sentiment classification methods that are effective for English Twitter may fail on Chinese micro-blogs. In addition, the colloquialism and conciseness of short Chinese texts introduce additional challenges to sentiment classification. In this work, a novel two-stage hybrid learning model is proposed for sentiment classification of Chinese micro-blogs. In the first stage, emotional scores are calculated over the whole dataset using an improved Chinese-oriented sentiment dictionary classification method, and data with extremely high or low scores are labeled directly. In the second stage, the remaining data are labeled using an integrated classification method based on a sentiment dictionary, support vector machine (SVM) and k-nearest neighbor (KNN). An improved feature selection method is adopted to enhance the discriminative power of the selected features. The two-stage hybrid framework makes the proposed method effective for sentiment classification of Chinese micro-blogs. Experiments on the COAE2014 (Chinese Opinion Analysis Evaluation 2014) dataset show that the proposed method outperforms other schemes.
This paper presents a method for aspect-based sentiment classification tasks, named convolutional multi-head self-attention memory network (CMA-MemNet). It is an improved model based on memory networks, and makes it possible to extract richer and more complex semantic information from sequences and aspects. To address the memory network's inability to capture context-related information at the word level, we propose utilizing convolution to capture n-gram grammatical information. We use multi-head self-attention to compensate for the memory network ignoring the semantic information of the sequence itself. Meanwhile, unlike most recurrent neural network (RNN), long short-term memory (LSTM), and gated recurrent unit (GRU) models, we retain the parallelism of the network. We experiment on the open datasets SemEval-2014 Task 4 and SemEval-2016 Task 6. Compared with some popular baseline methods, our model performs excellently.
Sentiment analysis (SA) is a growing field at the intersection of computer science and computational linguistics that endeavors to automatically identify the sentiment presented in text. Computational linguistics aims to describe the fundamental methods used to build computer methods for understanding natural language. Sentiment is classified as a negative or positive assessment articulated through language. SA is commonly used for movie review classification, which involves automatically determining whether an online review of a movie is negative or positive toward the thing reviewed. Deep learning (DL) is becoming a powerful machine learning (ML) method for dealing with the increasing demand for precise SA. With this motivation, this study designs a computational intelligence enabled modified sine cosine optimization with an adaptive deep belief network for movie review classification (MSCADBN-MVC) technique. The MSCADBN-MVC technique focuses on identifying the sentiments that exist in movie review data. First, the MSCADBN-MVC model performs data pre-processing and the word2vec word embedding process. The ADBN model is then used to classify the sentiments in the movie reviews. Finally, hyperparameter tuning of the ADBN model is carried out using the MSCA technique, which integrates Levy flight concepts into the standard sine cosine algorithm (SCA). To demonstrate the performance of the MSCADBN-MVC model, a wide-ranging experimental analysis is performed on three different datasets. The comprehensive study highlights the enhancements of the MSCADBN-MVC model in the movie review classification process, with a maximum accuracy of 88.93%.
The COVID-19 pandemic has become one of the severe diseases of recent years. As it majorly affects the livelihood of people across the world, it is essential for administrators and healthcare professionals to be aware of the views of the community in order to monitor the severity of the spread of the outbreak. Public opinions are shared in enormous volume on microblogging media such as Twitter, which is considered one of the popular sources for collecting public opinion on any topic, such as politics, sports, or entertainment. This work presents a combination of an Intensity Based Emotion Classification Convolution Neural Network (IBEC-CNN) model and Non-negative Matrix Factorization (NMF) for detecting and analyzing the different topics discussed in COVID-19 tweets as well as the intensity of the emotional content of those tweets. The topics are identified using NMF, and the emotions are classified using the pretrained IBEC-CNN, based on predefined intensity scores. The research aims to identify the emotions in Indian tweets related to COVID-19 and to produce a list of topics discussed by users during the COVID-19 pandemic. Using the Twitter Application Programming Interface (Twitter API), a large number of COVID-19 tweets are retrieved during January and July 2020. The extracted tweets are analyzed for the emotions fear, joy, sadness and trust with the proposed pretrained IBEC-CNN model. The classified tweets are given an intensity score that varies from 1 to 3, with 1 being low intensity for the emotion, 2 moderate, and 3 high. NMF is employed to identify the topics in the tweets and the themes of those topics. The analysis of emotions in COVID-19 tweets finds that positive tweets outnumber negative tweets during the period considered, and that the negative tweets related to COVID-19 make up less than 5%. Also, more than 75% of the negative tweets expressing sadness and fear are of low intensity. A qualitative analysis has also been conducted, and the topics detected are grouped into themes such as economic impacts, case reports, treatments, entertainment and vaccination. The results show that issues related to the pandemic are expressed with different emotions on Twitter, which helps in interpreting public insights during the pandemic; these results are beneficial for planning the dissemination of factual health statistics to build people's trust. The performance comparison shows that the proposed IBEC-CNN model outperforms the conventional models, achieving 83.71% accuracy. The percentage of COVID-19 tweets discussing each topic varies from 7.45% to 26.43% across the topics economy, statistics on cases, government/politics, entertainment, lockdown, treatments and virtual events. The fewest tweets discussed politics/government, whereas the most tweets discussed treatments.
Recently, the effectiveness of neural networks, especially convolutional neural networks, has been validated in the field of natural language processing, in which sentiment classification for online reviews is an important and challenging task. Existing convolutional neural networks extract important features of sentences without local features or the feature sequence. Thus, these models do not perform well, especially for transition sentences. To this end, we propose a Piecewise Pooling Convolutional Neural Network (PPCNN) for sentiment classification. First, with a sentence represented by word vectors, a convolution operation is applied to obtain the convolution feature map vectors. Second, these vectors are segmented according to the positions of transition words in the sentence. Third, the most significant feature of each local segment is extracted using a max pooling mechanism, so that different aspects of the features can be extracted while the relative sequence of these features is preserved. Finally, after processing by the dropout algorithm, a softmax classifier is trained for sentiment classification. Experimental results show that the proposed PPCNN method is effective and superior to other baseline methods, especially for datasets with transition sentences.
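The piecewise pooling step (segment the feature map at transition-word positions, then max-pool each segment while keeping segment order) can be sketched as below. This is a minimal illustration rather than the PPCNN implementation: it handles a single one-dimensional feature map, and it assumes each transition position starts a new segment, which the abstract does not specify. The function name `piecewise_max_pool` is hypothetical.

```python
def piecewise_max_pool(feature_map, transition_positions):
    """Max-pool each segment of a 1-D feature map, preserving segment order.

    feature_map: list of floats, one value per convolution window.
    transition_positions: indices at which a new segment begins (assumed).
    """
    n = len(feature_map)
    if n == 0:
        return []
    # Keep only cut points that actually split the map, sorted and unique.
    cuts = sorted({p for p in transition_positions if 0 < p < n})
    bounds = [0] + cuts + [n]
    # One maximum per segment; the output order follows the sentence order.
    return [max(feature_map[start:end]) for start, end in zip(bounds, bounds[1:])]
```

With no transition positions, the routine reduces to ordinary max pooling over the whole feature map, which is the behavior the abstract contrasts against.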
Currently, individuals use online social media, namely Facebook or Twitter, to share their thoughts and emotions. Detection of emotions on social networking sites is useful in several applications in social welfare, commerce, public health, and so on. Emotion is expressed by several means, such as facial and speech expressions, gestures, and written text. Emotion recognition in a text document is a content-based classification problem that draws on notions from the deep learning (DL) and natural language processing (NLP) domains. This article proposes a Deer Hunting Optimization with Deep Belief Network Enabled Emotion Classification (DHODBN-EC) technique for English Twitter data. The presented DHODBN-EC model aims to examine the existence of distinct emotion classes in tweets. Initially, the DHODBN-EC technique pre-processes the tweets at different levels. The word2vec feature extraction process is then applied to generate word embeddings. For emotion classification, the DHODBN-EC model utilizes the DBN model, which helps to determine distinct emotion class labels. Lastly, the DHO algorithm is leveraged for optimal hyperparameter adjustment of the DBN technique. An extensive range of experimental analyses is carried out to demonstrate the enhanced performance of the DHODBN-EC approach. A comprehensive comparison study shows the improvements of the DHODBN-EC model over other approaches, with an increased accuracy of 96.67%.
With the development of the short video industry, videos and bullet screens have become important ways to spread public opinion. Public attitudes can be obtained in a timely manner through emotional analysis of bullet screens, which can also reduce the difficulty of managing online public opinion. A convolutional neural network model based on multi-head attention is proposed to solve the problem of how to effectively model relations among words and identify key words in emotion classification tasks where the text is short and lacks complete context information. First, word positions are encoded so that the order information of input sequences can be used by the model. Second, a multi-head attention mechanism is used to obtain semantic expressions in different subspaces, effectively capture internal relevance and enhance dependency relationships among words, and highlight the emotional weights of key emotional words. Then a dilated convolution is used to increase the receptive field and extract more features. On this basis, the multi-head attention mechanism is combined with a convolutional neural network to model and analyze the seven emotional categories of bullet screens. Experiments from the perspectives of both model and dataset validate the effectiveness of our approach. Finally, the emotions of bullet screens are visualized to provide data support for hot event control and other fields.
The ability of the pre-trained BERT model to achieve outstanding performance on many Natural Language Processing (NLP) tasks has attracted the attention of researchers in recent times. However, its huge computational and memory requirements have hampered its widespread deployment on devices with limited resources. Knowledge distillation has been shown to produce smaller and faster distilled models with fewer trainable parameters, intended for resource-constrained environments. The distilled models can be fine-tuned with great performance on a wide range of tasks, such as sentiment classification. This paper evaluates the performance of the DistilBERT model and other pre-canned text classifiers on a Covid-19 online news binary classification dataset. The analysis shows that, despite having fewer trainable parameters than the BERT-based model, the DistilBERT model achieved an accuracy of 0.94 on the validation set after only two training epochs. The paper also highlights the usefulness of the ktrain library in facilitating the building, training, and application of state-of-the-art Machine Learning and Deep Learning models.
Sentiment analysis is increasingly important in modern natural language processing, and sentiment classification is one of its most popular applications. The crucial part of sentiment classification is feature extraction. In this paper, two methods for feature extraction, feature selection and feature embedding, are compared. Word2Vec is then used as the embedding method. In this experiment, Chinese documents are used as the corpus, and three methods are used to obtain the features of a document: average word vectors, Doc2Vec, and weighted average word vectors. After that, these samples are fed to three machine learning algorithms for classification, and the support vector machine (SVM) gives the best result. Finally, the parameters of random forest are analyzed.
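The difference between the plain and weighted average word-vector features can be shown with a short routine. This is a sketch under stated assumptions: the abstract does not say which weights are used, so the per-word weights here are placeholders supplied by the caller (TF-IDF values would be one common choice), and the function name `weighted_average` is hypothetical.

```python
def weighted_average(vectors, weights):
    """Combine word vectors into one document vector.

    vectors: list of equal-length lists of floats, one per word.
    weights: one non-negative weight per word (assumed, e.g. TF-IDF).
    Passing equal weights gives the plain average word vector.
    """
    if not vectors or len(vectors) != len(weights):
        raise ValueError("need one weight per word vector")
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    dim = len(vectors[0])
    return [
        sum(w * vec[i] for vec, w in zip(vectors, weights)) / total
        for i in range(dim)
    ]
```

The resulting fixed-length vector is what would be passed to a downstream classifier such as an SVM.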
With the rapid development of the social economy, society has entered a new stage of development. Against the background of rapidly developing new media in particular, the importance of news and information has increased across the board. To better distinguish positive from negative news, machine learning methods should be fully used to classify news automatically by emotion and thereby improve the efficiency of news classification. The article therefore first clarifies the basic outline of news sentiment classification. Second, it analyzes in depth the specific approach to the automatic classification of news emotion. On this basis, it puts forward concrete measures for the automatic classification of news emotion using machine learning.
Sentiment analysis is the computational study of how opinions, attitudes, emotions, and perspectives are expressed in language, and has been an important task in natural language processing. Sentiment analysis is highly valuable for both research and practical applications. This work focuses on the difficulty of constructing sentiment classifiers, which normally need a tremendous amount of labeled in-domain training data, and proposes a novel unsupervised framework that uses Chinese idiom resources to develop a general sentiment classifier. Furthermore, the domain adaptation of the general sentiment classifier is improved by taking the general classifier as the base of a self-training procedure to obtain a domain self-training sentiment classifier. To validate the effect of the unsupervised framework, several experiments were carried out on a publicly available dataset of Chinese online reviews. The experiments show that the proposed framework is effective and achieves encouraging results. Specifically, the general classifier outperforms two baselines (a naive 50% baseline and a cross-domain classifier), and the bootstrapping self-training classifier approximates the upper-bound domain-specific classifier with a lowest accuracy of 81.5%, while its performance is more stable and the framework needs no labeled training dataset.
Although much progress has been made to date on sentiment classification, the lack of annotated corpora remains a problem. In this paper we propose to expand corpora for Chinese polarity classification via opinion paraphrase generation. To this end, we first exploit three strategies for opinion paraphrase generation, namely sentence re-ordering, opinion element substitution and explicit attribution implying. To improve the quality of the generated opinion paraphrases, we define four criteria for opinion paraphrase evaluation and present a filtering algorithm to discard improper opinion paraphrase candidates. To assess the proposed method, we further apply the expanded corpus to an SVM classifier for polarity classification. The experimental results show that the generated opinion paraphrases are beneficial to polarity classification.
Funding: The authors extend their appreciation to the Deanship of Scientific Research at King Khalid University for funding this work through Large Groups (Project under Grant Number (RGP.2/49/43)).
Abstract: Textual data streams have been extensively used in practical applications where consumers express their views regarding online products. Due to changes in data distribution, commonly referred to as concept drift, mining such data streams is a challenging problem for researchers. The majority of existing drift detection techniques are based on classification errors, which have higher probabilities of false-positive or missed detections. To improve classification accuracy, there is a need to develop more intuitive detection techniques that can identify a great number of drifts in the data streams. This paper presents an adaptive unsupervised learning technique, an ensemble classifier based on drift detection for opinion mining and sentiment classification. To improve classification performance, this approach uses four different dissimilarity measures to determine the degree of concept drift in the data stream. Whenever a drift is detected, the proposed method builds a new classifier and adds it to the ensemble. Before the new classifier is added, the total number of classifiers in the ensemble is checked; if the limit is exceeded, the classifier with the least weight is removed from the ensemble. To this end, a weighting mechanism is used to calculate the weight of each classifier, which decides the contribution of each classifier to the final classification results. Several experiments were conducted on real-world datasets, and the results were evaluated on the false positive rate, miss detection rate, and accuracy measures. The proposed method is also compared with state-of-the-art methods, which include DDM, EDDM, and Page-Hinkley with support vector machine (SVM) and Naive Bayes classifiers that are frequently used in concept drift detection studies. In all cases, the results show the efficiency of our proposed method.
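The ensemble update described in this abstract (add a classifier when a drift is detected, evict the lowest-weight member when the size limit is reached) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the names `Member`, `DriftEnsemble`, and `on_drift` are hypothetical, the weights are taken as given rather than computed by the paper's weighting mechanism, and the weighted binary vote is an assumed way of combining members.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Member:
    # Hypothetical classifier interface: maps a text to a label in {0, 1}.
    predict: Callable[[str], int]
    # Weight assigned by some external weighting mechanism (assumed given).
    weight: float


@dataclass
class DriftEnsemble:
    max_size: int
    members: List[Member] = field(default_factory=list)

    def on_drift(self, new_member: Member) -> None:
        """Add a classifier after a detected drift.

        If the ensemble is already at its size limit, the member with the
        least weight is removed first.
        """
        if len(self.members) >= self.max_size:
            weakest = min(self.members, key=lambda m: m.weight)
            self.members.remove(weakest)
        self.members.append(new_member)

    def predict(self, text: str) -> int:
        """Weighted vote over binary labels (an assumed combination rule)."""
        score = sum(
            m.weight * (1 if m.predict(text) == 1 else -1) for m in self.members
        )
        return 1 if score >= 0 else 0
```

In use, a drift detector would call `on_drift` with a freshly trained member each time the dissimilarity measures signal a change; the detector itself is outside the scope of this sketch.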
Funding: This research is supported by the Fundamental Research Grant Scheme (FRGS), Ministry of Education Malaysia (MOE), under the project code FRGS/1/2018/ICT02/USM/02/9, titled Automated Big Data Annotation for Training Semi-Supervised Deep Learning Model in Sentiment Classification.
Abstract: Sentiment classification is a useful tool to classify reviews about sentiments and attitudes towards a product or service. Existing studies rely heavily on sentiment classification methods that require fully annotated inputs. However, there is limited labelled text available, making the acquisition of fully annotated input costly and labour-intensive. Lately, semi-supervised methods have emerged, as they require only partially labelled input but perform comparably to supervised methods. Nevertheless, some works reported that the performance of the semi-supervised model degraded after adding unlabelled instances into training. The literature also shows that not all unlabelled instances are equally useful; thus, identifying the informative unlabelled instances is beneficial in training a semi-supervised model. To achieve this, an informative score is proposed and incorporated into semi-supervised sentiment classification. The evaluation is performed on a semi-supervised method with and without an informative score. By using the informative score in the instance selection strategy to identify informative unlabelled instances, semi-supervised models perform better than models that do not incorporate informative scores into their training. Although semi-supervised models that incorporate an informative score do not surpass the supervised models, the results are still promising: the difference in performance is small, at 2% to 5%, while the number of labelled instances used is greatly reduced from 100% to 40%. The best result of the proposed instance selection strategy is achieved when incorporating an informative score with a baseline confidence score at a 0.5:0.5 ratio using only 40% labelled data.
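As a rough illustration of the instance selection strategy, the sketch below ranks unlabelled instances by a weighted mix of a confidence score and an informative score, using the 0.5:0.5 ratio reported in the abstract as the default. How the informative score itself is computed is not stated in the abstract, so both scores are simply taken as inputs here; the function name `select_instances` and the top-k selection rule are assumptions.

```python
from typing import List, Tuple


def select_instances(
    candidates: List[Tuple[str, float, float]],
    k: int,
    conf_ratio: float = 0.5,
) -> List[str]:
    """Pick the k unlabelled instances with the highest combined score.

    Each candidate is (text, confidence_score, informative_score); both
    scores are assumed to be precomputed and on a comparable scale.
    """
    info_ratio = 1.0 - conf_ratio
    ranked = sorted(
        candidates,
        key=lambda c: conf_ratio * c[1] + info_ratio * c[2],
        reverse=True,
    )
    return [text for text, _, _ in ranked[:k]]
```

Setting `conf_ratio=1.0` reduces the rule to a confidence-only baseline, which makes it easy to compare selection with and without the informative score.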
Funding: We acknowledge Taif University for supporting this study through Taif University Researchers Supporting Project number (TURSP-2020/173), Taif University, Taif, Saudi Arabia.
Abstract: Sentiment Analysis (SA) is a subfield of Natural Language Processing (NLP) that focuses on the identification and extraction of opinions in text from reviews, social media, blogs, news, and so on. SA can handle the drastically increasing volume of unstructured text by transforming it into structured data with the help of NLP and open-source tools. The current research work designs a novel Modified Red Deer Algorithm (MRDA) Extreme Learning Machine Sparse Autoencoder (ELMSAE) model for SA and classification. The proposed MRDA-ELMSAE technique initially performs pre-processing to transform the data into a compatible format. A TF-IDF vectorizer is then employed to extract features, while the ELMSAE model is applied to classify sentiments. Furthermore, optimal parameter tuning of the ELMSAE model is done using the MRDA technique. A wide range of simulation analyses was carried out, and the results of the comparative analysis establish the enhanced efficiency of the MRDA-ELMSAE technique over other recent techniques.
Funding: Project supported by the National Natural Science Foundation of China (Nos. 61703013, 91546111, 91646201, 61672070, and 61672071), the Beijing Municipal Natural Science Foundation (No. 4152005), the Key Projects of Beijing Municipal Education Commission (Nos. KZ201610005009 and KM201810005024), and the International Cooperation Seed Grant from Beijing University of Technology of 2016 (No. 007000514116520).
Abstract: Automatic classification of sentiment data (e.g., reviews, blogs) has many applications in enterprise user management systems, and can help us understand people's attitudes about products or services. However, it is difficult to train an accurate sentiment classifier for different domains. One of the major reasons is that people often use different words to express the same sentiment in different domains, and we cannot easily find a direct mapping relationship between them to reduce the differences between domains. As a result, the accuracy of a sentiment classifier declines sharply when a classifier trained in one domain is applied to other domains. In this paper, we propose a novel approach called words alignment based on association rules (WAAR) for cross-domain sentiment classification, which can establish an indirect mapping relationship between domain-specific words in different domains by learning the strong association rules between domain-shared words and domain-specific words in the same domain. In this way, the differences between the source domain and target domain can be reduced to some extent, and a more accurate cross-domain classifier can be trained. Experimental results on Amazon datasets show the effectiveness of our approach in improving the performance of cross-domain sentiment classification.
Funding: Projects (61573380, 61303185) supported by the National Natural Science Foundation of China; Project (13BTQ052) supported by the National Social Science Foundation of China; Project (2016M592450) supported by the China Postdoctoral Science Foundation; Project (2016JJ4119) supported by the Hunan Provincial Natural Science Foundation of China.
Abstract: With the rise and spread of micro-blogs, the sentiment classification of short texts has become a research hotspot, and several methods have been developed in the past decade. However, since Chinese and English differ in syntax, semantics and pragmatics, sentiment classification methods that are effective for English Twitter may fail on Chinese micro-blogs. In addition, the colloquialism and conciseness of short Chinese texts introduce additional challenges to sentiment classification. In this work, a novel hybrid learning model with two stages was proposed for sentiment classification of Chinese micro-blogs. In the first stage, emotional scores were calculated over the whole dataset by utilizing an improved Chinese-oriented sentiment dictionary classification method. Data with extremely high or low scores were directly labeled. In the second stage, the remaining data were labeled by using an integrated classification method based on sentiment dictionary, support vector machine (SVM) and k-nearest neighbor (KNN). An improved feature selection method was adopted to enhance the discriminative power of the selected features. The two-stage hybrid framework made the proposed method effective for sentiment classification of Chinese micro-blogs. Experiments on the COAE2014 (Chinese Opinion Analysis Evaluation 2014) dataset show that the proposed method outperforms other schemes.
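The two-stage flow in this abstract can be pictured with the short sketch below. It is only an outline under stated assumptions: the thresholds `high` and `low` are hypothetical parameters, each second-stage voter stands in for one of the dictionary, SVM, and KNN classifiers, and a simple majority vote is assumed as the integration rule, which the abstract does not specify.

```python
from typing import Callable, Sequence


def two_stage_label(
    text: str,
    dict_score: float,
    high: float,
    low: float,
    voters: Sequence[Callable[[str], int]],
) -> int:
    """Return +1 (positive) or -1 (negative) for one micro-blog text.

    Stage 1: texts with an extreme dictionary score are labeled directly.
    Stage 2: the rest are labeled by a vote of the supplied classifiers,
    each of which returns +1 or -1 (assumed interface).
    """
    if dict_score >= high:
        return 1
    if dict_score <= low:
        return -1
    votes = sum(voter(text) for voter in voters)
    return 1 if votes > 0 else -1
```

With an odd number of voters, as in the three-classifier setup described above, the vote can never tie.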
Funding: Supported by the National Key Research and Development Program of China (2018YFC0830700).
Abstract: This paper presents a method for aspect-based sentiment classification tasks, named convolutional multi-head self-attention memory network (CMA-MemNet). This is an improved model based on memory networks that makes it possible to extract richer and more complex semantic information from sequences and aspects. To address the memory network's inability to capture context-related information at the word level, we propose utilizing convolution to capture n-gram grammatical information. We use multi-head self-attention to compensate for the memory network ignoring the semantic information of the sequence itself. Meanwhile, unlike most recurrent neural network (RNN), long short-term memory (LSTM), and gated recurrent unit (GRU) models, we retain the parallelism of the network. We experiment on the open datasets SemEval-2014 Task 4 and SemEval-2016 Task 6. Compared with some popular baseline methods, our model performs excellently.
Funding: Supporting Project Number (PNURSP2022R281), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia. The authors would like to thank the Deanship of Scientific Research at Umm Al-Qura University for supporting this work by Grant Code: (22UQU4320484DSR08).
Abstract: Sentiment analysis (SA) is a growing field at the intersection of computer science and computational linguistics that endeavors to automatically identify the sentiment presented in text. Computational linguistics aims to describe the fundamental methods utilized in the formation of computer methods for understanding natural language. Sentiment is classified as a negative or positive assessment articulated through language. SA is commonly used for movie review classification, which involves automatically determining whether a review posted online is negative or positive toward the movie being reviewed. Deep learning (DL) is becoming a powerful machine learning (ML) method for dealing with the increasing demand for precise SA. With this motivation, this study designs a computational intelligence enabled modified sine cosine optimization with an adaptive deep belief network for movie review classification (MSCADBN-MVC) technique. The MSCADBN-MVC technique focuses on identifying the sentiments that exist in movie review data. First, the MSCADBN-MVC model performs data pre-processing and the word2vec word embedding process. The ADBN model is then used to classify the sentiments that exist in the movie reviews. Finally, the hyperparameter tuning of the ADBN model is carried out using the MSCA technique, which integrates Levy flight concepts into the standard sine cosine algorithm (SCA). To demonstrate the performance of the MSCADBN-MVC model, a wide-ranging experimental analysis is performed on three different datasets. The comprehensive study highlighted the enhancements of the MSCADBN-MVC model in the movie review classification process, with a maximum accuracy of 88.93%.
Abstract: The COVID-19 pandemic has become one of the most severe diseases in recent years. As it majorly affects the livelihood of people across the world, it is essential for administrators and healthcare professionals to be aware of the views of the community so as to monitor the severity of the spread of the outbreak. Public opinions are shared extensively on microblogging media such as Twitter, which is considered one of the popular sources for collecting public opinion on any topic, such as politics, sports, and entertainment. This work presents a combination of an Intensity Based Emotion Classification Convolution Neural Network (IBEC-CNN) model and Non-negative Matrix Factorization (NMF) for detecting and analyzing the different topics discussed in COVID-19 tweets as well as the intensity of the emotional content of those tweets. The topics were identified using NMF, and the emotions were classified using the pretrained IBEC-CNN, based on predefined intensity scores. The research aimed at identifying the emotions in Indian tweets related to COVID-19 and producing a list of topics discussed by users during the COVID-19 pandemic. Using the Twitter Application Programming Interface (Twitter API), a large number of COVID-19 tweets were retrieved during January and July 2020. The extracted tweets are analyzed for the emotions fear, joy, sadness and trust with the proposed pretrained IBEC-CNN model. Each classified tweet is given an intensity score from 1 to 3, with 1 being low intensity for the emotion, 2 moderate, and 3 high. To identify the topics in the tweets and the themes of those topics, NMF has been employed. The emotion analysis of COVID-19 tweets found that positive tweets outnumber negative tweets during the period considered, and that negative tweets related to COVID-19 account for less than 5%. Also, more than 75% of the negative tweets that expressed sadness and fear are of low intensity. A qualitative analysis has also been conducted, and the topics detected are grouped into themes such as economic impacts, case reports, treatments, entertainment and vaccination. The results of the analysis show that issues related to the pandemic are expressed with different emotions on Twitter, which helps in interpreting public insights during the pandemic; these results are beneficial for planning the dissemination of factual health statistics to build the trust of the people. The performance comparison shows that the proposed IBEC-CNN model outperforms the conventional models and achieved 83.71% accuracy. The percentage of COVID-19 tweets discussing each topic varies from 7.45% to 26.43% across the topics Economy, Statistics on Cases, Government/Politics, Entertainment, Lockdown, Treatments and Virtual Events. Politics/government was the least discussed topic, whereas treatments were the most discussed.
Funding: This work is supported in part by the Natural Science Foundation of China under grants 61503112, 61673152 and 61503116.
Abstract: Recently, the effectiveness of neural networks, especially convolutional neural networks, has been validated in the field of natural language processing, in which sentiment classification for online reviews is an important and challenging task. Existing convolutional neural networks extract important features of sentences without local features or the feature sequence. Thus, these models do not perform well, especially for transition sentences. To this end, we propose a Piecewise Pooling Convolutional Neural Network (PPCNN) for sentiment classification. Firstly, with a sentence represented by word vectors, a convolution operation is introduced to obtain the convolution feature map vectors. Secondly, these vectors are segmented according to the positions of transition words in sentences. Thirdly, the most significant feature of each local segment is extracted using a max pooling mechanism, so that the different aspects of features can be extracted while the relative sequence of these features is preserved. Finally, after being processed by the dropout algorithm, the softmax classifier is trained for sentiment classification. Experimental results show that the proposed PPCNN method is effective and superior to other baseline methods, especially for datasets with transition sentences.
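The piecewise pooling step can be illustrated with a small dependency-free sketch: the feature map is cut at given positions and each segment is max-pooled per filter, with the pooled values kept in segment order. The function name `piecewise_max_pool` and the convention that a cut point marks the first position of a new segment are assumptions made for illustration; the abstract does not say exactly where the cut falls relative to the transition word.

```python
from typing import List, Sequence


def piecewise_max_pool(
    feature_map: Sequence[Sequence[float]],
    cut_points: Sequence[int],
) -> List[float]:
    """Max-pool each segment of a convolution feature map separately.

    feature_map[t][f] is the activation of filter f at position t.
    cut_points are positions where a new segment starts (for example, the
    index of a transition word); out-of-range or duplicate cuts are ignored.
    The result concatenates per-filter maxima segment by segment, so the
    relative order of the segments is preserved.
    """
    n = len(feature_map)
    inner = sorted(p for p in set(cut_points) if 0 < p < n)
    bounds = [0] + inner + [n]
    pooled: List[float] = []
    for start, end in zip(bounds, bounds[1:]):
        segment = feature_map[start:end]
        # zip(*segment) iterates over filters (columns) within the segment.
        pooled.extend(max(column) for column in zip(*segment))
    return pooled
```

With no cut points the function reduces to ordinary global max pooling, which is the behaviour the piecewise variant is meant to refine for sentences containing transitions.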
Funding: Princess Nourah bint Abdulrahman University Researchers Supporting Project Number (PNURSP2022R281), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia. Deanship of Scientific Research at Umm Al-Qura University for supporting this work by Grant Code: (22UQU4340237DSR61).
Abstract: Currently, individuals use online social media, such as Facebook or Twitter, to share their thoughts and emotions. Detection of emotions on social networking sites is useful in several applications in social welfare, commerce, public health, and so on. Emotion is expressed in several ways, such as facial and speech expressions, gestures, and written text. Emotion recognition in a text document is a content-based classification problem that draws on notions from the deep learning (DL) and natural language processing (NLP) domains. This article proposes a Deer Hunting Optimization with Deep Belief Network Enabled Emotion Classification (DHODBN-EC) model for English Twitter data. The presented DHODBN-EC model aims to examine the existence of distinct emotion classes in tweets. At the introductory level, the DHODBN-EC technique pre-processes the tweets at different levels. Next, the word2vec feature extraction process is applied to generate word embeddings. For emotion classification, the DHODBN-EC model utilizes the DBN model, which helps to determine distinct emotion class labels. Lastly, the DHO algorithm is leveraged for optimal hyperparameter adjustment of the DBN technique. An extensive range of experimental analyses was carried out to demonstrate the enhanced performance of the DHODBN-EC approach. A comprehensive comparison study exhibited the improvements of the DHODBN-EC model over other approaches, with an increased accuracy of 96.67%.
Funding: National Natural Science Foundation of China (No. 61562057); Gansu Science and Technology Plan Project (No. 18JR3RA104).
Abstract: With the development of the short video industry, videos and bullet screens have become important ways to spread public opinion. Public attitudes can be obtained in a timely manner through emotional analysis of bullet screens, which can also reduce difficulties in the management of online public opinion. A convolutional neural network model based on multi-head attention is proposed to solve the problem of how to effectively model relations among words and identify key words in emotion classification tasks with short text content and a lack of complete context information. First, word positions are encoded so that the order information of input sequences can be used by the model. Second, a multi-head attention mechanism is used to obtain semantic expressions in different subspaces, effectively capture internal relevance, enhance dependency relationships among words, and highlight the emotional weights of key emotional words. Then a dilated convolution is used to increase the receptive field and extract more features. On this basis, the multi-head attention mechanism is combined with a convolutional neural network to model and analyze the seven emotional categories of bullet screens. Experiments from the perspectives of both model and dataset validate the effectiveness of our approach. Finally, the emotions of bullet screens are visualized to provide data support for hot event control and other fields.
Funding: Supported by the National High Technology Research and Development Program of China (863 Program) (2008AA01Z144) and the National Natural Science Foundation of China (60803093, 60975055).
Funding: This study was supported by the National Key R&D Program of China, Grant No. 2018YFA0306703.
Abstract: The ability of the pre-trained BERT model to achieve outstanding performance on many Natural Language Processing (NLP) tasks has attracted the attention of researchers in recent times. However, its huge computational and memory requirements have hampered its widespread deployment on devices with limited resources. Knowledge distillation has been shown to produce smaller and faster distilled models with fewer trainable parameters, intended for resource-constrained environments. The distilled models can be fine-tuned with great performance on a wider range of tasks, such as sentiment classification. This paper evaluates the performance of the DistilBERT model and other pre-canned text classifiers on a COVID-19 online news binary classification dataset. The analysis shows that despite having fewer trainable parameters than the BERT-based model, the DistilBERT model achieved an accuracy of 0.94 on the validation set after only two training epochs. The paper also highlights the usefulness of the ktrain library in facilitating the building, training, and application of state-of-the-art Machine Learning and Deep Learning models.
Funding: National Natural Science Foundation of China (No. 71331008).
Abstract: Sentiment analysis is increasingly important in modern natural language processing, and sentiment classification is one of its most popular applications. The crucial part of sentiment classification is feature extraction. In this paper, two approaches to feature extraction, feature selection and feature embedding, are compared. Word2Vec is then used as the embedding method. In the experiment, Chinese documents are used as the corpus, and three methods are used to obtain the features of a document: average word vectors, Doc2Vec and weighted average word vectors. After that, these samples are fed to three machine learning algorithms for classification, among which the support vector machine (SVM) gives the best result. Finally, the parameters of random forest are analyzed.
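To make the difference between plain and weighted average word vectors concrete, here is a minimal sketch in pure Python. The abstract does not state which weighting scheme is used, so the per-word `weights` mapping is an assumption (TF-IDF values would be one common choice), as are the function name and the convention that words missing from `weights` default to 1.0.

```python
from typing import Dict, List, Sequence


def weighted_average_vector(
    tokens: Sequence[str],
    vectors: Dict[str, List[float]],
    weights: Dict[str, float],
    dim: int,
) -> List[float]:
    """Build a document vector as a weighted mean of its word vectors.

    Tokens without an embedding are skipped. Tokens without an explicit
    weight count with weight 1.0, so an empty `weights` dict yields the
    plain average word vector.
    """
    total = [0.0] * dim
    weight_sum = 0.0
    for token in tokens:
        if token not in vectors:
            continue
        w = weights.get(token, 1.0)
        for i, value in enumerate(vectors[token]):
            total[i] += w * value
        weight_sum += w
    if weight_sum == 0.0:
        # No known tokens: return the zero vector rather than dividing by zero.
        return total
    return [value / weight_sum for value in total]
```

The resulting fixed-length vector can be passed to any standard classifier, such as the SVM mentioned in the abstract.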
Abstract: With the rapid development of the social economy, society has entered a new stage of development. Against the background of rapidly developing new media in particular, the importance of news and information has risen across the board. To further distinguish positive from negative news, machine learning methods should be fully used to classify news automatically by emotion and thereby improve the efficiency of news classification. Therefore, the article first clarifies the basic outline of news sentiment classification. Secondly, the specific approach to automatic classification of news emotion is analyzed in depth. On this basis, the paper puts forward concrete measures for the automatic classification of news emotion using machine learning.
Funding: This paper is supported by the National Natural Science Foundation of China (No. 61074078) and the Fundamental Research Funds for the Central Universities (No. 12MS121).
Funding: Projects (61170156, 60933005) supported by the National Natural Science Foundation of China.
Abstract: Sentiment analysis is the computational study of how opinions, attitudes, emotions, and perspectives are expressed in language, and has been an important task of natural language processing. Sentiment analysis is highly valuable for both research and practical applications. This work focuses on the difficulty of constructing sentiment classifiers, which normally need a tremendous amount of labeled domain training data, and proposes a novel unsupervised framework that makes use of Chinese idiom resources to develop a general sentiment classifier. Furthermore, the domain adaptation of the general sentiment classifier was improved by taking the general classifier as the base of a self-training procedure to obtain a domain self-training sentiment classifier. To validate the effect of the unsupervised framework, several experiments were carried out on a publicly available Chinese online reviews dataset. The experiments show that the proposed framework is effective and achieves encouraging results. Specifically, the general classifier outperforms two baselines (a naïve 50% baseline and a cross-domain classifier), and the bootstrapping self-training classifier approximates the upper-bound domain-specific classifier with a lowest accuracy of 81.5%, while its performance is more stable and the framework needs no labeled training dataset.
Funding: This study was supported by the Natural Science Foundation of Heilongjiang Province under Grant No. F2016036, the National Natural Science Foundation of China under Grant No. 61170148, and the Returned Scholar Foundation of Heilongjiang Province.
Abstract: Although much progress has been made to date on sentiment classification, the lack of annotated corpora remains a problem. In this paper we propose to expand corpora for Chinese polarity classification via opinion paraphrase generation. To this end, we first exploit three strategies for opinion paraphrase generation, namely sentence re-ordering, opinion element substitution and explicit attribution implying. To improve the quality of the generated opinion paraphrases, we define four criteria for opinion paraphrase evaluation and present a filtering algorithm to discard improper opinion paraphrase candidates. To assess the proposed method, we further apply the expanded corpus to an SVM classifier for polarity classification. The experimental results show that the generated opinion paraphrases are beneficial to polarity classification.