Funding: Supported by the National Natural Science Foundation of China (Nos. 60405011, 60575057, and 60875073).
Abstract: This paper is an empirical study of unsupervised sentiment classification of Chinese reviews. It explores ways to improve unsupervised sentiment classification given the limited sentiment resources available for Chinese. On the one hand, all available Chinese sentiment lexicons, individually and in combination, are evaluated under our proposed framework. On the other hand, domain-dependent sentiment noise words are identified and removed using unlabeled data to improve classification performance. To the best of our knowledge, this is the first such attempt. Experiments were conducted on three open datasets in two domains, and the results show that the proposed algorithm for removing sentiment noise words improves classification performance significantly.
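The abstract does not say how noise words are detected, so the following is only a minimal sketch of the general idea, not the paper's algorithm. It assumes a hypothetical criterion: a lexicon word that occurs in most unlabeled reviews of a domain is treated as noise. The lexicon entries, the `max_doc_freq` threshold, and the function names are all illustrative, and the reviews are pre-tokenized English stand-ins (Chinese text would first need word segmentation).

```python
from collections import Counter

# Toy lexicons (hypothetical entries, not taken from any published resource).
POSITIVE = {"good", "great", "clean", "hot"}
NEGATIVE = {"bad", "dirty", "noisy", "slow"}


def find_noise_words(unlabeled_docs, lexicon, max_doc_freq=0.6):
    """Flag lexicon words found in more than max_doc_freq of the documents.

    Assumption: a sentiment word that shows up in most reviews of a domain
    carries little polarity there. The paper's real criterion is not given.
    """
    doc_freq = Counter()
    for tokens in unlabeled_docs:
        # Count each lexicon word at most once per document.
        doc_freq.update(set(tokens) & lexicon)
    n_docs = len(unlabeled_docs)
    return {word for word, df in doc_freq.items() if df / n_docs > max_doc_freq}


def classify(tokens, positive, negative):
    """Label a tokenized review by counting positive and negative lexicon hits."""
    score = sum(t in positive for t in tokens) - sum(t in negative for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Unlabeled hotel reviews: "hot" mostly describes water, not opinion.
unlabeled = [
    ["hot", "water", "good"],
    ["hot", "water", "dirty"],
    ["hot", "shower", "noisy"],
    ["room", "clean"],
]
noise = find_noise_words(unlabeled, POSITIVE | NEGATIVE)

review = ["hot", "water", "dirty"]
before = classify(review, POSITIVE, NEGATIVE)
after = classify(review, POSITIVE - noise, NEGATIVE - noise)
```

In this toy domain, "hot" is flagged as noise, and removing it changes the label of the sample review from a tie to negative. A real system would tune the threshold per domain or replace the heuristic altogether.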
Funding: Chinese National Science Foundation (#61763007); the higher education research project of the National Ethnic Affairs Commission, "Research and Practice on the Training Mode of Applied Innovative Software Talents Based on Collaborative Education and Innovation" (17056); and the Innovation Team project of Xiangsihu Youth Scholars of Guangxi University for Nationalities.
Abstract: Sentiment analysis is one of the most popular fields in NLP, and with the development of computer software and hardware, its applications are increasingly extensive. Labeled corpora benefit model training, but they are prohibitively expensive to produce manually. This paper proposes a deep learning sentiment analysis model based on transfer learning. The model represents both the sentiment and the semantics of words, and it improves Vietnamese sentiment analysis by using an English corpus. It generates semantic vectors with Word2Vec, an open-source tool, and builds sentiment vectors with an LSTM with an attention mechanism to obtain sentiment word vectors. The model is pre-trained on the English corpus through parameter sharing. Finally, the sentiment of a text is classified by a stacked Bi-LSTM with an attention mechanism that takes the sentiment word vectors as input. Experiments show that the model effectively improves the performance of Vietnamese sentiment analysis when only a small corpus is available.
Abstract: This study addresses the problem of Chinese microblog opinion retrieval, which aims to retrieve opinionated Chinese microblog posts relevant to a target specified by a user query. Existing lexicon-based approaches use public online sentiment resources and rank sentiment words by document features. However, these approaches cannot be applied effectively to microblogs, which are typical user-generated content with valuable contextual information: "user-user" interpersonal interactions and "user-post/comment" intrapersonal interactions. This contextual information helps estimate the strength of sentiment words more accurately. In this study, we integrate the social contextual relationships among users, posts/comments, and sentiment words into a mutual reinforcement model and propose a unified three-layer heterogeneous graph, on which a random walk sentiment word weighting algorithm measures the opinion strength of the sentiment words. Furthermore, the weights of sentiment words are incorporated into a lexicon-based model for Chinese microblog opinion retrieval. Comparative experiments on a Chinese microblog corpus show that the proposed mutual reinforcement model achieves significant improvement over previous methods.
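The abstract names a random walk over a graph of users, posts/comments, and sentiment words but gives no transition rules, so the following is only a rough sketch under assumptions. It runs a generic random walk with uniform restart on an undirected toy graph and reads off the scores of the word nodes. The node names, edge list, `restart` value, and function name are hypothetical and do not reproduce the paper's mutual reinforcement model.

```python
def random_walk_scores(edges, restart=0.15, iterations=50):
    """Power iteration for a random walk with uniform restart.

    Assumption: the graph is undirected and unweighted. The paper's actual
    layer-specific transition probabilities are not given in the abstract.
    """
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    nodes = sorted(neighbors)
    score = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        # Every node receives an equal share of the restart mass.
        nxt = {n: restart / len(nodes) for n in nodes}
        for n in nodes:
            # The remaining mass is split evenly among the node's neighbors.
            share = (1.0 - restart) * score[n] / len(neighbors[n])
            for m in neighbors[n]:
                nxt[m] += share
        score = nxt
    return score


# Toy three-layer graph: users, posts, and sentiment words (all hypothetical).
edges = [
    ("user:u1", "user:u2"),      # user-user interaction
    ("user:u1", "post:p1"),      # user-post authorship
    ("user:u2", "post:p2"),
    ("user:u2", "post:p3"),
    ("post:p1", "word:good"),    # post contains sentiment word
    ("post:p2", "word:good"),
    ("post:p3", "word:good"),
    ("post:p3", "word:awful"),
]
scores = random_walk_scores(edges)
word_weights = {n: s for n, s in scores.items() if n.startswith("word:")}
```

Here the word linked to more posts ends up with the larger weight. Such weights could then scale each word's contribution in a lexicon-based retrieval score, which is the role the abstract assigns to them.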