Sentiment analysis is now more and more important in modern natural language processing,and the sentiment classification is the one of the most popular applications.The crucial part of sentiment classification is feat...Sentiment analysis is now more and more important in modern natural language processing,and the sentiment classification is the one of the most popular applications.The crucial part of sentiment classification is feature extraction.In this paper,two methods for feature extraction,feature selection and feature embedding,are compared.Then Word2Vec is used as an embedding method.In this experiment,Chinese document is used as the corpus,and tree methods are used to get the features of a document:average word vectors,Doc2Vec and weighted average word vectors.After that,these samples are fed to three machine learning algorithms to do the classification,and support vector machine(SVM) has the best result.Finally,the parameters of random forest are analyzed.展开更多
基金National Natural Science Foundation of China(No.71331008)
文摘Sentiment analysis is now more and more important in modern natural language processing,and the sentiment classification is the one of the most popular applications.The crucial part of sentiment classification is feature extraction.In this paper,two methods for feature extraction,feature selection and feature embedding,are compared.Then Word2Vec is used as an embedding method.In this experiment,Chinese document is used as the corpus,and tree methods are used to get the features of a document:average word vectors,Doc2Vec and weighted average word vectors.After that,these samples are fed to three machine learning algorithms to do the classification,and support vector machine(SVM) has the best result.Finally,the parameters of random forest are analyzed.