Funding: Supported by the Yunnan Provincial Education Department Science Foundation of China under the grant for construction of the seventh batch of key engineering research centers in colleges and universities (Grant Project: Yunnan College and University Edge Computing Network Engineering Research Center).
Abstract: Sentiment analysis, commonly called opinion mining or emotion artificial intelligence (AI), employs biometrics, computational linguistics, natural language processing, and text analysis to systematically identify, extract, measure, and investigate affective states and subjective data. Sentiment analysis algorithms include emotion lexicons, traditional machine learning, and deep learning. Among neural-network-based text sentiment analysis algorithms, the multi-layer bi-directional long short-term memory (LSTM) network is widely used, but the model has a very large number of parameters. Hence, this paper proposes a bi-directional LSTM model with a trapezoidal structure. The design of the trapezoidal structure is derived from classic neural networks such as LeNet-5 and AlexNet. These classic models have trapezoid-like structures, and these structures have been successful in deep learning. There are two benefits to using the bi-directional LSTM with a trapezoidal structure. First, compared with a single-layer configuration, the multi-layer structure can better extract the high-dimensional features of the text. Second, the trapezoidal structure reduces the number of model parameters. This paper introduces the model in detail and uses the Stanford Sentiment Treebank 2 (STS-2) for experiments. The experimental results show that the trapezoidal structure model and the normal structure model perform similarly, while the trapezoidal structure model has 35.75% fewer parameters than the normal structure model.
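To see why narrowing the upper layers saves parameters, the following minimal sketch counts the weights of a stacked bi-directional LSTM. It assumes the textbook LSTM cell (four gates, one bias vector per gate); the embedding size and layer widths are illustrative choices, not the configuration used in the paper, so the printed saving will not match the reported 35.75%.

```python
def lstm_params(input_size: int, hidden_size: int) -> int:
    """Parameters of one unidirectional LSTM layer (4 gates, one bias per gate)."""
    return 4 * (hidden_size * (input_size + hidden_size) + hidden_size)


def bilstm_stack_params(embed_dim: int, hidden_sizes: list[int]) -> int:
    """Total parameters of a stack of bi-directional LSTM layers."""
    total = 0
    input_size = embed_dim
    for hidden in hidden_sizes:
        total += 2 * lstm_params(input_size, hidden)  # forward + backward direction
        input_size = 2 * hidden  # next layer receives both directions concatenated
    return total


# Hypothetical sizes: same first layer, but the trapezoid halves each later layer.
uniform = bilstm_stack_params(300, [256, 256, 256])
trapezoid = bilstm_stack_params(300, [256, 128, 64])
saving = 1 - trapezoid / uniform
print(f"uniform={uniform:,}  trapezoid={trapezoid:,}  saving={saving:.1%}")
```

Because each layer's cost grows roughly with the square of its width, shrinking the later layers removes a large share of the total even though the first layer is unchanged.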
Abstract: Sentiment analysis, a persistently active research area in text mining, requires a computational method for extracting useful information from text. Social media has recently become a rich source of information about people's opinions through reviews and comments. Numerous techniques have been proposed to analyze the sentiment of text; however, they have not been able to cope with the complexity of sentiments. This complexity requires a novel approach to deep sentiment analysis for more accurate prediction. This research presents a three-step Sentiment Analysis and Prediction (SAP) solution for text trends based on K-Nearest Neighbor (KNN). First, sentences are transformed into tokens and stop words are removed. Second, the polarity of the sentence, paragraph and text is calculated from contributing weighted words, intensity clauses and sentiment shifters. The features extracted in this step played a significant role in improving the results. Finally, the trend of the input text is predicted using a KNN classifier based on the extracted features. The model was trained and tested on publicly available datasets of Twitter posts and movie reviews. Experimental results showed a satisfactory improvement over existing solutions. In addition, a GUI (Hello World) based text analysis framework has been designed to perform the text analytics.
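The three steps described above can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the authors' SAP implementation: the stop-word list, word weights, intensifiers, shifters, the two-dimensional feature vector and the training sentences are all hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical lexicons, for illustration only.
STOP_WORDS = {"the", "a", "an", "is", "was", "this", "it"}
WORD_WEIGHTS = {"good": 1.0, "great": 2.0, "bad": -1.0, "awful": -2.0}
INTENSIFIERS = {"very": 1.5, "really": 1.5}
SHIFTERS = {"not", "never"}


def tokenize(text):
    """Step 1: lowercase, split into tokens, drop stop words."""
    return [t for t in re.findall(r"[a-z']+", text.lower()) if t not in STOP_WORDS]


def features(text):
    """Step 2: return (polarity score, number of sentiment words)."""
    score, hits, scale, flip = 0.0, 0, 1.0, 1.0
    for tok in tokenize(text):
        if tok in INTENSIFIERS:
            scale *= INTENSIFIERS[tok]      # strengthen the next sentiment word
        elif tok in SHIFTERS:
            flip = -flip                    # negate the next sentiment word
        elif tok in WORD_WEIGHTS:
            score += flip * scale * WORD_WEIGHTS[tok]
            hits += 1
            scale, flip = 1.0, 1.0          # modifiers apply to one word only
    return (score, float(hits))


def knn_predict(train, query, k=3):
    """Step 3: majority label among the k nearest training feature vectors."""
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


# Hypothetical labeled examples.
train_texts = [
    ("This was a great movie", "positive"),
    ("really good acting", "positive"),
    ("not bad", "positive"),
    ("an awful plot", "negative"),
    ("very bad", "negative"),
    ("not good", "negative"),
]
train = [(features(text), label) for text, label in train_texts]
print(knn_predict(train, features("not very good")))
```

A real system would use a much richer feature set and a proper lexicon; the point here is only how shifters and intensifiers modify word weights before the KNN step.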
Abstract: English text sentiment orientation analysis is a fundamental problem in the field of natural language processing. The traditional word segmentation method can produce ambiguity when dealing with English text. Therefore, this paper proposes a novel English text sentiment analysis method based on a convolutional neural network and U-network. The proposed method uses a parallel convolution layer to learn the associations and combinations between word vectors. The results are then fed into a hierarchical attention network whose basic unit is the U-network to determine the affective tendency. The experimental results show that the accuracy of bias classification on the English review dataset reaches 93.45%, which is higher than that of many existing sentiment analysis models.
Funding: National Natural Science Foundation of China (No. 61562057); Gansu Science and Technology Plan Project (No. 18JR3RA104).
Abstract: With the development of the short video industry, videos and bullet screens have become important channels for spreading public opinion. Public attitudes can be obtained in a timely manner through emotion analysis of bullet screens, which also eases the management of online public opinion. A convolutional neural network model based on multi-head attention is proposed to address how to effectively model relations among words and identify key words in emotion classification tasks where texts are short and lack complete context. First, word positions are encoded so that the model can use the order information of input sequences. Second, a multi-head attention mechanism is used to obtain semantic representations in different subspaces, effectively capture internal relevance, strengthen dependencies among words, and highlight the emotional weights of key emotional words. Then a dilated convolution is used to enlarge the receptive field and extract more features. On this basis, the multi-head attention mechanism is combined with a convolutional neural network to model and analyze seven emotional categories of bullet screens. Experiments from both the model and the dataset perspectives validate the effectiveness of the approach. Finally, the emotions of bullet screens are visualized to provide data support for hot event control and other fields.
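The effect of dilation on the receptive field can be checked with a short calculation. The sketch below assumes stacked one-dimensional convolutions with stride 1 and a shared kernel size; the kernel size and dilation rates are illustrative and are not taken from the paper.

```python
def receptive_field(kernel_size, dilations):
    """Receptive field (in tokens) of stacked stride-1 1-D convolutions."""
    field = 1
    for dilation in dilations:
        # Each layer widens the field by (kernel_size - 1) * dilation positions.
        field += (kernel_size - 1) * dilation
    return field


plain = receptive_field(3, [1, 1, 1])    # three ordinary convolutions
dilated = receptive_field(3, [1, 2, 4])  # same depth, growing dilation
print(plain, dilated)
```

With the same number of layers and the same number of weights, the dilated stack covers a wider span of the input, which is useful when a short text offers little context per position.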
Abstract: The emergence of big data has led to an increasing demand for data processing methods. As the most influential platform for movie ratings in China, Douban contains a huge amount of data, and users' views of these movies can be understood by analyzing it. In this article, we study movie reviews from the Douban website, perform sentiment analysis on the data obtained by crawling, and visualize the results with a word cloud. We propose a lightweight sentiment analysis method that requires no heavy training and visualizes the results in a more comprehensible way.
Funding: Projects (61573380, 61303185) supported by the National Natural Science Foundation of China; Project (13BTQ052) supported by the National Social Science Foundation of China; Project (2016M592450) supported by the China Postdoctoral Science Foundation; Project (2016JJ4119) supported by the Hunan Provincial Natural Science Foundation of China.
Abstract: With the rise and spread of micro-blogs, the sentiment classification of short texts has become a research hotspot. Some methods have been developed in the past decade. However, since Chinese and English differ in syntax, semantics and pragmatics, sentiment classification methods that are effective for English Twitter may fail on Chinese micro-blogs. In addition, the colloquialism and conciseness of short Chinese texts introduce additional challenges to sentiment classification. In this work, a novel two-stage hybrid learning model is proposed for sentiment classification of Chinese micro-blogs. In the first stage, emotional scores are calculated over the whole dataset using an improved Chinese-oriented sentiment dictionary classification method, and data with extremely high or low scores are labeled directly. In the second stage, the remaining data are labeled using an integrated classification method based on a sentiment dictionary, support vector machine (SVM) and k-nearest neighbor (KNN). An improved feature selection method is adopted to enhance the discriminative power of the selected features. The two-stage hybrid framework makes the proposed method effective for sentiment classification of Chinese micro-blogs. Experiments on the COAE2014 (Chinese Opinion Analysis Evaluation 2014) dataset show that the proposed method outperforms other schemes.
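The control flow of such a two-stage scheme can be sketched as below. This is an assumption-laden outline rather than the authors' method: the thresholds, the toy scoring function and the stand-in classifiers are hypothetical, and simple majority voting is assumed for the second stage because the abstract does not say how the three classifiers are integrated.

```python
from collections import Counter


def two_stage_label(text, lexicon_score, classifiers, low=-0.8, high=0.8):
    """Stage 1: label extreme lexicon scores directly. Stage 2: majority vote."""
    score = lexicon_score(text)
    if score >= high:
        return "positive"
    if score <= low:
        return "negative"
    votes = Counter(classify(text) for classify in classifiers)
    return votes.most_common(1)[0][0]


# Hypothetical stand-ins for the dictionary scorer and the trained models.
def toy_score(text):
    words = text.split()
    pos = sum(w in {"love", "great"} for w in words)
    neg = sum(w in {"hate", "awful"} for w in words)
    return (pos - neg) / max(len(words), 1)


def dictionary_vote(text):
    return "positive" if toy_score(text) >= 0 else "negative"


def svm_stub(text):   # placeholder for a trained SVM
    return "negative" if "hate" in text.split() else "positive"


def knn_stub(text):   # placeholder for a trained KNN classifier
    return "negative" if "hate" in text.split() else "positive"


voters = [dictionary_vote, svm_stub, knn_stub]
print(two_stage_label("love great", toy_score, voters))          # decided in stage 1
print(two_stage_label("hate great ok fine", toy_score, voters))  # decided by vote
```

Using an odd number of voters with two labels avoids ties; a weighted vote would be a natural variation.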
Abstract: Bike sharing is considered a state-of-the-art transportation program. It is ideal for short or medium trips, allowing riders to pick up a bike at any self-serve bike station and return it to any bike station within the system's coverage area. Bike sharing programs in the United States are still very young compared to those in European countries. Washington DC was the first jurisdiction in the US to launch a third-generation bike sharing system, in 2008. To evaluate the popularity of a bike sharing program, a sentiment analysis of riders' feedback can be performed. Twitter is a useful platform for gauging people's views instantly, and social media mining is thus gaining popularity in many research areas, including transportation. Social media mining has two major advantages over conventional attitudinal survey methods: it can easily reach a large audience, and it can reflect the true behavior of participants because of the anonymity social media provides. It is known that self-imposed censorship is common in responses to conversational attitudinal surveys. This study performed text mining on tweets related to a case study (Capital Bikeshare of Washington DC) to carry out sentiment analysis, or opinion mining. The results mostly revealed positive sentiment towards the current system.
Abstract: The rapid growth of social networks has produced an unprecedented amount of user-generated data, which provides an excellent opportunity for text mining. Sentiment analysis, an important part of text mining, attempts to learn the author's opinion from the content and structure of a text. Such information is particularly valuable for determining the overall opinion of a large number of people, for example when predicting box office sales or stock prices. One of the most accessible sources of user-generated data is Twitter, which makes the majority of its user data freely available through its data access API. In this study we seek to predict a sentiment value for stock-related tweets on Twitter, and to demonstrate a correlation between this sentiment and the movement of a company's stock price in a real-time streaming environment. Both n-gram and "word2vec" textual representation techniques are used alongside a random forest classification algorithm to predict the sentiment of tweets. These values are then evaluated for correlation between stock prices and Twitter sentiment for each company. There are significant correlations between price and sentiment for several individual companies. Some companies, such as Microsoft and Walmart, show strong positive correlation, while others, such as Goldman Sachs and Cisco Systems, show strong negative correlation. This suggests that consumer-facing companies are affected differently than other companies. Overall, this appears to be a promising field for future research.
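As a rough illustration of the correlation step, the sketch below computes a Pearson coefficient between a daily sentiment series and a daily price-change series. The abstract does not state which correlation measure was used, so Pearson is an assumption, and both series are made-up numbers rather than data from the study.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical daily values: mean tweet sentiment and stock price change (%).
daily_sentiment = [0.10, 0.35, -0.20, 0.05, 0.40]
price_change = [0.3, 1.1, -0.8, 0.0, 0.9]
r = pearson(daily_sentiment, price_change)
print(f"r = {r:.2f}")
```

A coefficient near +1 or -1 corresponds to the strong positive or negative relationships mentioned above; values near 0 indicate little linear relationship.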
Abstract: Opinion (sentiment) analysis on big data streams, from the constantly generated text streams on social media networks to hundreds of millions of online consumer reviews, gives organizations in every field the opportunity to discover valuable intelligence in massive user-generated text. However, traditional content analysis frameworks are too inefficient to handle the unprecedented volume of unstructured text streams and the complexity of the text analysis tasks required for real-time opinion analysis. In this paper, we propose a parallel real-time sentiment analysis system, the Social Media Data Stream Sentiment Analysis Service (SMDSSAS), which performs multiple phases of sentiment analysis on social media text streams in real time, using two fully analytic opinion mining models to cope with the scale of the text streams and the complexity of sentiment analysis on unstructured text. The two aspect-based opinion mining models, a Deterministic and a Probabilistic sentiment model, provide real-time sentiment analysis on data streams related to a user-given topic. Experiments on Twitter stream traffic captured during the pre-election weeks of the 2016 Presidential election, for real-time analysis of public opinion toward the two presidential candidates, showed that the proposed system correctly predicted Donald Trump as the winner of the 2016 Presidential election. The cross-validation results showed that the proposed sentiment models, together with the real-time streaming components of the framework, analyzed opinions on the two presidential candidates with an average accuracy of 81% for the Deterministic model and 80% for the Probabilistic model, improvements of 1% to 22% over results in the existing literature.