Given the importance of analyzing public opinion on the Internet, the authors propose a high-performance method for extracting public opinion. The method adopts a classification space model to describe the relationship between words and categories. A combined feature selection method removes noisy words from the original feature space. The category weight of each word is then calculated with an improved formula that combines word frequency and word distribution. Finally, the class weights of uncategorized documents are obtained from the category weights of their words, which realizes opinion extraction. Experimental results show that the method achieves comparatively high classification performance and good stability.
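The word-to-category weighting described above can be illustrated with a minimal sketch. The abstract does not give the improved formula, so the weight used here is an assumption: the share of a word's occurrences that fall in a category (frequency) multiplied by the fraction of that category's documents containing the word (distribution). The function names `train_category_weights` and `classify` are hypothetical.

```python
from collections import Counter, defaultdict


def train_category_weights(docs):
    """docs: list of (tokens, category). Returns {category: {word: weight}}.

    Assumed weight (not the paper's formula):
    (occurrences of word in category / occurrences of word overall)
    * (documents in category containing word / documents in category).
    """
    term_freq = defaultdict(Counter)  # category -> word -> occurrence count
    doc_freq = defaultdict(Counter)   # category -> word -> docs containing it
    docs_per_cat = Counter()          # category -> number of documents
    for tokens, cat in docs:
        docs_per_cat[cat] += 1
        term_freq[cat].update(tokens)
        doc_freq[cat].update(set(tokens))

    total_freq = Counter()            # word -> occurrences across all categories
    for counts in term_freq.values():
        total_freq.update(counts)

    weights = {}
    for cat in docs_per_cat:
        weights[cat] = {
            w: (term_freq[cat][w] / total_freq[w])
            * (doc_freq[cat][w] / docs_per_cat[cat])
            for w in term_freq[cat]
        }
    return weights


def classify(tokens, weights):
    """Score an uncategorized document by summing its words' category weights."""
    scores = {
        cat: sum(word_weights.get(t, 0.0) for t in tokens)
        for cat, word_weights in weights.items()
    }
    return max(scores, key=scores.get), scores


if __name__ == "__main__":
    training = [
        (["price", "rise", "angry"], "negative"),
        (["price", "drop", "happy"], "positive"),
        (["happy", "great"], "positive"),
    ]
    print(classify(["price", "happy"], train_category_weights(training)))
```

Words that occur in several categories receive a lower weight in each, which roughly stands in for the noise-reducing effect of feature selection; the combined feature selection step itself is not modeled here.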
With the rapid development of e-commerce, large amounts of data, including reviews of many types of products, can be accessed within a short time. Opinion mining, and fine-grained opinion mining in particular, is therefore becoming increasingly effective for extracting valuable information for product design, improvement, and brand marketing. However, because opinions are expressed in an unstructured and casual way, valuable information cannot be extracted conveniently. In this paper, we propose an integrated strategy that automatically extracts feature-based information, with which one can easily acquire detailed opinions about particular products. To adapt to the characteristics of reviews, the strategy consists of a multi-label classification (MLC) of reviews, a binary classification (BC) of sentences, and a sentence-level sequence labelling step based on a deep learning method. In experiments, our approach achieves 82% accuracy on the final sequence labelling task under 20-fold cross-validation. In addition, the strategy can readily be applied to other reviews as long as an adequate amount of labelled data is available for startup.
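The evaluation protocol mentioned above, 20-fold cross-validation, can be sketched as follows. This is only an illustration of the protocol, not the authors' code: the helper names `k_fold_splits` and `cross_validated_accuracy` are hypothetical, the folds are contiguous and unshuffled (an assumption; the abstract does not say how the data were partitioned), and the model is passed in as two placeholder callables.

```python
def k_fold_splits(n_samples, k=20):
    """Yield (train_indices, test_indices) for k contiguous, near-equal folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        # The first `extra` folds take one additional sample each.
        size = base + (1 if fold < extra else 0)
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


def cross_validated_accuracy(samples, k, train_fn, predict_fn):
    """samples: list of (x, y) pairs. Returns the mean per-fold accuracy.

    train_fn(train_samples) -> model and predict_fn(model, x) -> label
    are placeholders for whatever classifier or sequence labeller is used.
    """
    accuracies = []
    for train_idx, test_idx in k_fold_splits(len(samples), k):
        model = train_fn([samples[i] for i in train_idx])
        correct = sum(
            predict_fn(model, samples[i][0]) == samples[i][1] for i in test_idx
        )
        accuracies.append(correct / len(test_idx))
    return sum(accuracies) / len(accuracies)
```

Each sample is used for testing exactly once and for training in the other 19 folds, so the reported figure is an average over 20 train/test runs.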
Opinion target extraction is one of the core tasks in sentiment analysis on text data. In recent years, dependency parser-based approaches have been commonly studied for opinion target extraction. However, dependency parsers are limited by language and grammatical constraints. Therefore, in this work, a sequential pattern-based rule mining model, which does not have such constraints, is proposed for cross-domain opinion target extraction from product reviews in unknown domains. Thus, knowing the domain of reviews is no longer a requirement for extracting opinion targets. The proposed model also reveals the difference between the concepts of opinion target and aspect, which are commonly confused in the literature. The model consists of two stages. In the first stage, the aspects of reviews are extracted from the target domain using rules automatically generated from source domains. The aspects are also transferred from the source domains to the target domain. Moreover, aspect pruning is applied to further improve the performance of aspect extraction. In the second stage, the opinion target is extracted from among the aspects found in the first stage using rules automatically generated for opinion target extraction. The proposed model was evaluated on several benchmark datasets in different domains and compared against the literature. The experimental results show that the opinion targets of reviews in unknown domains can be extracted with higher accuracy than in previous works.
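The first stage described above, rule-based aspect extraction followed by pruning, can be sketched in a simplified form. Everything concrete here is an assumption for illustration: the two rules in `RULES` are hand-written rather than mined, patterns must match contiguous part-of-speech tags (mined sequential patterns may allow gaps), and pruning is a plain frequency threshold. The abstract does not specify any of these details.

```python
from collections import Counter

# Hypothetical rules: a POS-tag sequence plus the offset of the aspect within it.
RULES = [
    (("JJ", "NN"), 1),         # adjective + noun: the noun is the aspect
    (("NN", "VBZ", "JJ"), 0),  # noun + verb + adjective: the noun is the aspect
]


def extract_aspects(tagged_sentence, rules=RULES):
    """tagged_sentence: list of (word, pos_tag). Returns aspect candidates."""
    tags = [tag for _, tag in tagged_sentence]
    found = []
    for pattern, offset in rules:
        n = len(pattern)
        # Slide the pattern over the tag sequence and collect every match.
        for i in range(len(tags) - n + 1):
            if tuple(tags[i:i + n]) == pattern:
                found.append(tagged_sentence[i + offset][0].lower())
    return found


def prune_aspects(candidates, min_count=2):
    """Keep candidates seen at least min_count times (assumed pruning criterion)."""
    counts = Counter(candidates)
    return {aspect for aspect, count in counts.items() if count >= min_count}


if __name__ == "__main__":
    reviews = [
        [("great", "JJ"), ("battery", "NN")],
        [("battery", "NN"), ("is", "VBZ"), ("weak", "JJ")],
        [("nice", "JJ"), ("box", "NN")],
    ]
    candidates = [a for sentence in reviews for a in extract_aspects(sentence)]
    print(prune_aspects(candidates))
```

Because the rules operate only on tag sequences, the same rule set can be applied to reviews from a domain that was not seen when the rules were generated, which is the property the cross-domain setting relies on.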
Opinion target extraction from Chinese microblogs plays an important role in opinion mining. Significant progress has been made in this area recently, especially with methods based on conditional random fields (CRF). However, such methods consider only lexicon-related features and do not exploit the implied syntactic and semantic knowledge. We propose a novel approach that combines a domain lexicon with groups of syntactic and semantic features. The approach acquires the domain lexicon in a novel way that explores syntactic and semantic information through part-of-speech tags, dependency structure, phrase structure, semantic roles, and semantic similarity based on word embeddings. We then combine the domain lexicon with the opinion targets extracted by a CRF that uses these groups of features. Experimental results on the COAE2014 dataset show that the approach outperforms other well-known methods on the task of opinion target extraction.
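The final combination step described above can be sketched as follows. The abstract does not say how the domain lexicon and the CRF output are merged, so this sketch assumes a simple order-preserving union. The CRF itself is not implemented: its output is assumed to arrive as B/I/O tags over pre-segmented tokens, and the function names are hypothetical.

```python
def spans_from_bio(tokens, tags, sep=""):
    """Decode B/I/O tags into target strings (sep="" suits segmented Chinese)."""
    spans, current = [], []
    for token, tag in zip(tokens, tags):
        if tag == "B":
            if current:
                spans.append(sep.join(current))
            current = [token]
        elif tag == "I" and current:
            current.append(token)
        else:
            # "O", or a stray "I" with no open span, closes any open span.
            if current:
                spans.append(sep.join(current))
            current = []
    if current:
        spans.append(sep.join(current))
    return spans


def combine_targets(tokens, crf_tags, lexicon, sep=""):
    """Union of CRF-decoded targets and lexicon hits, keeping first-seen order."""
    crf_targets = spans_from_bio(tokens, crf_tags, sep)
    lexicon_hits = [t for t in tokens if t in lexicon]
    combined = []
    for target in crf_targets + lexicon_hits:
        if target not in combined:
            combined.append(target)
    return combined


if __name__ == "__main__":
    tokens = ["华为", "手机", "的", "屏幕", "很", "好"]
    tags = ["B", "I", "O", "O", "O", "O"]  # assumed CRF output
    print(combine_targets(tokens, tags, lexicon={"屏幕"}))
```

In this sketch the lexicon recovers a target ("屏幕") that the assumed CRF tagging missed, which is the kind of complementary coverage the combination is meant to provide.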
Funding (public opinion extraction method): Supported by the National High Technology Research and Development Program of China (2005AA147030).
Funding (opinion mining of e-commerce reviews): Supported by the National Natural Science Foundation of China (No. 61375053).
Funding (opinion target extraction from Chinese microblogs): This work was supported by the National Basic Research 973 Program of China under Grant Nos. 2013CB329605 and 2013CB329303, and by the National Natural Science Foundation of China under Grant No. 61201351.