Aiming at the problem that existing models in aspect-level sentiment analysis cannot fully and effectively utilize sentence semantic and syntactic structure information, this paper proposes a graph neural network-base...Aiming at the problem that existing models in aspect-level sentiment analysis cannot fully and effectively utilize sentence semantic and syntactic structure information, this paper proposes a graph neural network-based aspect-level sentiment classification model. Self-attention, aspectual word multi-head attention and dependent syntactic relations are fused and the node representations are enhanced with graph convolutional networks to enable the model to fully learn the global semantic and syntactic structural information of sentences. Experimental results show that the model performs well on three public benchmark datasets Rest14, Lap14, and Twitter, improving the accuracy of sentiment classification.展开更多
To address the difficulty of training high-quality models in some specific domains due to the lack of fine-grained annotation resources, we propose in this paper a knowledge-integrated cross-domain data generation met...To address the difficulty of training high-quality models in some specific domains due to the lack of fine-grained annotation resources, we propose in this paper a knowledge-integrated cross-domain data generation method for unsupervised domain adaptation tasks. Specifically, we extract domain features, lexical and syntactic knowledge from source-domain and target-domain data, and use a masking model with an extended masking strategy and a re-masking strategy to obtain domain-specific data that remove domain-specific features. Finally, we improve the sequence generation model BART and use it to generate high-quality target domain data for the task of aspect and opinion co-extraction from the target domain. Experiments were performed on three conventional English datasets from different domains, and our method generates more accurate and diverse target domain data with the best results compared to previous methods.展开更多
Purpose:In order to annotate the semantic information and extract the research level information of research papers,we attempt to seek a method to develop an information extraction system.Design/methodology/approach:S...Purpose:In order to annotate the semantic information and extract the research level information of research papers,we attempt to seek a method to develop an information extraction system.Design/methodology/approach:Semantic dictionary and conditional random field model(CRFM)were used to annotate the semantic information of research papers.Based on the annotation results,the research level information was extracted through regular expression.All the functions were implemented on Sybase platform.Findings:According to the result of our experiment in carbon nanotube research,the precision and recall rates reached 65.13%and 57.75%,respectively after the semantic properties of word class have been labeled,and F-measure increased dramatically from less than 50%to60.18%while added with semantic features.Our experiment also showed that the information extraction system for research level(IESRL)can extract performance indicators from research papers rapidly and effectively.Research limitations:Some text information,such as that of format and chart,might have been lost due to the extraction processing of text format from PDF to TXT files.Semantic labeling on sentences could be insufficient due to the rich meaning of lexicons in the semantic dictionary.Research implications:The established system can help researchers rapidly compare the level of different research papers and find out their implicit innovation values.It could also be used as an auxiliary tool for analyzing research levels of various research institutions.Originality/value:In this work,we have successfully established an information extraction system for research papers by a revised semantic annotation method based on CRFM and the semantic dictionary.Our system can analyze the information extraction problem from two levels,i.e.from the sentence level and noun(phrase)level of research papers.Compared with the extraction method based on knowledge engineering and that on machine learning,our system shows advantages of the both.展开更多
文摘Aiming at the problem that existing models in aspect-level sentiment analysis cannot fully and effectively utilize sentence semantic and syntactic structure information, this paper proposes a graph neural network-based aspect-level sentiment classification model. Self-attention, aspectual word multi-head attention and dependent syntactic relations are fused and the node representations are enhanced with graph convolutional networks to enable the model to fully learn the global semantic and syntactic structural information of sentences. Experimental results show that the model performs well on three public benchmark datasets Rest14, Lap14, and Twitter, improving the accuracy of sentiment classification.
文摘To address the difficulty of training high-quality models in some specific domains due to the lack of fine-grained annotation resources, we propose in this paper a knowledge-integrated cross-domain data generation method for unsupervised domain adaptation tasks. Specifically, we extract domain features, lexical and syntactic knowledge from source-domain and target-domain data, and use a masking model with an extended masking strategy and a re-masking strategy to obtain domain-specific data that remove domain-specific features. Finally, we improve the sequence generation model BART and use it to generate high-quality target domain data for the task of aspect and opinion co-extraction from the target domain. Experiments were performed on three conventional English datasets from different domains, and our method generates more accurate and diverse target domain data with the best results compared to previous methods.
基金supported by the National Social Science Foundation of China(Grant No.12CTQ032)
文摘Purpose:In order to annotate the semantic information and extract the research level information of research papers,we attempt to seek a method to develop an information extraction system.Design/methodology/approach:Semantic dictionary and conditional random field model(CRFM)were used to annotate the semantic information of research papers.Based on the annotation results,the research level information was extracted through regular expression.All the functions were implemented on Sybase platform.Findings:According to the result of our experiment in carbon nanotube research,the precision and recall rates reached 65.13%and 57.75%,respectively after the semantic properties of word class have been labeled,and F-measure increased dramatically from less than 50%to60.18%while added with semantic features.Our experiment also showed that the information extraction system for research level(IESRL)can extract performance indicators from research papers rapidly and effectively.Research limitations:Some text information,such as that of format and chart,might have been lost due to the extraction processing of text format from PDF to TXT files.Semantic labeling on sentences could be insufficient due to the rich meaning of lexicons in the semantic dictionary.Research implications:The established system can help researchers rapidly compare the level of different research papers and find out their implicit innovation values.It could also be used as an auxiliary tool for analyzing research levels of various research institutions.Originality/value:In this work,we have successfully established an information extraction system for research papers by a revised semantic annotation method based on CRFM and the semantic dictionary.Our system can analyze the information extraction problem from two levels,i.e.from the sentence level and noun(phrase)level of research papers.Compared with the extraction method based on knowledge engineering and that on machine learning,our system shows advantages of the both.