Funding: The National Natural Science Foundation of China (No. 60663004) and the PhD Programs Foundation of the Ministry of Education of China (No. 20050007023).
Abstract: Because semantic role labeling (SRL) is essential for deep natural language processing, a method based on conditional random fields (CRFs) is proposed for the SRL task. The method takes shallow syntactic parsing as its foundation and phrases or named entities as the labeled units, and a CRF model is trained to label the semantic roles of the predicates in a sentence. The key to the method lies in parameter estimation and feature selection for the CRF model. The L-BFGS algorithm is employed for parameter estimation, and three categories of features are selected for the model: features based on sentence constituents, features based on the predicate, and predicate-constituent features. Evaluation on the datasets of the CoNLL-2005 SRL shared task shows that the method outperforms the maximum entropy model, achieving 80.43% precision and 63.55% recall for semantic role labeling.
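As a rough illustration of the feature design described in this abstract, the sketch below assembles the three feature groups (constituent, predicate, and predicate-constituent features) for one candidate constituent. The attribute names and the dictionary format are assumptions for illustration, not the paper's exact feature set.

```python
# Sketch of the three feature groups: constituent features, predicate
# features, and predicate-constituent combination features. Attribute
# names (phrase_type, head_word, ...) are illustrative only.

def constituent_features(cand, predicate):
    return {
        # 1) features based on the sentence constituent itself
        "phrase_type": cand["phrase_type"],
        "head_word": cand["head_word"],
        "named_entity": cand.get("ne_type", "O"),
        # 2) features based on the predicate
        "pred_lemma": predicate["lemma"],
        "pred_voice": predicate["voice"],
        # 3) predicate-constituent combination features
        "position": "before" if cand["end"] < predicate["index"] else "after",
        "pred+ptype": predicate["lemma"] + "|" + cand["phrase_type"],
        "distance": abs(cand["start"] - predicate["index"]),
    }

if __name__ == "__main__":
    predicate = {"lemma": "give", "voice": "active", "index": 2}
    candidate = {"phrase_type": "NP", "head_word": "teacher",
                 "ne_type": "PER", "start": 0, "end": 1}
    print(constituent_features(candidate, predicate))
```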
Funding: This work is supported by the Key Scientific Research Projects of Colleges and Universities in Henan Province (Grant No. 20A520007) and the National Natural Science Foundation of China (Grant No. 61402149).
Abstract: Previous studies have shown that there is a potential semantic dependency between part-of-speech tags and semantic roles. At the same time, the predicate-argument structure of a sentence is important information for the semantic role labeling task. In this work, we introduce an auxiliary deep neural network model that captures the semantic dependency between part-of-speech tags and semantic roles and incorporates predicate-argument information into semantic role labeling. Within a joint-learning framework, part-of-speech tagging is used as an auxiliary task to improve the results of semantic role labeling. In addition, we introduce an argument recognition layer into the training of the main task, semantic role labeling, so that the argument-related structural information selected by the predicate through an attention mechanism assists the main task. Because the model makes full use of the semantic dependency between part-of-speech tags and semantic roles and of the structural information of the predicate-argument structure, it achieves an F1 score of 89.0% on the WSJ test set of CoNLL-2005, surpassing the existing state-of-the-art model by about 0.8 percentage points.
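The following is a minimal PyTorch sketch of the joint-learning idea only: a shared encoder feeds an auxiliary POS tagging head and a main SRL head, and the auxiliary loss is added to the main loss. It omits the argument recognition layer and the predicate attention described above; all dimensions and the loss weight are illustrative assumptions, not the paper's values.

```python
# Minimal multi-task sketch: shared BiLSTM encoder, POS head (auxiliary
# task) and SRL head (main task), trained with a weighted joint loss.
import torch
import torch.nn as nn

class JointSRLPOS(nn.Module):
    def __init__(self, vocab_size, n_pos_tags, n_srl_labels,
                 emb_dim=100, hidden=200):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.encoder = nn.LSTM(emb_dim, hidden, batch_first=True,
                               bidirectional=True)
        self.pos_head = nn.Linear(2 * hidden, n_pos_tags)    # auxiliary task
        self.srl_head = nn.Linear(2 * hidden, n_srl_labels)  # main task

    def forward(self, tokens):
        h, _ = self.encoder(self.embed(tokens))
        return self.srl_head(h), self.pos_head(h)

model = JointSRLPOS(vocab_size=5000, n_pos_tags=45, n_srl_labels=60)
loss_fn = nn.CrossEntropyLoss()
tokens = torch.randint(0, 5000, (8, 30))           # toy batch
srl_gold = torch.randint(0, 60, (8, 30))
pos_gold = torch.randint(0, 45, (8, 30))
srl_logits, pos_logits = model(tokens)
loss = loss_fn(srl_logits.view(-1, 60), srl_gold.view(-1)) \
     + 0.5 * loss_fn(pos_logits.view(-1, 45), pos_gold.view(-1))
loss.backward()
```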
Funding: Supported by the National Natural Science Foundation of China under Grant Nos. 61331011 and 61273320, the National High Technology Research and Development 863 Program of China under Grant No. 2012AA011102, and the Natural Science Foundation of Jiangsu Provincial Department of Education under Grant No. 10KJB520016.
Abstract: This paper explores a tree kernel based method for semantic role labeling (SRL) of Chinese nominal predicates via a convolution tree kernel. In particular, a new parse tree representation, called the dependency-driven constituent parse tree (D-CPT), is proposed to combine the advantages of both constituent and dependency parse trees. This is achieved by directly representing various kinds of dependency relations in a CPT-style structure, using dependency relation types instead of the phrase labels of the constituent parse tree (CPT). In this way, D-CPT not only keeps the dependency relation information of the dependency parse tree (DPT) structure but also retains the basic hierarchical structure of the CPT style. Moreover, several schemes are designed to extract various kinds of necessary information from D-CPT, such as the shortest path between the nominal predicate and the argument candidate, the support verb of the nominal predicate, and the head argument modified by the argument candidate. This largely reduces the noisy information inherent in D-CPT. Finally, a convolution tree kernel is employed to compute the similarity between two parse trees. In addition, we implement a feature-based method on top of D-CPT. Evaluation on the Chinese NomBank corpus shows that our tree kernel based method on D-CPT performs significantly better than other tree kernel based methods and achieves performance comparable to the state-of-the-art feature-based methods. This indicates the effectiveness of the novel D-CPT structure in representing various kinds of dependency relations in a CPT-style structure, and of our tree kernel based method in exploring the D-CPT structure. It also illustrates that kernel-based methods are competitive with, and complementary to, feature-based methods on SRL.
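To make the kernel computation concrete, here is a pure-Python sketch of a Collins-Duffy style convolution tree kernel of the kind used to compare two D-CPT trees. The (label, children) tuple representation and the toy trees are assumptions for illustration, not the paper's data structures.

```python
# Convolution tree kernel sketch: counts common subtrees of two trees,
# with a decay factor lam. A tree is a (label, children) tuple.

def node_pairs(t1, t2):
    """All pairs of nodes drawn from the two trees."""
    def nodes(t):
        label, children = t
        yield t
        for c in children:
            yield from nodes(c)
    return [(a, b) for a in nodes(t1) for b in nodes(t2)]

def delta(n1, n2, lam=0.4):
    """Weighted count of common subtrees rooted at n1 and n2."""
    l1, c1 = n1
    l2, c2 = n2
    # productions must match: same label and same sequence of child labels
    if l1 != l2 or [c[0] for c in c1] != [c[0] for c in c2]:
        return 0.0
    if not c1:                    # both nodes are leaves
        return lam
    prod = lam
    for a, b in zip(c1, c2):
        prod *= 1.0 + delta(a, b, lam)
    return prod

def tree_kernel(t1, t2, lam=0.4):
    return sum(delta(a, b, lam) for a, b in node_pairs(t1, t2))

# toy D-CPT-like trees using dependency relation types as internal labels
t1 = ("ROOT", [("SBJ", [("company", [])]), ("OBJ", [("profit", [])])])
t2 = ("ROOT", [("SBJ", [("firm", [])]), ("OBJ", [("profit", [])])])
print(tree_kernel(t1, t2))
```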
Funding: Supported by the National Natural Science Foundation of China (61175057) and the USTC Key-Direction Research Fund (WK0110000028).
Abstract: A more natural way for non-expert users to express tasks from an open-ended set is to use natural language. In this case, a human-centered intelligent agent/robot is required to understand and generate plans for these naturally expressed tasks. For this purpose, a good way to enhance an intelligent robot's abilities is to utilize open knowledge extracted from the web instead of hand-coded knowledge. A key challenge of utilizing open knowledge lies in the semantic interpretation of the knowledge, which is organized in multiple modes and can be unstructured or semi-structured, before one can use it. Previous approaches employed combinatory categorial grammar (CCG) with a limited lexicon as the underlying formalism for semantic parsing over sentences. Here, we propose a more effective learning method to interpret semi-structured user instructions. Moreover, we present a new heuristic method to recover missing semantic information from the context of an instruction. Experiments show that the proposed approach yields a significant performance improvement over the baseline methods and that the recovery method is promising.
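A toy illustration of the kind of context-based recovery described above: when an instruction's frame lacks a slot, inherit it from the most recent preceding instruction that filled it. The frame dictionaries and slot names are invented for this example and are not the paper's representation.

```python
# Heuristic recovery of a missing semantic slot from preceding context:
# an instruction without an explicit "location" reuses the most recently
# mentioned location. Frames and slot names are illustrative only.

def recover_missing_slots(frames, slot="location"):
    last_seen = None
    recovered = []
    for frame in frames:
        frame = dict(frame)                  # copy, do not mutate the input
        if frame.get(slot) is None and last_seen is not None:
            frame[slot] = last_seen          # heuristic: inherit from context
            frame["recovered"] = True
        if frame.get(slot) is not None:
            last_seen = frame[slot]
        recovered.append(frame)
    return recovered

instructions = [
    {"action": "go", "location": "kitchen"},
    {"action": "pick_up", "object": "cup", "location": None},
    {"action": "bring", "object": "cup", "location": "living room"},
]
for f in recover_missing_slots(instructions):
    print(f)
```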
Abstract: Predicate-argument (PA) structure analysis is often divided into three subtasks: predicate sense disambiguation, argument identification, and argument classification. To date, they have mostly been modeled in isolation. However, this approach neglects the logical constraints between them. We therefore explore integrating predicate sense disambiguation with each of the latter two subtasks, which verifies that automatic predicate sense disambiguation can help the semantic role labeling task. In addition, a dual decomposition algorithm is used to alleviate error propagation between the argument identification and argument classification subtasks, benefiting the argument identification subtask greatly. Experimental results show that our approach leads to better PA analysis performance than other pipeline approaches.
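The sketch below shows schematically how dual decomposition can force an argument identification subproblem and an argument classification subproblem to agree on which candidate spans are arguments, via subgradient updates of Lagrange multipliers. The scores, labels, and step size are toy values, not the paper's models.

```python
# Dual decomposition sketch: two subproblems (identification and
# classification) are solved separately under Lagrange multipliers u[i],
# which are updated until both agree on the set of argument spans.

def identify(s_id, u):
    # pick span i iff its identification score plus the multiplier is positive
    return [1 if s_id[i] + u[i] > 0 else 0 for i in range(len(s_id))]

def classify(s_cls, u):
    # for each span, compare the best real role (penalised by u[i]) with NONE
    z, labels = [], []
    for i, scores in enumerate(s_cls):
        role, role_score = max(
            ((l, s) for l, s in scores.items() if l != "NONE"),
            key=lambda x: x[1])
        if role_score - u[i] > scores["NONE"]:
            z.append(1)
            labels.append(role)
        else:
            z.append(0)
            labels.append("NONE")
    return z, labels

s_id = [0.8, -0.2, 0.1]                        # toy identification scores
s_cls = [{"A0": 1.0, "A1": 0.2, "NONE": 0.1},  # toy classification scores
         {"A0": 0.3, "A1": 0.6, "NONE": 0.5},
         {"A0": 0.1, "A1": 0.2, "NONE": 0.9}]
u = [0.0] * len(s_id)
for step in range(50):
    z_id = identify(s_id, u)
    z_cls, labels = classify(s_cls, u)
    if z_id == z_cls:                          # the two subtasks agree
        break
    u = [u[i] - 0.1 * (z_id[i] - z_cls[i]) for i in range(len(u))]
print(z_id, labels)
```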
Funding: The National Natural Science Foundation of China (No. 61375053).
Abstract: Nowadays, software requirements are still mainly analyzed manually, which has many drawbacks, such as high labor consumption, inefficiency, and even inaccuracy of the results. The problems are even worse in domain analysis scenarios, because a large number of requirements from many users need to be analyzed. In this sense, automatic analysis of software requirements can bring benefits to software companies. For this purpose, we propose an approach to automatically analyze software requirement specifications (SRSs) and extract their semantic information. The approach uses a machine learning and ontology based semantic role labeling (SRL) method. First, common verbs are identified from SRS documents in the e-commerce domain, and semantic frames are designed for those verbs. Based on the frames, sentences from the SRSs are selected and labeled manually, and the labeled sentences serve as training examples in the machine learning stage. Besides the training examples labeled with semantic roles, external ontology knowledge is used to relieve the data sparsity problem and obtain reliable results. Based on the SemCor and WordNet corpora, the senses of nouns and verbs are identified sequentially through a K-nearest neighbor approach, and the verb senses are then used to identify the frame types. After that, the SRL classifier is trained with the maximum entropy method, adding new features based on word sense, such as the hypernyms and hyponyms of the word senses in the ontology. Experimental results show that this new approach to automatic functional requirements analysis is effective.
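The ontology-derived features can be illustrated with WordNet via NLTK, as sketched below: given an already disambiguated verb sense, its hypernyms and hyponyms are added as extra features for the role classifier. This assumes nltk is installed with the WordNet corpus downloaded (nltk.download('wordnet')); the feature naming and the hyponym cap are illustrative, not the paper's exact feature scheme.

```python
# Sketch of ontology-based features: hypernyms and hyponyms of a verb
# sense from WordNet are turned into extra classifier features.
from nltk.corpus import wordnet as wn

def sense_features(lemma, pos=wn.VERB, sense_index=0):
    synsets = wn.synsets(lemma, pos=pos)
    if not synsets:
        return {}
    sense = synsets[sense_index]          # assume the sense is already chosen
    feats = {"sense": sense.name()}
    for h in sense.hypernyms():
        feats["hypernym=" + h.name()] = 1
    for h in sense.hyponyms()[:5]:        # cap hyponyms to limit feature count
        feats["hyponym=" + h.name()] = 1
    return feats

print(sense_features("submit"))
```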
Funding: The National Natural Science Foundation of China (No. 61375053) and the Project of Shanghai University of Finance and Economics (Nos. 2018110565 and 2016110743).
Abstract: Nowadays, the Internet has penetrated all aspects of people's lives. A large number of online customer reviews have accumulated in product forums, and they are valuable resources to analyze. However, these customer reviews are unstructured textual data containing many ambiguities, so analyzing them is a challenging task. At present, effective deep semantic or fine-grained analysis of customer reviews is rare in the existing literature, and the analysis quality of most studies is low. Therefore, this paper introduces a fine-grained opinion mining method to extract detailed semantic information of opinions, from multiple perspectives and aspects, from Chinese automobile reviews. The method uses the conditional random field (CRF) model, in which semantic roles are divided into two groups. One group relates to the objects being reviewed and includes the roles of the manufacturer, the brand, the type, and the aspects of cars. The other group concerns the opinions about those objects and includes the sentiment description, the aspect value, the conditions of the opinions, and the sentiment tendency. The overall framework of the method consists of three major steps. The first step distinguishes relevant sentences from irrelevant sentences in the reviews. In the second step, the relevant sentences are further classified into different aspects. In the third step, fine-grained semantic roles are extracted from the sentences of each aspect. The training data is manually annotated with fine-grained semantic roles. The features used in the CRF model include basic word features, part-of-speech (POS) features, position features, and dependency syntactic features, and different combinations of these features are investigated. Experimental results are analyzed and future directions are discussed.
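The three-step pipeline can be sketched as below with stand-in rules in place of trained models; the function names, the toy relevance and aspect rules, and the per-token feature set (word, POS, position, dependency head) are illustrative assumptions, not the paper's implementation.

```python
# Skeleton of the three-step pipeline: (1) filter relevant sentences,
# (2) classify them into aspects, (3) build per-token CRF features
# (word, POS, position, dependency head) for role extraction.

def is_relevant(sentence):
    # step 1: stand-in rule for the relevance classifier
    return "car" in sentence["tokens"]

def aspect_of(sentence):
    # step 2: stand-in rule for the aspect classifier
    return "fuel" if "fuel" in sentence["tokens"] else "general"

def token_features(sentence, i):
    # step 3: per-token features for the CRF role labeler
    return {
        "word": sentence["tokens"][i],
        "pos": sentence["pos"][i],
        "position": i / len(sentence["tokens"]),
        "dep_head": sentence["heads"][i],
    }

reviews = [{"tokens": ["this", "car", "wastes", "fuel"],
            "pos": ["DT", "NN", "VBZ", "NN"],
            "heads": [1, 2, 2, 2]}]
for sent in reviews:
    if not is_relevant(sent):
        continue
    print(aspect_of(sent),
          [token_features(sent, i) for i in range(len(sent["tokens"]))])
```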