Because semantic role labeling (SRL) is essential for deep natural language processing, a method based on conditional random fields (CRFs) is proposed for the SRL task. The method builds on shallow syntactic parsing, uses phrases or named entities as the labeling units, and trains a CRF model to label the semantic roles of the predicates in a sentence. The key issues are parameter estimation and feature selection for the CRF model. The L-BFGS algorithm is used for parameter estimation, and three categories of features are selected: features based on sentence constituents, features based on the predicate, and predicate-constituent features. Evaluation on the CoNLL-2005 SRL shared task datasets shows that the method outperforms the maximum entropy model, achieving 80.43% precision and 63.55% recall for semantic role labeling.
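The abstract reports precision and recall but no combined score. As a reading aid, the short sketch below computes the usual F1 measure (the harmonic mean of precision and recall) from the two reported figures. The function name `f1_score` is illustrative, and the resulting value is derived here, not stated by the authors.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both in the same units)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract, in percent.
reported_precision = 80.43
reported_recall = 63.55

f1 = f1_score(reported_precision, reported_recall)
print(f"F1 = {f1:.2f}%")  # roughly 71%
```

The gap between precision and recall pulls the harmonic mean well below the precision figure, so the system is more conservative than the precision alone suggests.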
Previous studies have shown that there is a potential semantic dependency between part-of-speech and semantic roles. At the same time, the predicate-argument structure of a sentence is important information for the semantic role labeling task. In this work, we introduce an auxiliary deep neural network model that models the semantic dependency between part-of-speech and semantic roles and incorporates predicate-argument information into semantic role labeling. Within a joint learning framework, part-of-speech tagging is used as an auxiliary task to improve semantic role labeling. In addition, we introduce an argument recognition layer into the training process of the main task, semantic role labeling, so that the argument-related structural information selected by the predicate through the attention mechanism assists the main task. Because the model makes full use of the semantic dependency between part-of-speech and semantic roles and of the structural information of predicate-argument, it achieves an F1 of 89.0% on the WSJ test set of CoNLL-2005, about 0.8% higher than the existing state-of-the-art model.
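To efficiently and automatically mine threat intelligence entities and relations from open-source heterogeneous big data, a Threat Intelligence Entity Relation Extraction (TIERE) method is proposed. First, a data preprocessing method is developed by analyzing the characteristics of open-source cybersecurity reports. Then, to address the high text complexity and the scarcity of standard datasets in the cybersecurity domain, a Named Entity Recognition algorithm based on Improved BootStrapping (NER-IBS) and a Relation Extraction algorithm based on Semantic Role Labeling (RE-SRL) are proposed. Initial seeds are built from a small number of samples and rules, entities in unstructured text are mined through iterative training, and relations between entities are mined through a strategy of constructing semantic roles. Experimental results show that, on a few-shot cybersecurity information extraction dataset, NER-IBS achieves an F1 of 84%, 2 percentage points higher than the RDF-CRF (Regular expression and Dictionary combined with Feature templates as well as Conditional Random Field) algorithm, and RE-SRL achieves an F1 of 94% for uncategorized relation extraction, indicating that the TIERE method extracts entities and relations effectively.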
This paper explores a tree kernel-based method for semantic role labeling (SRL) of Chinese nominal predicates via a convolution tree kernel. In particular, a new parse tree representation structure, called the dependency-driven constituent parse tree (D-CPT), is proposed to combine the advantages of both constituent and dependency parse trees. This is achieved by directly representing various kinds of dependency relations in a CPT-style structure, which uses dependency relation types instead of the phrase labels of the constituent parse tree (CPT). In this way, D-CPT not only keeps the dependency relationship information of the dependency parse tree (DPT) structure but also retains the basic hierarchical structure of the CPT style. Moreover, several schemes are designed to extract various kinds of necessary information from D-CPT, such as the shortest path between the nominal predicate and the argument candidate, the support verb of the nominal predicate, and the head argument modified by the argument candidate. This largely reduces the noisy information inherent in D-CPT. Finally, a convolution tree kernel is employed to compute the similarity between two parse trees. We also implement a feature-based method based on D-CPT. Evaluation on the Chinese NomBank corpus shows that our tree kernel-based method on D-CPT performs significantly better than other tree kernel-based methods and achieves performance comparable to state-of-the-art feature-based methods. This indicates the effectiveness of the novel D-CPT structure in representing various kinds of dependency relations in a CPT-style structure, and of our tree kernel-based method in exploring that structure. It also illustrates that kernel-based methods are competitive with, and complementary to, feature-based methods on SRL.
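To make the idea of a convolution tree kernel concrete, here is a minimal sketch of one common formulation, the subset-tree kernel of Collins and Duffy, which counts the tree fragments two parse trees share. It is an illustration under stated assumptions and not the authors' implementation: the nested-tuple tree encoding, the function names, the decay parameter `lam`, and the toy trees are all hypothetical, and a D-CPT would carry dependency relation types as node labels instead of the phrase labels used here.

```python
# A tree is a nested tuple (label, child, child, ...); a leaf is a plain string.

def label(node):
    """Label of an internal node, or the word itself for a leaf."""
    return node if isinstance(node, str) else node[0]


def production(node):
    """Grammar production at an internal node: (label, child labels)."""
    return (node[0], tuple(label(child) for child in node[1:]))


def internal_nodes(tree):
    """All non-leaf nodes of the tree, in pre-order."""
    if isinstance(tree, str):
        return []
    nodes = [tree]
    for child in tree[1:]:
        nodes.extend(internal_nodes(child))
    return nodes


def delta(n1, n2, lam):
    """Weighted count of common fragments rooted at n1 and n2."""
    if production(n1) != production(n2):
        return 0.0
    score = lam
    for c1, c2 in zip(n1[1:], n2[1:]):
        # Leaves contribute nothing beyond the matching production itself.
        if not isinstance(c1, str) and not isinstance(c2, str):
            score *= 1.0 + delta(c1, c2, lam)
    return score


def tree_kernel(t1, t2, lam=1.0):
    """Sum delta over all pairs of internal nodes of the two trees."""
    return sum(delta(a, b, lam)
               for a in internal_nodes(t1)
               for b in internal_nodes(t2))


# Toy trees (hypothetical, for illustration only).
t_the_dog = ("NP", ("D", "the"), ("N", "dog"))
t_a_dog = ("NP", ("D", "a"), ("N", "dog"))

print(tree_kernel(t_the_dog, t_the_dog))  # 6.0 shared fragments
print(tree_kernel(t_the_dog, t_a_dog))    # 3.0 shared fragments
```

With `lam=1.0` the kernel is a plain fragment count; values below 1 down-weight larger fragments, which is a common way to keep the kernel from being dominated by near-identical large subtrees. This naive version compares every pair of nodes, so it is quadratic in tree size.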
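A semantic role labeling (SRL) method based on feature combination and support vector machines (SVMs) is proposed. The method takes syntactic constituents as the basic labeling units. First, effective features are selected from current parsing-based SRL systems to form a basic feature set. Then a statistical feature combination method is proposed: based on how a combined feature is distributed over positive and negative examples, it uses the ratio of between-class distance to within-class distance as a statistic to measure the feature's effect on classification, and retains the combined features that classify well. Finally, classification experiments with SVMs on the Chinese PropBank (CPB) corpus show that, with this feature combination method, the overall F-score of SRL reaches 91.81%, an improvement of nearly 2%.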
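The abstract does not give the exact statistic, so the sketch below is only one plausible reading of "ratio of between-class distance to within-class distance": a Fisher-style score over a single combined feature, where the between-class term is the squared difference of class means and the within-class term is the sum of class variances. The function name, the 0/1 feature encoding, and the example values are assumptions made for illustration.

```python
from statistics import mean, pvariance


def separation_ratio(pos_values, neg_values, eps=1e-12):
    """Between-class distance divided by within-class distance.

    pos_values / neg_values: values of one combined feature on positive
    and negative examples (here 1 if the feature fires, else 0).
    """
    between = (mean(pos_values) - mean(neg_values)) ** 2
    within = pvariance(pos_values) + pvariance(neg_values)
    return between / (within + eps)  # eps avoids division by zero


# Hypothetical firing patterns of two combined features.
informative = separation_ratio([1, 1, 1, 0], [0, 0, 0, 1])
uninformative = separation_ratio([1, 0, 1, 0], [1, 0, 1, 0])

print(informative > uninformative)  # the first feature separates better
```

Under this reading, candidate combined features would be ranked by the ratio and only those above some threshold kept before SVM training; the threshold itself is not specified in the abstract.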
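Semantic role labeling (SRL) of Chinese nominal predicates is studied. Building on the features traditionally used for verbal-predicate SRL, a further feature set specific to nominal-predicate SRL is proposed. In addition, the influence of Chinese verbal-predicate SRL on Chinese nominal-predicate SRL is explored, and automatic predicate identification is integrated to achieve fully automatic SRL of Chinese nominal predicates. Experiments on Chinese NomBank show that appropriate use of verbal-predicate SRL substantially improves the performance of nominal-predicate SRL. With gold parse trees and gold predicate identification, the F1 reaches 72.67, well above comparable systems in China and abroad; with automatic parse trees and automatic predicate identification, the F1 is 55.14.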
Funding (CRF-based SRL study): National Natural Science Foundation of China (No. 60663004); PhD Programs Foundation of the Ministry of Education of China (No. 20050007023).
Funding (joint part-of-speech tagging and SRL study): Key Scientific Research Projects of Colleges and Universities in Henan Province (Grant No. 20A520007); National Natural Science Foundation of China (Grant No. 61402149).
Funding (D-CPT tree kernel study): National Natural Science Foundation of China (Grant Nos. 61331011 and 61273320); National High Technology Research and Development 863 Program of China (Grant No. 2012AA011102); Natural Science Foundation of Jiangsu Provincial Department of Education (Grant No. 10KJB520016).