Abstract
In dependency-based semantic role labeling, most systems use machine learning for both argument identification and argument classification. This paper analyzes the characteristics of dependency trees and finds that arguments are concentrated in a specific local region of the tree. We therefore propose an argument identification method based on dependency tree distance. The method restricts candidate arguments to nodes whose dependency tree distance from the target verb is at most 3, and applies hand-written rules to extract the best candidate argument set for the target verb. Using gold dependency trees on the CoNLL 2009 Chinese corpus, the method identifies 98.5% of the arguments. Combined with machine-learning-based role classification, the system reaches an F-score of 89.46%, a marked improvement over previous work (81.68%).
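The distance restriction described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the paper's implementation: it treats the dependency tree as an undirected graph, measures distance as the number of edges between a token and the target verb, and keeps tokens within distance 3. The function name `candidate_arguments` and the `heads` array encoding (head index per token, -1 for the root) are hypothetical, and the paper's selection rules, which the abstract does not spell out, are not modeled.

```python
from collections import deque


def candidate_arguments(heads, verb, max_dist=3):
    """Return {token_index: distance} for tokens within max_dist edges of verb.

    heads[i] is the index of token i's head, or -1 for the root
    (an assumed encoding, not taken from the paper).
    """
    # Build an undirected adjacency list from the head array.
    adjacency = [[] for _ in heads]
    for child, head in enumerate(heads):
        if head >= 0:
            adjacency[child].append(head)
            adjacency[head].append(child)

    # Breadth-first search outward from the verb, stopping at max_dist.
    distance = {verb: 0}
    queue = deque([verb])
    while queue:
        node = queue.popleft()
        if distance[node] == max_dist:
            continue  # do not expand beyond the distance limit
        for neighbor in adjacency[node]:
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)

    del distance[verb]  # the verb itself is not a candidate argument
    return distance


# A five-token chain 0 -> 1 -> 2 -> 3 -> 4 with token 4 as root and target verb:
# token 0 lies four edges away and is filtered out.
chain_candidates = candidate_arguments([1, 2, 3, 4, -1], verb=4)
```

A rule-based selection step, as the abstract describes, would then run over this reduced candidate set before role classification.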
Source
《中文信息学报》
CSCD
Peking University Core Journals (北大核心)
2012, No. 2, pp. 40-45 (6 pages)
Journal of Chinese Information Processing
Funding
National Natural Science Foundation of China (60873156, 61075067)
National Social Science Foundation of China (09BYY032)
Keywords
argument identification
dependency tree distance based method
semantic role labeling