Abstract
This paper proposes a tree kernel-based approach to English pronoun resolution. Because structured information plays an important role in anaphora resolution, the method uses the convolution tree kernel supported by SVM to capture syntactic structure automatically, treating the parse tree as a single feature and combining it with other basic features from the literature. The paper systematically analyzes how the filtering of training instances and different pruning strategies affect model performance, and further examines pronoun resolution performance with respect to sentence distance. Experiments on the ACE2004 NWIRE benchmark corpus show that the tree kernel significantly improves the performance of the pronoun resolution system, especially for pronoun resolution within a sentence.
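The abstract does not give the kernel definition, so the following is only a minimal sketch of one common convolution tree kernel (the subset-tree kernel of Collins and Duffy), not the authors' implementation. The tuple encoding of parse trees, the decay parameter `lam`, the mixing weight `alpha`, and the function names are all assumptions made for illustration; how the paper actually combines the tree feature with the basic features is not stated here.

```python
from math import sqrt

# Assumed encoding: an internal node is (label, child, child, ...);
# a word (leaf) is a plain string.

def nodes(tree):
    """Yield every internal node of a parse tree."""
    if isinstance(tree, tuple):
        yield tree
        for child in tree[1:]:
            yield from nodes(child)

def production(node):
    """Grammar production at a node; leaves are tagged so that a word
    never compares equal to a nonterminal with the same spelling."""
    return (node[0],) + tuple(
        c[0] if isinstance(c, tuple) else ("leaf", c) for c in node[1:]
    )

def delta(n1, n2, lam):
    """Decay-weighted count of common subtrees rooted at n1 and n2."""
    if production(n1) != production(n2):
        return 0.0
    result = lam
    for c1, c2 in zip(n1[1:], n2[1:]):
        if isinstance(c1, tuple):  # equal productions: c2 is a tuple too
            result *= 1.0 + delta(c1, c2, lam)
    return result

def tree_kernel(t1, t2, lam=0.4):
    """Sum delta over all pairs of internal nodes."""
    return sum(delta(a, b, lam) for a in nodes(t1) for b in nodes(t2))

def composite_kernel(x1, x2, alpha=0.5, lam=0.4):
    """Hypothetical combination: normalized tree kernel plus a linear
    kernel over flat feature vectors, mixed by an assumed weight alpha."""
    (tree1, feats1), (tree2, feats2) = x1, x2
    norm = sqrt(tree_kernel(tree1, tree1, lam) * tree_kernel(tree2, tree2, lam))
    k_tree = tree_kernel(tree1, tree2, lam) / norm if norm > 0 else 0.0
    k_flat = sum(a * b for a, b in zip(feats1, feats2))
    return alpha * k_tree + (1.0 - alpha) * k_flat
```

With `lam=1.0` the tree kernel simply counts shared subtrees, which makes small cases easy to check by hand. A pruning strategy, in this encoding, would amount to passing a smaller tuple (a sub-portion of the parse tree) to `tree_kernel`.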
Source
《中文信息学报》
CSCD
PKU Core Journal
2009, No. 5, pp. 33-39 (7 pages)
Journal of Chinese Information Processing
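Funding
National Natural Science Foundation of China (60673041)
National High Technology Research and Development Program of China (863 Program) (2006AA01Z147)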
Keywords
computer application
Chinese information processing
coreference resolution
syntactic structure
tree kernel
pruning strategy