Coreference Resolution Research with the Semantic Background Knowledge of Wikipedia
Abstract Taking a corpus of emergency-event news as the research setting, this paper mines Wikipedia in depth as a source of semantic background knowledge for coreference resolution. It derives four categories of Wikipedia-based features: features from the text of Wikipedia article pages, synonym features, hyperlink-graph features, and category-graph features. The system was trained and tested on an annotated corpus of 200,000 Chinese characters. The experiments show that introducing Wikipedia into coreference resolution for emergency events is effective: the system reaches an F-value of 66.7%, and the hyperlink-graph features contribute the most. Feature selection with the SBS algorithm, a hill-climbing method, removed seven features and raised the system's F-value by 3.58%.
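The feature-selection step mentioned in the abstract can be illustrated with a minimal sketch, assuming that SBS denotes sequential backward selection run as a greedy hill climb: start from the full feature set, repeatedly drop the single feature whose removal raises the score the most, and stop when no removal helps. The feature names, the integer weights, and the `toy_score` function are hypothetical stand-ins; in the paper's setting the score would be the F-value of the coreference system on held-out data, and none of the numbers below come from the paper.

```python
from typing import Callable, FrozenSet, Iterable, Tuple


def sequential_backward_selection(
    features: Iterable[str],
    score: Callable[[FrozenSet[str]], float],
    min_features: int = 1,
) -> Tuple[FrozenSet[str], float]:
    """Greedy backward elimination: drop one feature per round while the score improves."""
    current = frozenset(features)
    best_score = score(current)
    while len(current) > min_features:
        # Score every subset obtained by removing exactly one feature.
        candidates = [(score(current - {f}), f) for f in sorted(current)]
        cand_score, dropped = max(candidates)
        if cand_score <= best_score:
            break  # local optimum: no single removal improves the score
        current = current - {dropped}
        best_score = cand_score
    return current, best_score


# Hypothetical per-feature contributions (integer "points"), for illustration only.
TOY_WEIGHTS = {
    "link_graph": 30,
    "category_graph": 15,
    "synonym": 10,
    "page_text": 10,
    "noise_a": -2,
    "noise_b": -3,
}


def toy_score(subset: FrozenSet[str]) -> float:
    """Stand-in for an evaluation run; a real scorer would train and test a model."""
    return sum(TOY_WEIGHTS[f] for f in subset)


selected, best = sequential_backward_selection(TOY_WEIGHTS, toy_score)
```

In this toy setup the two negatively weighted features are removed and the four remaining ones are kept. Because the search is a hill climb, it can stop at a local optimum; that is a property of the technique in general, not a claim about the paper's results.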
Author 张贵军 (Zhang Guijun)
Source 《信息通信》 (Information & Communications), 2018, No. 1, pp. 4-8 (5 pages)
Keywords emergency events; coreference resolution; Wikipedia; semantic features; maximum entropy model