Abstract
Events tend to cluster around topics and are related to one another. In the era of big data, selecting the events relevant to a specific topic from massive amounts of information benefits natural language processing tasks such as information extraction, text summarization, and text generation. We first propose a method for annotating relevant events and use it to build a Chinese relevant event corpus. We then put forward a preliminary approach to relevant event recognition based on multiple features, including distance and semantic features. Experiments on the annotated corpus show that the proposed approach outperforms the baseline system by 4.08% in F1-measure.
Source
《计算机工程与科学》
CSCD
PKU Core Journals (北大核心)
2015, No. 12, pp. 2306-2311 (6 pages)
Computer Engineering & Science
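Funding
National Natural Science Foundation of China (61472265)
Key Program of the National Natural Science Foundation of China (61331011)
Jiangsu Provincial Prospective Joint Research Project (BY2014059-08)
Partially supported by the Collaborative Innovation Center of Novel Software Technology and Industrialization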
Keywords
relevant event corpus
annotation
relevance
event relation