Abstract
Event extraction helps people obtain event knowledge of interest quickly and accurately from large volumes of text. However, current research on event extraction focuses mainly on extracting events from single sentences. Because events are structurally complex and can be expressed in many ways, in most cases several sentences are needed to describe one event completely, so extracting complete structured event information from a whole document is more valuable. This paper first uses an attention-based sequence labeling model to jointly extract the triggers and entities of sentence-level events; compared with performing entity extraction and event recognition independently, joint labeling improves the F-score by 1 percentage point. A multi-layer perceptron then determines the role each entity plays in the event. Finally, building on the sentence-level results, integer linear programming is used for global reasoning to fuse sentence-level event information into document-level events; compared with the baseline model, this improves the F-score of document-level event extraction by 3 percentage points.
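The global reasoning step can be illustrated with a minimal sketch. The abstract does not give the objective or constraints of the integer linear program, so everything below is an assumption made for illustration: sentence-level role probabilities are summed after subtracting a threshold, each entity takes at most one role, and each role is filled by at most one entity. The function names `aggregate` and `fuse`, the role names, and the sample predictions are hypothetical. The tiny 0-1 program is solved by exhaustive enumeration rather than by an ILP solver.

```python
from collections import defaultdict
from itertools import product


def aggregate(sentence_predictions, threshold=0.5):
    """Sum (probability - threshold) per (entity, role) over all sentences.

    Assumption: evidence above the threshold supports a role,
    evidence below it counts against the role.
    """
    scores = defaultdict(float)
    for sentence in sentence_predictions:
        for entity, role, prob in sentence:
            scores[(entity, role)] += prob - threshold
    return dict(scores)


def fuse(scores):
    """Choose a 0-1 assignment of entities to roles with maximal total score.

    Assumed constraints: an entity takes at most one role (built into the
    encoding: one choice per entity) and a role is filled at most once.
    """
    entities = sorted({e for e, _ in scores})
    # Each entity may stay unassigned (None) or take a role it was scored for.
    options = [[None] + sorted(r for (e, r) in scores if e == ent)
               for ent in entities]
    best_value, best = 0.0, {}
    for choice in product(*options):
        chosen = [r for r in choice if r is not None]
        if len(chosen) != len(set(chosen)):
            continue  # some role would be filled by two entities
        value = sum(scores[(e, r)]
                    for e, r in zip(entities, choice) if r is not None)
        if value > best_value:
            best_value = value
            best = {r: e for e, r in zip(entities, choice) if r is not None}
    return best, best_value


# Hypothetical sentence-level outputs: (entity, role, probability).
sentence_predictions = [
    [("Company A", "Acquirer", 0.9), ("Company B", "Target", 0.6)],
    [("Company B", "Acquirer", 0.4), ("Company B", "Target", 0.8),
     ("March 3", "Time", 0.7)],
]
document_event, total = fuse(aggregate(sentence_predictions))
print(document_event, round(total, 3))
```

Enumeration is only practical for a handful of entities and roles; a real system would hand the same objective and constraints to an ILP solver.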
Authors
仲伟峰
杨航
陈玉博
刘康
赵军
ZHONG Weifeng; YANG Hang; CHEN Yubo; LIU Kang; ZHAO Jun (College of Automation, Harbin University of Science and Technology, Harbin, Heilongjiang 150080, China; State Key Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China)
Source
《中文信息学报》
CSCD
Peking University Core Journals (北大核心)
2019, No. 9, pp. 88-95, 106 (9 pages in total)
Journal of Chinese Information Processing
Keywords
document-level event extraction
joint labeling
global reasoning