
Joint Model for Document-Level Event Extraction Without Triggers
Cited by: 7
Abstract Sentence-level event extraction methods, the focus of most current research, are hard to extend to the case where the arguments of a single event are spread across multiple sentences of a document. To address this, the paper proposes a deep-learning-based joint model for document-level event extraction. First, an entity recognition module built on multi-head self-attention identifies the entities in the document sentence by sentence and outputs their types. Next, an event type detection module, trained by defining the importance of each argument role to each event type, locates the central sentence of the event mention and determines the event type without relying on trigger words. Finally, an event argument extraction module embeds each entity's type and its distance to the central sentence into the entity's semantic vector, then feeds the result to a Transformer network that exchanges information with the context, so that all event arguments are extracted at document scope. Training the three modules jointly yields an end-to-end event extraction model and avoids the error propagation of pipeline approaches. Experiments on a public dataset show that, when each document contains a single event, the model achieves an F1-score of 86.3%, outperforming the best existing document-level event extraction methods, and that it trains quickly.
Authors WANG Lei (王雷); LI Ruixuan (李瑞轩); LI Yuhua (李玉华); GU Xiwu (辜希武); YANG Qi (杨琪) (School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, China)
Source Journal of Frontiers of Computer Science and Technology (《计算机科学与探索》), indexed in CSCD and the Peking University Core Journals list, 2021, No. 12, pp. 2327-2334 (8 pages)
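Funding National Key Research and Development Program of China (2016QY01W0202, 2016YFB0800402); National Natural Science Foundation of China (U1836204, U1936108, 61572221, 61433006, U1401258, 61572222, 61502185); National Social Science Fund of China (16ZDA092).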
Keywords document-level event extraction; trigger-free; joint model; entity recognition; event detection
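
The argument extraction step described in the abstract (adding type information and distance-to-central-sentence information to each entity's semantic vector before the Transformer) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' implementation: the helper names `make_table` and `enrich_entities`, the random stand-ins for learned embedding tables, the element-wise sum used to combine the three vectors, and the clipping of distances to `max_dist` sentences.

```python
import random


def make_table(num_rows, dim, seed):
    """Hypothetical stand-in for a learned embedding table (random values)."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(num_rows)]


def enrich_entities(entities, center_sentence, type_table, dist_table, max_dist):
    """Add a type embedding and a sentence-distance embedding to each entity vector.

    Assumption: the three vectors are combined by element-wise sum; the
    abstract only says the information is embedded into the entity vector.
    """
    enriched = []
    for ent in entities:
        # Signed distance in sentences from the entity to the central sentence,
        # clipped to [-max_dist, max_dist], then shifted to a table index >= 0.
        dist = ent["sentence"] - center_sentence
        dist = max(-max_dist, min(max_dist, dist))
        dist_vec = dist_table[dist + max_dist]
        type_vec = type_table[ent["type_id"]]
        enriched.append(
            [v + t + d for v, t, d in zip(ent["vector"], type_vec, dist_vec)]
        )
    return enriched


DIM, NUM_TYPES, MAX_DIST = 4, 3, 3
type_table = make_table(NUM_TYPES, DIM, seed=0)
dist_table = make_table(2 * MAX_DIST + 1, DIM, seed=1)

# Toy entities: a semantic vector, an entity type id, and a sentence index.
entities = [
    {"vector": [0.5, 0.5, 0.5, 0.5], "type_id": 1, "sentence": 0},
    {"vector": [0.5, 0.5, 0.5, 0.5], "type_id": 1, "sentence": 10},
    {"vector": [0.5, 0.5, 0.5, 0.5], "type_id": 1, "sentence": 20},
]
enriched = enrich_entities(entities, center_sentence=0, type_table=type_table,
                           dist_table=dist_table, max_dist=MAX_DIST)
```

In this sketch the enriched vectors would then be passed to a Transformer encoder so that each entity can attend to the others before its argument role is classified; that part is omitted here. Clipping the distance keeps the table small and treats all far-away sentences alike, which is a design choice of the sketch rather than something the abstract states.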
