Abstract
Event extraction aims to extract event information from unstructured text and present it in structured form. Supervised event extraction methods are often limited by training corpora that are small, imbalanced across classes, and uneven in quality. Traditional methods based on feature engineering additionally suffer from error propagation, and the feature engineering itself is complex. To address these issues, this paper proposes an event extraction method that combines deep learning with active learning. The method incorporates the confidence of an RNN model's trigger classification into the query function of active learning, so as to improve the quality and efficiency of corpus annotation during active learning and, in turn, the final performance. Experimental results show that this joint approach helps improve event extraction performance, but also that the joint scheme leaves considerable room for improvement and merits further exploration.
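The abstract does not give the form of the query function, so the following is only a minimal sketch of one common choice, least-confidence sampling, assuming the classifier outputs a probability distribution over trigger classes for each unlabeled sample. The function name `least_confidence_query` and the example probabilities are hypothetical and are not taken from the paper.

```python
def least_confidence_query(probs, k):
    """Pick the k unlabeled samples the classifier is least sure about.

    probs: list of per-sample probability lists over trigger classes
           (assumed here to come from a classifier's softmax output).
    k:     number of samples to send to the annotator.
    """
    # Confidence of a sample = probability of its most likely class.
    confidence = [max(p) for p in probs]
    # Sort sample indices by ascending confidence (stable for ties).
    order = sorted(range(len(probs)), key=lambda i: confidence[i])
    return order[:k]


# Hypothetical class probabilities for four unlabeled candidate triggers.
pool_probs = [
    [0.90, 0.05, 0.05],
    [0.40, 0.35, 0.25],
    [0.55, 0.30, 0.15],
    [0.34, 0.33, 0.33],
]

selected = least_confidence_query(pool_probs, k=2)
print(selected)
```

In an active learning loop, the selected samples would be annotated, added to the training set, and the classifier retrained before the next query round.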
Authors
邱盈盈
洪宇
周文瑄
姚建民
朱巧明
QIU Yingying; HONG Yu; ZHOU Wenxuan; YAO Jianmin; ZHU Qiaoming (Provincial Key Laboratory of Computer Information Processing Technology, Soochow University, Suzhou, Jiangsu 215006, China)
Source
《中文信息学报》
CSCD
Peking University Core Journals (北大核心)
2018, No. 6, pp. 98-106 (9 pages)
Journal of Chinese Information Processing
Funding
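National Natural Science Foundation of China (61373097, 61672367, 61672368)
Jiangsu Provincial Science and Technology Program (BK20151222)
Ministry of Education-China Mobile Fund (MCM20150602)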