Abstract
Information extraction is one of the key technologies in natural language processing, and its primary task is the extraction of event elements. This paper studies the use of deep learning network models for the information extraction task. The training data come from the annotated CEC corpus built by Shanghai University. Compared with recognition based on hand-crafted rules and with a BiLSTM network model, the approach in this paper preprocesses the data and builds a BERT-BiLSTM-CRF deep network model, which is trained on the text data to perform annotation. Recognition accuracy improves for time, report time, and participants.
Authors
YANG Zhiting (杨芷婷); MA Hanjie (马汉杰)
School of Information Science and Technology, Zhejiang Sci-Tech University, Hangzhou 310018, China
Source
Intelligent Computer and Applications (《智能计算机与应用》)
2021, No. 6, pp. 14-19 (6 pages)
Keywords
BERT
Chinese emergency event
automatic annotation
information extraction