
Named Entity Recognition for Emergency Plans Based on Semi-supervised Learning and CRF (cited by: 1)

Entity Identification Based on Semi-supervised Learning and CRF
Abstract: Traditional statistical named entity recognition methods have the drawback of requiring large amounts of manual labeling, which results in low recognition accuracy. To improve recognition, this paper proposes a semi-supervised learning method based on conditional random fields (S-CRF) for identifying named entities. The method treats entity recognition as a sequence labeling problem: a small amount of data is labeled manually and used to build an entity set, the K-means clustering algorithm selects representative unlabeled texts for automatic labeling, and a conditional random field is trained and tested on the resulting corpus. In experiments on Chinese emergency plan documents, recognition accuracy on the B, M, and O labels reached 93.52%, 93.04%, and 95.81%, respectively. The results show that the method outperforms the traditional rule-based method and effectively improves named entity recognition for emergency plans.
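The two data-preparation steps named in the abstract (choosing representative unlabeled texts with K-means, then labeling them automatically from the entity set) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the character-bigram features, the deterministic centroid initialization, the forward-maximum-matching labeler, and the names `select_representatives` and `bmo_tags` are all hypothetical, and reading B/M/O as entity beginning, entity middle, and other is also an assumption. CRF training on the labeled output is not shown.

```python
from collections import Counter


def bigram_vector(text):
    """Bag of character bigrams (an assumed feature choice)."""
    return Counter(text[i:i + 2] for i in range(len(text) - 1))


def sq_dist(a, b):
    """Squared Euclidean distance between two sparse vectors."""
    return sum((a.get(k, 0) - b.get(k, 0)) ** 2 for k in set(a) | set(b))


def select_representatives(texts, k, iters=10):
    """Run plain K-means and return, for each non-empty cluster,
    the index of the text closest to that cluster's centroid."""
    vecs = [bigram_vector(t) for t in texts]
    k = min(k, len(vecs))
    centroids = [dict(v) for v in vecs[:k]]  # assumed: first k texts as seeds
    clusters = [[] for _ in range(k)]
    for _ in range(iters):
        # Assignment step: attach each text to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for idx, v in enumerate(vecs):
            nearest = min(range(k), key=lambda c: sq_dist(v, centroids[c]))
            clusters[nearest].append(idx)
        # Update step: recompute each centroid as the mean of its members.
        for c, members in enumerate(clusters):
            if members:
                total = Counter()
                for idx in members:
                    total.update(vecs[idx])
                centroids[c] = {key: val / len(members)
                                for key, val in total.items()}
    return [min(members, key=lambda idx: sq_dist(vecs[idx], centroids[c]))
            for c, members in enumerate(clusters) if members]


def bmo_tags(text, entities):
    """Tag each character B, M, or O by forward maximum matching
    against a known entity set (longest match wins)."""
    tags = ["O"] * len(text)
    max_len = max((len(e) for e in entities), default=0)
    i = 0
    while i < len(text):
        match = 0
        for n in range(min(max_len, len(text) - i), 0, -1):
            if text[i:i + n] in entities:
                match = n
                break
        if match:
            tags[i] = "B"
            for j in range(i + 1, i + match):
                tags[j] = "M"
            i += match
        else:
            i += 1
    return tags


# Hypothetical entity set and sentence, for illustration only.
entities = {"消防队", "指挥部"}
print(bmo_tags("消防队到达指挥部", entities))
# → ['B', 'M', 'M', 'O', 'O', 'B', 'M', 'M']
```

The character-level tag sequences produced this way are the kind of input a CRF sequence labeler would be trained on; the feature templates and CRF toolkit used in the paper are not stated in the abstract.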
Authors: LIU Tong (刘彤); WEI Jing (魏静); NI Wei-jian (倪维健); CHEN Si-yuan (陈思源) (College of Computer Science and Engineering, Shandong University of Science and Technology, Qingdao 266590, China)
Source: Software Guide (《软件导刊》), 2020, No. 3, pp. 35-38 (4 pages)
Funding: National Natural Science Foundation of China (71704096, 61602278); Qingdao Social Science Planning Project (QDSKL1801122).
Keywords: emergency plan; named entity recognition; conditional random field; semi-supervised learning