Named Entity Recognition for Specific Field with Annotated Data Scarcity
Cited by: 5
Abstract: To address the need for large amounts of annotated data in traditional named entity recognition (NER), an NER method for specific fields with scarce annotated data is proposed. First, following the idea of distant supervision, two specific dictionaries are used to pseudo-annotate text in the specific field. Then, the bidirectional encoder representations from Transformers (BERT) model is used for semantic smoothing and extension, and an AutoNER (automatic pseudo-annotation NER) model is trained on the noisy pseudo-annotated corpus. Finally, experiments comparing the method with a traditional machine learning method, the conditional random field (CRF), verify its effectiveness.
Authors: LIU Zhening; ZHU Conghui; ZHENG Dequan; ZHAO Tiejun (School of Computer Science and Technology, Harbin Institute of Technology, Harbin 150001, China)
Source: Command Information System and Technology, 2019, No. 5, pp. 14-18 (5 pages)
Funding: Supported by the National Key Research and Development Program of China (2017YFB1002102)
Keywords: named entity recognition (NER); distant supervision; semantic vector; data scarcity
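The dictionary-based pseudo-annotation step named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `pseudo_annotate`, the split into a typed entity dictionary and an untyped phrase list, the `UNK` label, and the plain BIO tagging scheme are all assumptions made for illustration. The abstract only states that two specific dictionaries are used, and the AutoNER model itself may use a different tagging scheme.

```python
def pseudo_annotate(tokens, typed_dict, untyped_phrases, max_len=5):
    """Label tokens by greedy longest-match lookup in two dictionaries.

    typed_dict      -- hypothetical mapping: token tuple -> entity type
    untyped_phrases -- hypothetical set of token tuples with unknown type
    Returns one BIO tag per token; unmatched tokens get "O".
    """
    tags = ["O"] * len(tokens)
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest candidate span first, then shrink.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            span = tuple(tokens[i:i + n])
            if span in typed_dict:
                label = typed_dict[span]
            elif span in untyped_phrases:
                label = "UNK"  # assumed placeholder for untyped matches
            else:
                continue
            tags[i] = "B-" + label
            for j in range(i + 1, i + n):
                tags[j] = "I-" + label
            i += n
            matched = True
            break
        if not matched:
            i += 1
    return tags


if __name__ == "__main__":
    # Toy dictionaries, invented for this example only.
    typed = {
        ("Harbin", "Institute", "of", "Technology"): "ORG",
        ("Harbin",): "LOC",
    }
    phrases = {("named", "entity", "recognition")}
    sentence = "Harbin Institute of Technology is in Harbin".split()
    print(list(zip(sentence, pseudo_annotate(sentence, typed, phrases))))
```

Labels produced this way are noisy, because dictionary coverage is incomplete and matches can be ambiguous, which is why the abstract describes training on a noisy pseudo-annotated corpus.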