Research on Named Entity Recognition Method in Specific Fields

Cited by: 8
Abstract: Named entity recognition (NER) in specific domains relies on many different methods, each tailored to its own field. Because text in each domain has distinctive features, a method developed for one domain adapts poorly to a new one. To address this, the paper proposes a method that combines conditional random fields (CRF), semi-supervised learning, and active learning in a unified framework intended to apply across specific domains. The method first selects basic, general features of the domain text to build a feature set and trains a CRF to produce an initial recognition of named entities in that domain. It then actively selects samples whose confidence falls below a chosen threshold for manual annotation and iteratively expands the training set to improve recognition. The method was evaluated on text from the rail transit domain; the experimental results indicate that it is effective and achieves good recognition performance there.
Author: ZHANG Lei (School of Computer and Information Technology, Beijing Jiaotong University, Beijing 100044, China)
Source: Computer and Modernization (《计算机与现代化》), 2018, No. 3, pp. 60-64 (5 pages)
Keywords: active learning; semi-supervised learning; conditional random field (CRF); named entity recognition (NER); specific domain
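
The loop described in the abstract (train, score unlabeled samples, send low-confidence ones for manual annotation, retrain) can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: `train` and `predict` are a toy per-token frequency tagger standing in for a real CRF, the sentence confidence is taken as the minimum per-token probability, `oracle` is a hypothetical callback representing the human annotator, `budget` caps manual labels per round, and treating high-confidence predictions as pseudo-labels is only one plausible reading of the semi-supervised step.

```python
from collections import Counter, defaultdict


def train(examples):
    """Toy stand-in for CRF training: count labels seen for each token."""
    counts = defaultdict(Counter)
    for tokens, labels in examples:
        for token, label in zip(tokens, labels):
            counts[token][label] += 1
    return counts


def predict(model, tokens):
    """Return (labels, confidence); confidence is the lowest token probability."""
    labels, confidence = [], 1.0
    for token in tokens:
        dist = model.get(token)
        if not dist:
            # Unseen token: fall back to "O" and report zero confidence.
            labels.append("O")
            confidence = 0.0
            continue
        label, count = dist.most_common(1)[0]
        labels.append(label)
        confidence = min(confidence, count / sum(dist.values()))
    return labels, confidence


def active_learning(seed, pool, oracle, threshold=0.8, budget=1, max_rounds=5):
    """Iteratively expand the training set from an unlabeled pool.

    Samples at or above `threshold` are added with their predicted labels
    (assumed semi-supervised step); samples below it are sent to `oracle`
    for manual annotation, at most `budget` per round.
    """
    labeled, pool, queries = list(seed), list(pool), 0
    model = train(labeled)
    for _ in range(max_rounds):
        remaining, asked, grew = [], 0, False
        for tokens in pool:
            predicted, confidence = predict(model, tokens)
            if confidence >= threshold:
                labeled.append((tokens, predicted))
                grew = True
            elif asked < budget:
                labeled.append((tokens, oracle(tokens)))
                asked += 1
                grew = True
            else:
                remaining.append(tokens)  # wait for a later round
        pool = remaining
        queries += asked
        if not grew:
            break  # nothing new to learn from
        model = train(labeled)
    return model, labeled, queries
```

In practice the toy tagger would be replaced by a CRF library that exposes marginal probabilities, and the threshold and per-round budget would be tuned on held-out data for the target domain.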