
Application of Named Entity Recognition in the Recognition of Words for Chinese Traditional Medicines and Chinese Medicine Formulae

Cited by: 1
Abstract Objective: To identify the names of Chinese traditional medicines and Chinese medicine formulae in text using Named Entity Recognition (NER), and to compare the performance of two NER methods on this task. Methods: The first method used an off-the-shelf word segmentation tool (such as the 'Jieba' Chinese word segmentation module) to segment the text into words, and then matched the resulting tokens against the names of Chinese traditional medicines and formulae. The second method built and trained a Bidirectional Long Short-Term Memory (BLSTM) neural network model specifically to recognize these names. Both methods were implemented, and their performance was then compared. Results: The existing word segmentation tools did not segment the names of Chinese traditional medicines and formulae accurately, which caused errors in the subsequent matching stage. By contrast, NER with the BLSTM model not only avoided segmentation errors but also showed a relatively strong ability to resolve ambiguity in the experiments. Conclusion: For recognizing the names of Chinese traditional medicines and Chinese medicine formulae, training a neural network model to recognize them directly is more suitable than first segmenting the text with a word segmentation tool and then matching.
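The failure mode reported for the first method (segmentation errors carrying over into the matching stage) can be illustrated with a small sketch. This is not the authors' code and not Jieba's algorithm: `forward_max_match` is a toy greedy segmenter standing in for an off-the-shelf tool, and the vocabularies, the example sentence, and all function names are hypothetical.

```python
# Toy illustration of segment-then-match entity lookup (method one).
# Everything here is a hypothetical stand-in: the vocabularies are made up,
# and the greedy segmenter is NOT how Jieba works internally.

GENERAL_VOCAB = {"患者", "服用", "白头", "好转"}  # general-purpose words only
HERB_LEXICON = {"白头翁"}                         # target entity names


def forward_max_match(text, vocab, max_len=4):
    """Greedy longest-match segmentation; unknown characters become 1-char tokens."""
    tokens = []
    i = 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in vocab:
                tokens.append(piece)
                i += size
                break
    return tokens


def match_entities(tokens, lexicon):
    """Keep only tokens that exactly equal a lexicon entry."""
    return [tok for tok in tokens if tok in lexicon]


sentence = "患者服用白头翁后好转"

# The general vocabulary knows "白头" but not "白头翁", so the herb name is split.
tokens = forward_max_match(sentence, GENERAL_VOCAB)
print(tokens)                                # ['患者', '服用', '白头', '翁', '后', '好转']
print(match_entities(tokens, HERB_LEXICON))  # []

# Adding the herb names to the segmenter's vocabulary repairs this sentence.
tokens = forward_max_match(sentence, GENERAL_VOCAB | HERB_LEXICON)
print(match_entities(tokens, HERB_LEXICON))  # ['白头翁']
```

In this sketch the match fails only because the token boundary falls inside the entity name, which is the kind of error the abstract attributes to segmenting before matching. A sequence-labelling model such as the BLSTM in the second method assigns labels to the text directly and therefore does not depend on those token boundaries.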
Authors Gong Deshan (龚德山); Liang Wenyu (梁文昱); Zhang Bingzhu (张冰珠); Ma Xingguang (马星光) (Beijing University of Chinese Medicine, Beijing 100029, China)
Affiliation Beijing University of Chinese Medicine
Source Chinese Pharmaceutical Affairs (《中国药事》), CAS, 2019, Issue 6, pp. 710-716 (7 pages)
Funding Fundamental Research Funds for the Central Universities (No. 2018-JYB-XSCXCY47)
Keywords natural language processing; Named Entity Recognition; BLSTM neural network; Chinese word segmentation