Study on BiGRU-CRF Named Entity Extraction Based on Word Vector for Lung Cancer Medical Cases with Four Diagnostic Methods (Cited by: 4)
Abstract
Objective: Lung cancer medical cases contain rich four-diagnosis information, which is of great importance to lung cancer research. This study extracts four-diagnosis information entities using a BiGRU-CRF method based on character vectors.
Methods: Lung cancer clinical data were first annotated automatically using a custom dictionary. The BERT model was then pretrained on the annotated data to obtain character vectors that carry contextual semantics. These vectors were used as input to a BiGRU-CRF model to perform named entity extraction of four-diagnosis information from lung cancer medical cases.
Results: For the five entity types (clinical manifestations, tongue manifestations, pulse manifestations, body parts, and degree adverbs), the proposed method achieved F1 values of 98.17%, 99.74%, 99.77%, 94.72%, and 93.36%, respectively. The corresponding F1 values of the comparison models were (96.46%, 99.31%, 98.78%, 94.95%, 92.44%) for BERT-BiLSTM-CRF, (94.38%, 95.14%, 94.99%, 90.89%, 91.82%) for BERT, and (91.27%, 97.95%, 98.09%, 87.01%, 86.77%) for Word2vec-BiGRU-CRF.
Conclusion: The character-vector-based BiGRU-CRF method used in this paper shows stronger named entity recognition ability and can be better applied to named entity extraction from TCM medical cases, which in turn supports relation extraction from medical cases and knowledge graph construction.
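The dictionary-based automatic annotation step described in the Methods might look roughly like the sketch below. The abstract does not specify the tagging scheme, the label names, the dictionary contents, or the matching strategy, so the BIO scheme, the labels (`SYMPTOM`, `TONGUE`, `PULSE`, `BODY`), the entries in `LEXICON`, and the greedy longest-match rule are all assumptions made for illustration.

```python
def bio_tag(text, lexicon):
    """Label each character with a BIO tag using greedy longest-match lookup.

    `lexicon` maps an entity string to its (hypothetical) entity label.
    """
    tags = ["O"] * len(text)
    max_len = max((len(term) for term in lexicon), default=0)
    i = 0
    while i < len(text):
        match_len = 0
        # Try the longest candidate first so "胸痛" wins over "胸".
        for length in range(min(max_len, len(text) - i), 0, -1):
            if text[i:i + length] in lexicon:
                match_len = length
                break
        if match_len:
            label = lexicon[text[i:i + match_len]]
            tags[i] = "B-" + label
            for j in range(i + 1, i + match_len):
                tags[j] = "I-" + label
            i += match_len
        else:
            i += 1
    return list(zip(text, tags))


# Hypothetical dictionary entries; not taken from the paper.
LEXICON = {
    "咳嗽": "SYMPTOM",   # cough
    "胸痛": "SYMPTOM",   # chest pain
    "胸": "BODY",        # chest
    "舌红": "TONGUE",    # red tongue
    "脉细数": "PULSE",   # thready, rapid pulse
}

for char, tag in bio_tag("咳嗽胸痛,舌红,脉细数", LEXICON):
    print(char, tag)
```

Each character receives one tag, which matches the character-level input that a BERT-style encoder followed by a BiGRU-CRF tagger would consume. Greedy longest match is only one possible policy; it cannot represent nested entities such as a body part inside a symptom.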
Authors: Qu Dandan (屈丹丹); Yang Tao (杨涛); Zhu Yao (朱垚); Hu Kongfa (胡孔法) (School of Artificial Intelligence and Information Technology, Nanjing University of Chinese Medicine, Nanjing 210023, China; The First Clinical Medical College, Nanjing University of Chinese Medicine, Nanjing 210023, China)
Source: Modernization of Traditional Chinese Medicine and Materia Medica-World Science and Technology (《世界科学技术-中医药现代化》), indexed in CSCD and the Peking University Core Journals list, 2021, No. 9, pp. 3118-3125 (8 pages)
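Funding: National Key R&D Program of the Ministry of Science and Technology of China, key special project "Research on Modernization of Traditional Chinese Medicine" (2017YFC1703500): Construction of a TCM Big Data Center and Health Cloud Platform, principal investigator: Li Guozheng (李国正). General Program of the National Natural Science Foundation of China (82074580): Study of the medication patterns and mechanisms of modern renowned veteran TCM physicians in diagnosing and treating lung cancer, based on knowledge graphs, principal investigator: Hu Kongfa (胡孔法).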
Keywords: BERT model; BiGRU-CRF model; Lung cancer; Information of four diagnostic methods; Entity extraction
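The per-entity F1 values reported in the abstract are easier to compare when tabulated. The sketch below copies those figures and reports the best-scoring model for each entity type. The unweighted mean across entity types is not reported by the authors; it is computed here only as a rough illustrative summary, and the model label for the proposed method is a descriptive name chosen for this example.

```python
ENTITY_TYPES = [
    "clinical manifestations",
    "tongue manifestations",
    "pulse manifestations",
    "body parts",
    "degree adverbs",
]

# F1 values (%) copied from the abstract, in ENTITY_TYPES order.
F1 = {
    "BERT char vectors + BiGRU-CRF (proposed)": [98.17, 99.74, 99.77, 94.72, 93.36],
    "BERT-BiLSTM-CRF": [96.46, 99.31, 98.78, 94.95, 92.44],
    "BERT": [94.38, 95.14, 94.99, 90.89, 91.82],
    "Word2vec-BiGRU-CRF": [91.27, 97.95, 98.09, 87.01, 86.77],
}


def mean_f1(scores):
    """Unweighted mean over entity types (an assumption, not a paper metric)."""
    return sum(scores) / len(scores)


def best_per_entity(f1_table, entity_types):
    """Return {entity type: (best model, F1)} for each entity type."""
    best = {}
    for idx, entity in enumerate(entity_types):
        model = max(f1_table, key=lambda name: f1_table[name][idx])
        best[entity] = (model, f1_table[model][idx])
    return best


for model, scores in F1.items():
    print(f"{model}: mean F1 = {mean_f1(scores):.2f}%")

for entity, (model, score) in best_per_entity(F1, ENTITY_TYPES).items():
    print(f"{entity}: {model} ({score:.2f}%)")
```

On these figures the proposed method has the highest F1 on four of the five entity types, while BERT-BiLSTM-CRF is marginally ahead on body parts (94.95% versus 94.72%).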