Chinese Medical Entity Recognition Based on Multiple Attention Mechanism

Abstract: Medical entity recognition identifies medical entities of multiple types, such as diseases, symptoms, and drugs, in medical texts. It supports downstream tasks such as knowledge graphs and smart healthcare, and is of both theoretical and practical value. Existing Named Entity Recognition (NER) models extract relatively limited semantic features and have insufficient ability to understand the semantics of medical texts. To address these problems, this study proposes MANM, a neural network model based on multiple attention mechanisms. To capture richer semantic features, prior knowledge of medical vocabulary is introduced into the model input, and a self-attention mechanism extracts the global semantic features of the medical text. A bilinear attention mechanism then extracts latent semantic features at the word and character levels, yielding feature vectors that encode the dependencies between characters and words. To improve the model's ability to capture contextual information, an improved Long Short-Term Memory (LSTM) network extracts the sequential features of the text, and a multi-head self-attention mechanism captures the implicit associations between words. Finally, these multi-level semantic features are fused, and a Conditional Random Field (CRF) performs entity recognition. Comparative experiments on the public datasets CMeEE, CCKS2019, and CCKS2020 show that MANM achieves F1-scores of 64.29%, 86.12%, and 90.32%, respectively, which verifies the effectiveness of the proposed method for medical entity recognition.
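
The final step named in the abstract, entity recognition with a CRF, is commonly decoded with the Viterbi algorithm. The abstract does not describe MANM's CRF layer in detail, so the following is only a minimal sketch of generic Viterbi decoding. The tag set (`O`, `B-dis`, `I-dis`), the emission scores, and the transition scores are all hypothetical values chosen for illustration; they are not taken from the paper.

```python
from typing import Dict, List

def viterbi_decode(
    emissions: List[Dict[str, float]],
    transitions: Dict[str, Dict[str, float]],
    tags: List[str],
) -> List[str]:
    """Return the highest-scoring tag sequence for one sentence."""
    # best[t][tag] holds the best score of any path ending in `tag` at position t.
    best = [{tag: emissions[0][tag] for tag in tags}]
    back: List[Dict[str, str]] = []
    for t in range(1, len(emissions)):
        scores: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for cur in tags:
            # Pick the previous tag that maximizes path score plus transition score.
            prev = max(tags, key=lambda p: best[-1][p] + transitions[p][cur])
            scores[cur] = best[-1][prev] + transitions[prev][cur] + emissions[t][cur]
            pointers[cur] = prev
        best.append(scores)
        back.append(pointers)
    # Follow the back-pointers from the best final tag to recover the path.
    path = [max(tags, key=lambda tag: best[-1][tag])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]

# Hypothetical BIO tag set for a single entity type ("dis" = disease).
TAGS = ["O", "B-dis", "I-dis"]

# Hypothetical transition scores: everything is neutral except O -> I-dis,
# which is heavily penalized because an entity cannot start with an I- tag.
transitions = {p: {c: 0.0 for c in TAGS} for p in TAGS}
transitions["O"]["I-dis"] = -1e4

# Made-up per-character emission scores for a four-character sentence.
emissions = [
    {"O": 2.0, "B-dis": 0.5, "I-dis": 0.1},
    {"O": 1.8, "B-dis": 0.3, "I-dis": 0.2},
    {"O": 0.2, "B-dis": 1.0, "I-dis": 1.2},
    {"O": 0.1, "B-dis": 0.4, "I-dis": 1.5},
]

decoded = viterbi_decode(emissions, transitions, TAGS)
print(decoded)
```

In this toy input, picking the highest emission score independently at each position would tag the third character as `I-dis` directly after an `O`, which is not a valid BIO sequence. Decoding over the whole sequence with transition scores avoids that, which is the usual motivation for placing a CRF on top of the fused features.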
Authors: CHEN Ming; LIU Rong; ZHANG Ye (College of Physics Science and Technology, Central China Normal University, Wuhan 430079, China)
Source: Computer Engineering (《计算机工程》), indexed in CAS, CSCD, and the Peking University Core Journals list; 2023, Issue 6, pp. 314-320 (7 pages)
Funding: Key Project of the National Social Science Fund of China (22ATQ004).
Keywords: Named Entity Recognition (NER); medical text; attention mechanism; Long Short-Term Memory (LSTM) network; semantic feature