Abstract
Electronic medical records (EMRs) are among the most important carriers of a patient's condition and treatment course in clinical care, and the relations among the many kinds of entities they contain encode a large amount of medical information about the patient's health. Deep mining of EMR text is therefore an effective means of acquiring medical knowledge and analyzing a patient's condition. However, the high-density distribution of entities and the cross-linked relations among them pose great challenges for entity relation extraction from EMRs, severely limiting relation extraction methods designed for the general domain. To address this textual difference, this paper proposes a "recurrent+transformer" neural network architecture built on a multi-channel self-attention mechanism. Compared with the mainstream "recurrent+CNN" architecture, it strengthens the model's ability to capture sentence-level semantic features, improves its ability to learn the specific characteristics of EMR text, and significantly reduces overall model complexity. In addition, two weight-based auxiliary training methods are proposed under this architecture: a class-weighted cross-entropy loss function and a weight-based position embedding. The former mitigates the training bias caused by imbalanced relation categories, improving the model's generality on real-world data distributions and accelerating its convergence in parameter space; the latter further amplifies the importance of character position information to improve the training of the transformer network. In comparative experiments, six models from current mainstream methods were selected as baselines and evaluated on the 2010 i2b2/VA and SemEval 2013 DDI medical corpora. Compared with traditional self-attention, the multi-channel self-attention mechanism improves the overall F1 score by up to 10.67%; in fine-grained per-category comparisons, the class-weighted loss function improves the F1 score on small-sample categories by nearly 23.55%.
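To make the multi-channel self-attention idea concrete, the following is a minimal PyTorch sketch in which each channel applies its own scaled dot-product self-attention over the sequence and the per-channel outputs are fused by a linear projection. The class name, the default channel count, and the concatenate-then-project fusion are illustrative assumptions; the abstract does not specify the paper's exact formulation.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class MultiChannelSelfAttention(nn.Module):
        """Sketch: each channel has its own Q/K/V projections and runs
        scaled dot-product self-attention; channel outputs are concatenated
        and projected back to d_model. (Hypothetical formulation.)"""
        def __init__(self, d_model: int, num_channels: int = 4):
            super().__init__()
            self.d_model = d_model
            # One fused Q/K/V projection per channel.
            self.qkv = nn.ModuleList(
                nn.Linear(d_model, 3 * d_model) for _ in range(num_channels)
            )
            self.fuse = nn.Linear(num_channels * d_model, d_model)

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            # x: (batch, seq_len, d_model), e.g. hidden states of a recurrent layer
            outputs = []
            for proj in self.qkv:
                q, k, v = proj(x).chunk(3, dim=-1)
                scores = q @ k.transpose(-2, -1) / (self.d_model ** 0.5)
                outputs.append(F.softmax(scores, dim=-1) @ v)
            return self.fuse(torch.cat(outputs, dim=-1))

Unlike multi-head attention, which splits d_model across heads, this sketch keeps the full dimensionality in every channel, which is one plausible way to widen the attention view of sentence-level semantics.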
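The class-weighted cross-entropy loss can be expressed directly with PyTorch's nn.CrossEntropyLoss(weight=...). The sketch below assumes inverse-frequency weighting; the class counts are hypothetical, and the paper's actual weighting scheme may differ.

    import torch
    import torch.nn as nn

    # Hypothetical relation-class counts; real counts come from the corpus.
    class_counts = torch.tensor([5000.0, 1200.0, 300.0, 80.0])
    # Inverse-frequency weights, normalized so they average to 1:
    # rare classes receive larger weights and thus larger gradients.
    weights = class_counts.sum() / (len(class_counts) * class_counts)

    criterion = nn.CrossEntropyLoss(weight=weights)
    logits = torch.randn(16, 4)          # (batch, num_classes) model outputs
    labels = torch.randint(0, 4, (16,))  # gold relation labels
    loss = criterion(logits, labels)     # misclassifying a rare class costs more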
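One simple reading of the weight-based position embedding is a learned position table whose contribution is scaled by a trainable weight before being added to the token embeddings, allowing training to amplify position information. The scalar-weight formulation below is only an assumption; the abstract does not give the exact scheme.

    import torch
    import torch.nn as nn

    class WeightedPositionEmbedding(nn.Module):
        """Sketch: token embeddings plus position embeddings scaled by a
        trainable weight alpha. (Hypothetical formulation.)"""
        def __init__(self, max_len: int, d_model: int):
            super().__init__()
            self.pos_emb = nn.Embedding(max_len, d_model)
            self.alpha = nn.Parameter(torch.ones(1))  # trainable position weight

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            # x: (batch, seq_len, d_model) token embeddings
            positions = torch.arange(x.size(1), device=x.device)
            return x + self.alpha * self.pos_emb(positions)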
Authors
宁尚明
滕飞
李天瑞
NING Shang-Ming; TENG Fei; LI Tian-Rui (School of Information Science and Technology, Southwest Jiaotong University, Chengdu 611756; Institute of Artificial Intelligence, Southwest Jiaotong University, Chengdu 611756)
Source
《计算机学报》 (Chinese Journal of Computers)
Indexed in: EI, CSCD, Peking University Core Journals (北大核心)
2020, No. 5, pp. 916-929 (14 pages)
Funding
Supported by the National Natural Science Foundation of China (61572407) and the Science and Technology Program of Sichuan Province (2017SZYZF0002).