Abstract
To match patient questions drawn from different disease categories, this paper builds a question-pair similarity model on pre-trained BERT. To better capture the characteristics of questions from different disease categories, it proposes a joint learning model in which disease classification serves as an auxiliary task to question matching. Three model fusion strategies are then applied in the ensemble phase: combining models of different types, models of the same type trained on different data splits, and models of the same type taken from different training stages (epochs), with the aim of improving the generalization ability of the matching model. In experiments on a medical question matching dataset, the proposed ensemble model outperforms existing text matching models.
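As an illustration only, the joint objective and the ensemble step described in the abstract might be sketched as follows. The abstract does not state how the two task losses are weighted or how member predictions are fused, so the weighted sum with a hypothetical `aux_weight` and the simple probability averaging (soft voting) below are assumptions, not the authors' reported method. All function names are hypothetical.

```python
import math


def cross_entropy(probs, label):
    """Negative log-likelihood of the gold label under a probability vector."""
    return -math.log(probs[label])


def joint_loss(match_probs, match_label, disease_probs, disease_label, aux_weight=0.5):
    """Matching loss plus a weighted auxiliary disease-classification loss.

    aux_weight is an assumed hyperparameter; the abstract gives no value.
    """
    main = cross_entropy(match_probs, match_label)
    aux = cross_entropy(disease_probs, disease_label)
    return main + aux_weight * aux


def soft_vote(member_probs):
    """Average P(match) over ensemble members.

    Members could be different model types, the same model trained on
    different data splits, or checkpoints from different epochs.
    """
    return sum(member_probs) / len(member_probs)


def predict(member_probs, threshold=0.5):
    """Return 1 (questions match) if the averaged probability reaches the threshold."""
    return int(soft_vote(member_probs) >= threshold)
```

For example, `predict([0.9, 0.6, 0.3])` averages the three member probabilities and compares the result with the threshold. In practice the probabilities would come from the softmax outputs of the fine-tuned BERT models rather than from hand-written lists.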
Authors
XU Shiyao (徐诗瑶); XIANG Yang (向阳); LEI Jianbo (雷健波)
College of Electronic and Information Engineering, Tongji University, Shanghai 201804, China
Source
Chinese Journal of Health Informatics and Management (《中国卫生信息管理杂志》)
2021, No. 4, pp. 556-560, 566 (6 pages)
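Funding
National Natural Science Foundation of China, General Program: "Exploration of Methods for Unstructured Clinical Text Mining Based on Deep Learning and Transfer Learning" (Grant No. 81771937).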