Medical Knowledge Graph Question-Answering System Based on Hybrid Dynamic Masking and Multi-strategy Fusion

Abstract: Medical knowledge graph question answering combines medical knowledge with natural language processing to provide accurate and fast question-answering services for medical practitioners and patients. However, as data volumes surge, existing Chinese medical knowledge graphs are not comprehensive enough, and because medical questions are complex and ambiguous, accurately identifying entity information and generating easily understood answers remain challenging. This paper proposes a medical knowledge graph question-answering framework based on hybrid dynamic masking and multi-strategy fusion. First, a medical knowledge graph containing 34,167 entities and 297,463 relations is constructed by integrating public datasets with disease knowledge from medical platforms, covering categories such as diseases, drugs, and foods. Second, a BERT-MaskAttention-BiLSTM-CRF hybrid dynamic masking model is proposed to accurately identify medical entities in the input, focusing more effectively on important content and removing interference from redundant information. Finally, an entity alignment strategy unifies and standardizes medical entities, an intent recognition strategy captures the user's query intent, and a large language model polishes the knowledge graph output so that answers are easier to understand. Experimental results show that the model achieves a macro-average F1 score of 0.9602 in the entity recognition comparison experiments and an average accuracy of 0.9656 in the question-answering tests, and that the generated content is easier to understand and highly interpretable.
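The abstract reports a macro-average F1 score for entity recognition. As a minimal sketch of what that metric means, the Python below computes F1 separately for each entity type and then takes the unweighted mean. The function name `macro_f1`, the tuple layout `(sentence_id, start, end, type)`, exact-span matching, and the toy data are all assumptions made for illustration; the paper's actual evaluation protocol is not described in this record.

```python
from collections import defaultdict


def macro_f1(gold, pred):
    """Macro-averaged F1 over entity types.

    gold, pred: sets of (sentence_id, start, end, entity_type) tuples.
    An entity counts as correct only on an exact match (an assumption).
    """
    tp = defaultdict(int)  # true positives per entity type
    fp = defaultdict(int)  # false positives per entity type
    fn = defaultdict(int)  # false negatives per entity type

    for item in pred:
        if item in gold:
            tp[item[-1]] += 1
        else:
            fp[item[-1]] += 1
    for item in gold:
        if item not in pred:
            fn[item[-1]] += 1

    types = set(tp) | set(fp) | set(fn)
    scores = []
    for t in sorted(types):
        precision = tp[t] / (tp[t] + fp[t]) if tp[t] + fp[t] else 0.0
        recall = tp[t] / (tp[t] + fn[t]) if tp[t] + fn[t] else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        scores.append(f1)
    # Unweighted mean: every entity type counts equally, however rare.
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical toy data: one disease span is predicted with a wrong boundary.
gold = {(0, 0, 2, "disease"), (0, 5, 7, "drug"),
        (1, 0, 3, "disease"), (1, 4, 6, "food")}
pred = {(0, 0, 2, "disease"), (0, 5, 7, "drug"),
        (1, 0, 2, "disease"), (1, 4, 6, "food")}
print(round(macro_f1(gold, pred), 4))
```

Because the mean is unweighted, a rare entity type affects the score as much as a frequent one, which is one reason macro averaging is often preferred when categories such as diseases, drugs, and foods are unevenly represented.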

Authors: WANG Runzhou; ZHANG Xinsheng (School of Management, Xi'an University of Architecture and Technology, Xi'an 710055, China)

Affiliation: School of Management, Xi'an University of Architecture and Technology

Source: Journal of Frontiers of Computer Science and Technology (《计算机科学与探索》), CSCD, Peking University Core Journal, 2024, No. 10, pp. 2770-2786 (17 pages)

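Funding: Shaanxi Province Key Industry Innovation Chain (Group) Project in the Industrial Field (2022ZDLGY06-04); Shaanxi Social Science Community Joint Research Project on Major Theoretical and Practical Issues (2022HZ1522).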

Keywords: hybrid dynamic masking; multi-strategy fusion; knowledge graph; medical question-answering; large language model

Classification: TP391 [Automation and Computer Technology: Computer Application Technology]

