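Abstract
[Purpose/significance] In recent years, major chronic diseases have seriously endangered people's health. Using knowledge graphs to mine the prevention and control factors and therapeutic drugs for chronic diseases supports sound health management of these diseases. [Method/process] This paper proposes a semi-joint knowledge extraction method: taking the abstracts and titles of English-language biomedical literature as the dataset, it extracts the entities in the text and their relations simultaneously with pre-trained models such as BioBERT, and obtains triples with a rule-based method. A knowledge graph for chronic disease health management is then constructed, and graph embedding models such as RotatE are used for knowledge reasoning. [Result/conclusion] With diabetes as the empirical case, the semi-joint extraction method achieved good extraction performance; after multi-source data were fused, a diabetes health management knowledge graph containing about 55032 entities and 1010233 triples was constructed. Preventive factors for diabetes, such as healthy lifestyles, and potential therapeutic drugs were mined in depth, giving patients more comprehensive prevention and treatment advice and giving researchers new directions for drug development.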
[...] abstracts and titles of English-language biomedical literature as a dataset, simultaneously extracts entities and their relations in the text through pre-trained models such as BioBERT, and uses a rule-based approach to obtain the triples. Finally, the paper constructs a knowledge graph for chronic disease health management and uses graph embedding models such as RotatE for knowledge reasoning. [Result/conclusion] Taking diabetes as an empirical study, we found that the semi-joint extraction method performs well in extraction, and after fusing data from multiple sources, a diabetes health management knowledge graph containing about 55032 entities and 101023 triples was constructed. Preventive factors for diabetes, such as a healthy lifestyle, and potential therapeutic drugs were mined in depth, providing more comprehensive prevention and treatment recommendations for patients and a new direction for drug research and development for researchers.
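The abstract names RotatE as one of the graph embedding models used for knowledge reasoning. As a rough illustration of how such a model scores a candidate triple, the sketch below implements the standard RotatE distance: each relation is a vector of rotation angles, and a triple (head, relation, tail) is plausible when rotating the head's complex coordinates by those angles lands close to the tail. This is a minimal sketch, not the authors' implementation; the function names `rotate_distance` and `rank_tails`, the entity labels, and the 2-dimensional toy vectors are all assumptions made up for illustration.

```python
import cmath
import math


def rotate_distance(head, phases, tail):
    """RotatE distance: sum over dimensions of |h_i * e^{i*theta_i} - t_i|."""
    if not (len(head) == len(phases) == len(tail)):
        raise ValueError("head, phases and tail must have the same dimension")
    return sum(
        abs(h * cmath.rect(1.0, theta) - t)  # rotate h by theta, compare with t
        for h, theta, t in zip(head, phases, tail)
    )


def rank_tails(head, phases, candidates):
    """Sort candidate tail names by increasing distance (most plausible first)."""
    return sorted(
        candidates,
        key=lambda name: rotate_distance(head, phases, candidates[name]),
    )


# Toy 2-dimensional embeddings, invented for illustration only (not from the paper).
drug = [1 + 0j, 0 + 1j]
treats = [math.pi / 2, math.pi]  # a relation is one rotation angle per dimension
candidates = {
    "disease_a": [0 + 1j, 0 - 1j],  # exactly `drug` rotated by `treats`
    "disease_b": [1 + 0j, 0 + 1j],  # identical to `drug`, so the rotation misses it
}

ranking = rank_tails(drug, treats, candidates)
print(ranking)
```

In a trained model the embeddings and angles would be learned from the graph's triples, and ranking candidate tails in this way is one common route to proposing links that are missing from the graph, such as potential drug and disease pairs.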
Authors
Hong Yimin (洪怡敏)
Zhang Han (张晗)
Bai Zhiying (白智瑛)
School of Health Management, China Medical University, Shenyang, Liaoning 110122
Source
Information Studies: Theory & Application (《情报理论与实践》)
PKU Core Journal (北大核心)
2024, No. 8, pp. 180-189, 210 (11 pages in total)
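Funding
An outcome of the 2021 Basic Scientific Research Project (General Program, Humanities and Social Sciences) of the Education Department of Liaoning Province, "Research on a Deep Learning-Based Question-Answering Model for Online Users' Chronic Disease Health Education", project No. LJKR0275.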
Keywords
major chronic diseases
knowledge graph
natural language processing
knowledge extraction
health management