Abstract
Objective To propose a drug pair extraction algorithm for prescription data that integrates co-occurrence and semantic information. Methods The prescription data were transformed into matrix form, and the association information between drugs was calculated as the initial screening index. Word vectors were then constructed from the prescription data, and the semantic similarity between drugs was calculated as the second screening index, so as to extract potential drug pairs. The proposed algorithm and the classical Apriori algorithm were each applied to 1090 lung cancer outpatient prescriptions, and the extraction results were compared and analyzed to verify the effectiveness and practicality of the proposed algorithm. Results Compared with the Apriori algorithm, the proposed algorithm extracted drug pairs more effectively and could reasonably narrow the range of candidate drug pairs even when drug frequencies differed widely. In addition, the results under different thresholds were compared, and two sets of recommended thresholds were derived for this dataset, based on the change in the number of extracted pairs and on expert experience. Under these recommended settings, 88 and 33 drug pairs, respectively, were successfully extracted from the medical cases. Conclusion Combining word frequency with semantic information to screen potential drug pairs is feasible and effective, and can provide a methodological reference for mining medication experience from clinical prescriptions in Chinese medicine.
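The two-stage screening described in the Methods can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not name the association measure or the word-vector model, so the sketch assumes a Jaccard coefficient for the co-occurrence stage and uses cosine similarity over positive-PMI co-occurrence vectors as a simple stand-in for trained word vectors. The function names and the thresholds `assoc_min` and `sim_min` are hypothetical.

```python
from itertools import combinations
from math import log, sqrt


def cooccurrence_counts(prescriptions):
    """Count how often each drug and each unordered drug pair appears."""
    freq, pair = {}, {}
    for rx in prescriptions:
        drugs = sorted(set(rx))  # one count per prescription
        for d in drugs:
            freq[d] = freq.get(d, 0) + 1
        for a, b in combinations(drugs, 2):
            pair[(a, b)] = pair.get((a, b), 0) + 1
    return freq, pair


def pair_count(pair, a, b):
    """Look up a pair count regardless of argument order."""
    return pair.get((a, b) if a < b else (b, a), 0)


def jaccard(freq, pair, a, b):
    """Assumed association measure: shared prescriptions / union."""
    both = pair_count(pair, a, b)
    return both / (freq[a] + freq[b] - both)


def ppmi_vectors(n, freq, pair):
    """Stand-in for word vectors: one positive-PMI row per drug."""
    drugs = sorted(freq)
    vectors = {}
    for a in drugs:
        row = []
        for b in drugs:
            both = pair_count(pair, a, b) if a != b else 0
            pmi = log(both * n / (freq[a] * freq[b])) if both else 0.0
            row.append(max(pmi, 0.0))
        vectors[a] = row
    return vectors


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(u, v))
    nu = sqrt(sum(x * x for x in u))
    nv = sqrt(sum(y * y for y in v))
    return dot / (nu * nv) if nu and nv else 0.0


def extract_pairs(prescriptions, assoc_min, sim_min):
    """Keep pairs that pass the association filter, then the similarity filter."""
    freq, pair = cooccurrence_counts(prescriptions)
    vectors = ppmi_vectors(len(prescriptions), freq, pair)
    kept = []
    for a, b in sorted(pair):
        assoc = jaccard(freq, pair, a, b)
        if assoc < assoc_min:
            continue  # fails the first (co-occurrence) screen
        sim = cosine(vectors[a], vectors[b])
        if sim >= sim_min:
            kept.append((a, b, assoc, sim))
    return kept


# Toy prescriptions with placeholder drug names (not data from the paper).
toy = [["A", "B", "C"], ["A", "B", "C"], ["A", "B", "D"],
       ["C", "D", "E"], ["D", "E"]]
candidates = extract_pairs(toy, assoc_min=0.6, sim_min=0.01)
```

Applying the cheap co-occurrence filter first means similarity is computed only for pairs that survive it, which mirrors the order of the two screening indices in the abstract. In practice, both thresholds would need tuning on real prescription data.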
Authors
唐静
杨涛
朱垚
胡孔法
Tang Jing; Yang Tao; Zhu Yao; Hu Kongfa (School of Artificial Intelligence and Information Technology, Nanjing University of Chinese Medicine, Nanjing 210023, China; Jiangsu Provincial TCM Technology Engineering Research Center of Health and Health Preservation, Nanjing 210023, China; The First Clinical Medical College, Nanjing University of Chinese Medicine, Nanjing 210023, China; Jiangsu Collaborative Innovation Center of Traditional Chinese Medicine in Prevention and Treatment of Tumor, Nanjing 210023, China)
Source
《世界科学技术-中医药现代化》
CSCD
Peking University Core Journal (北大核心)
2024, Issue 1, pp. 88-98 (11 pages)
Modernization of Traditional Chinese Medicine and Materia Medica-World Science and Technology
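Funding
National Natural Science Foundation of China, General Program (82074580): Knowledge-graph-based study of the medication patterns and mechanisms of modern renowned veteran TCM physicians in the diagnosis and treatment of lung cancer. Principal investigator: Hu Kongfa
Jiangsu Provincial Department of Education talent project (2021): Jiangsu Universities "Qinglan Project" for Outstanding Young Backbone Teachers. Principal investigator: Yang Tao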
Keywords
Drug pair screening
Drug co-occurrence
Semantic information
Word vector
Data mining