Abstract
Data mining of traditional Chinese medicine (TCM) prescriptions is an important method for passing on the experience of renowned physicians and for developing new drugs. However, current work in this area suffers from problems such as suboptimal study designs and non-standard statistical practice. This article summarizes the main problems and corresponding solutions in four areas. ① Study design should take efficacy and the quality of individual cases into account. ② The meaning of differences in the confidence ranking of association rules requires further consideration, and lift should not be ignored. ③ Cluster analysis involves complex steps. The selection of clustering variables should jointly consider herb frequency, network topology parameters, and practical significance. The choice of distance measure and clustering method should be adapted to the characteristics of TCM clinical data, and the Jaccard distance and its improved variants deserve more attention in future work. A single, unexplained clustering result should not be presented; instead, the final clustering solution should be chosen by weighing TCM clinical characteristics together with objective clustering evaluation indices. ④ When calculating correlation coefficients, algorithms suited only to continuous variables should not be applied to binary variables. Drawing on the characteristics of TCM clinical research and on statistical principles, the article explains these problems and offers corresponding recommendations as a reference for future data mining studies.
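As an illustration of points ② and ③, the following minimal sketch computes support, confidence, and lift for an association rule, and the Jaccard distance between two herbs, on a small made-up data set. The herb labels, the toy prescriptions, and all function names are assumptions introduced here for illustration; none of them come from the article, and this is not the authors' implementation.

```python
# Hypothetical toy data: each prescription is the set of herbs it contains.
prescriptions = [
    {"A", "B", "C"},
    {"A", "B"},
    {"A", "C"},
    {"B", "D"},
    {"A", "B", "D"},
]


def support(items):
    """Fraction of prescriptions that contain every herb in `items`."""
    items = set(items)
    return sum(items <= p for p in prescriptions) / len(prescriptions)


def confidence(antecedent, consequent):
    """Estimated P(consequent | antecedent) for the rule antecedent -> consequent."""
    return support(set(antecedent) | set(consequent)) / support(antecedent)


def lift(antecedent, consequent):
    """Confidence divided by the baseline frequency of the consequent.

    A value below 1 means the consequent appears less often alongside the
    antecedent than it does overall, however high the confidence is.
    """
    return confidence(antecedent, consequent) / support(consequent)


def jaccard_distance(herb_x, herb_y):
    """1 - |X and Y| / |X or Y| over the prescriptions containing each herb.

    Prescriptions containing neither herb are ignored, which suits sparse
    binary herb-by-prescription data.
    """
    with_x = {i for i, p in enumerate(prescriptions) if herb_x in p}
    with_y = {i for i, p in enumerate(prescriptions) if herb_y in p}
    union = with_x | with_y
    if not union:
        return 0.0
    return 1 - len(with_x & with_y) / len(union)


print("confidence A -> B:", confidence({"A"}, {"B"}))
print("lift A -> B:", lift({"A"}, {"B"}))
print("Jaccard distance A, B:", jaccard_distance("A", "B"))
```

In this toy data the rule A -> B has a confidence of about 0.75, yet its lift is below 1 because B is already present in 80% of prescriptions. That is the kind of case in which reporting confidence alone could mislead.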
Authors
但文超
赵国桢
何庆勇
张辉
李博
张广中
DAN Wen-chao; ZHAO Guo-zhen; HE Qing-yong; ZHANG Hui; LI Bo; ZHANG Guang-zhong (Department of Dermatology, Beijing Hospital of Traditional Chinese Medicine, Capital Medical University, Beijing 100010, China; Beijing Hospital of Traditional Chinese Medicine, Capital Medical University/Beijing Institute of Chinese Medicine/Beijing Center for Evidence-based Traditional Chinese Medicine, Beijing 100010, China; Department of Cardiovascular Medicine, Guang'anmen Hospital, China Academy of Chinese Medical Sciences, Beijing 100053, China)
Source
《中国中药杂志》
CAS
CSCD
Peking University Core Journals
2023, Issue 17, pp. 4812-4818 (7 pages)
China Journal of Chinese Materia Medica
Funding
National Natural Science Foundation of China, General Program (8227151381)
Integration of Chinese and Western Medicine Huanxin Program (中西融合·焕新计划项目) (HXJH2022-15).
Keywords
traditional Chinese medicine
data mining
statistical specifications