Abstract
Objective To classify the medicinal properties of traditional Chinese medicines (TCMs) from their chemical element content using several analytical methods, and to compare the classification accuracy of those methods.
Methods Target TCMs were taken from Quantitative and Applied Research of TCM Theory (《中药理论量化与应用研究》), and their chemical element data were extracted with Excel. In IBM SPSS Statistics 26, a two-independent-samples non-parametric test was run for each individual medicinal property against the chemical elements, and elements with a statistically significant association were taken as independent variables. Methods including binary logistic regression analysis, a decision tree algorithm, and an artificial neural network were then used to classify the dependent variable (the medicinal property). These methods yielded the related variables, classification accuracy, and model function coefficients for each medicinal property, and their classification performance was compared.
Results A preliminary element database covering 105 TCMs and 42 chemical elements was established. The medicinal properties of the TCMs were tabulated to obtain property variables for the four qi, the five flavors, and channel tropism. Non-parametric tests identified the elements related to each property: Be, Sr, Ca, and La for cold nature; Mn, Ni, K, Ca, V, Si, Co, and Zn for bitter flavor; and Ni, Bi, Co, Be, Eu, Ce, Nd, V, Pr, Sm, La, and Dy for the spleen meridian. For cold nature, bitter flavor, and the spleen meridian, respectively, the overall classification accuracies of binary logistic regression were 87.6%, 91.4%, and 81.4%. The decision tree model reached 77.8%, 87.7%, and 78.1% on the training set and 69.7%, 65.0%, and 62.5% on the test set. The artificial neural network reached 74.1%, 73.7%, and 74.0% on the training set and 54.5%, 72.4%, and 67.9% on the test set.
Conclusion With related factors identified by univariate analysis, the binary logistic regression, decision tree, and artificial neural network analyses indicate that the medicinal properties of TCM are associated to some degree with chemical elements. For both the decision tree and the neural network, training-set accuracy was higher than test-set accuracy. The decision tree's mean classification accuracy was higher than the neural network's on both the training set and the test set. Binary logistic regression gave higher accuracy than either the neural network or the decision tree, but it was fitted without separating training and test sets.
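The comparison drawn in the conclusion can be checked directly from the reported figures. The following is a minimal sketch, not the authors' analysis: it only averages the accuracies quoted in the abstract across the three properties. The names `reported`, `summarize`, and `summary` are illustrative assumptions. Binary logistic regression is left out because the abstract reports a single overall accuracy for it, with no training/test split.

```python
from statistics import mean

# Accuracies (%) quoted in the abstract, in the order:
# cold nature, bitter flavor, spleen meridian.
reported = {
    "decision tree": {"train": [77.8, 87.7, 78.1], "test": [69.7, 65.0, 62.5]},
    "neural network": {"train": [74.1, 73.7, 74.0], "test": [54.5, 72.4, 67.9]},
}


def summarize(acc):
    """Return mean training accuracy, mean test accuracy, and their gap."""
    train_mean = mean(acc["train"])
    test_mean = mean(acc["test"])
    return {
        "train": round(train_mean, 1),
        "test": round(test_mean, 1),
        "gap": round(train_mean - test_mean, 1),
    }


# Hypothetical summary table keyed by model name.
summary = {name: summarize(acc) for name, acc in reported.items()}

for name, s in summary.items():
    print(f"{name}: train {s['train']}%, test {s['test']}%, gap {s['gap']} points")
```

Under these figures the decision tree averages about 81.2% (training) and 65.7% (test), against about 73.9% and 64.9% for the neural network, which agrees with the stated conclusion. The training-to-test drop is larger for the decision tree (about 15.5 points versus 9.0).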
Authors
XU Qinyong (徐钦涌)
HUANG Zhibang (黄志帮)
YAO Simeng (姚思梦)
CHEN Yuanfang (陈远方)
NING Xiaoying (宁小英)
HOU Zhengkun (侯政昆)
CHEN Xinlin (陈新林)
Department of Traditional Chinese Medicine, Binhaiwan Central Hospital of Dongguan, Dongguan 523900, China
The First Clinical Medical School of Guangzhou University of Chinese Medicine, Guangzhou 510405, China
The First Affiliated Hospital of Shantou University Medical College, Jieyang Haoze Hospital, Jieyang 522021, China
Foshan Hospital of Traditional Chinese Medicine, Foshan 528099, China
The Third Affiliated Hospital of Southern Medical University, Guangzhou 510630, China
Department of Spleen and Stomach Diseases, the First Affiliated Hospital of Guangzhou University of Chinese Medicine, Guangzhou 510405, China
College of Basic Medical Sciences, Guangzhou University of Chinese Medicine, Guangzhou 510006, China
Source
《中草药》
CAS
CSCD
PKU Core (Peking University Core Journals)
2024, Issue 17, pp. 5964-5971 (8 pages)
Chinese Traditional and Herbal Drugs
Funding
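National Key Research and Development Program of China, project "Construction and demonstration of dynamic and static TCM clinical evaluation methodology based on syndrome element identification and the principle of measurable states" (2023YFC3503002)
Young and Middle-aged Backbone Talent Cultivation Project of the First Affiliated Hospital of Guangzhou University of Chinese Medicine (09005650008)
Guangdong Provincial Key Laboratory of TCM Informatization project (2021603)
Guangdong Provincial Department of Education university scientific research project (2020ZDZX3011)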
Keywords
quantification of medicinal properties of traditional Chinese medicine
chemical elements
data mining
statistical analysis
binary logistic regression analysis
decision tree algorithm
artificial neural network