Abstract
Objective To explore the construction and optimization of a model that discriminates the main meridian of medicinal materials based on the properties of their active components. Methods The meridian attributes and active components of medicinal materials were collected. Taking the active components of single-meridian medicinal materials as the basis, classification sets were constructed from the structural features and physicochemical properties of the active components, together with classification sets based on active-component frequency. The cosine similarity algorithm was applied to quantify the differences in active-component properties among medicinal materials of different meridians: the similarity between the set of active components contained in each medicinal material and each active-component category was calculated, and the resulting set similarity matrix served as the modeling dataset. Artificial intelligence algorithms were then used to build the main-meridian discrimination model, which was subsequently optimized and evaluated. Results A total of 209 medicinal materials were collected, and a dataset linking medicinal materials, meridian attributes, and active components was established. Classifying the active components of single-meridian medicinal materials yielded 34 active-component classification sets. Calculating the cosine similarity between the components of each medicinal material and each active-component category produced a set similarity matrix of dimensions 209×34. The K-nearest neighbor, random forest, and extreme gradient boosting algorithms were used to construct main-meridian discrimination models with hyperparameter tuning, and the models were compared on test-set metrics under their optimal parameters. The random forest algorithm performed best, with a balanced accuracy of 0.86 and an area under the curve (AUC) of 0.90. Conclusion The constructed main-meridian discrimination model achieved good results and ran stably. The study explored the theoretical basis of the meridian attributes of medicinal materials, offers a new research direction for revealing the relationship between meridian attributes and active components, and provides a new approach to meridian discrimination for medicinal materials.
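The similarity-matrix step and the balanced-accuracy metric named in the abstract can be illustrated with a minimal sketch. The abstract does not say how component sets are encoded, so this sketch assumes each set is a binary indicator vector over a shared component vocabulary, in which case the cosine similarity reduces to |A∩B| / sqrt(|A|·|B|). The function names, herb names, category names, and component identifiers are all hypothetical and are not taken from the study's data.

```python
import math


def cosine_set_similarity(a, b):
    """Cosine similarity of two sets treated as binary indicator vectors.

    Assumption: set membership is the only feature, so the dot product is
    the intersection size and each norm is the square root of the set size.
    """
    if not a or not b:
        return 0.0
    return len(a & b) / math.sqrt(len(a) * len(b))


def similarity_matrix(herbs, categories):
    """Build one row per herb and one column per category (sorted by name)."""
    names = sorted(categories)
    return [
        [cosine_set_similarity(components, categories[name]) for name in names]
        for components in herbs.values()
    ]


def balanced_accuracy(y_true, y_pred):
    """Balanced accuracy: the unweighted mean of per-class recall."""
    recalls = []
    for label in sorted(set(y_true)):
        idx = [i for i, y in enumerate(y_true) if y == label]
        hits = sum(1 for i in idx if y_pred[i] == label)
        recalls.append(hits / len(idx))
    return sum(recalls) / len(recalls)


# Hypothetical toy data: 2 herbs and 2 component categories.
herbs = {
    "herb_a": {"c1", "c2", "c3"},
    "herb_b": {"c3", "c4"},
}
categories = {
    "cat_1": {"c1", "c2"},
    "cat_2": {"c4", "c5", "c6"},
}

matrix = similarity_matrix(herbs, categories)
for name, row in zip(herbs, matrix):
    print(name, [round(value, 3) for value in row])
```

With 209 herbs and 34 categories, the same construction would give a 209×34 matrix whose rows could be passed as feature vectors to a classifier such as a random forest. Balanced accuracy is a natural metric when meridian classes are unevenly represented, because each class contributes equally regardless of its size.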
Authors
赵书言
王书睿
申镇华
肖珅
翟玉萱
姜希伟
ZHAO Shuyan; WANG Shurui; SHEN Zhenhua; XIAO Shen; ZHAI Yuxuan; JIANG Xiwei (School of Medical Equipment, Shenyang Pharmaceutical University, Shenyang 110016, China)
Source
《中草药》
CAS
CSCD
Peking University Core Journals
2024, Issue 16, pp. 5573-5582 (10 pages)
Chinese Traditional and Herbal Drugs
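Funding
Basic Scientific Research Project of the Education Department of Liaoning Province (LJKFZ20220258)
Basic Scientific Research Project of the Education Department of Liaoning Province (JYTQN2023337)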
Keywords
traditional Chinese medicine
meridian discrimination
meridian entry theory
machine learning
properties of traditional Chinese medicines