Abstract
Objective: To build a random forest model that accurately predicts the phlegm-dampness constitution in traditional Chinese medicine (TCM) and to identify the important features associated with it.
Methods: After data preprocessing, 2710 subjects were included, 50% of whom had a phlegm-dampness constitution. Recursive feature elimination with cross-validation (RFECV) was used for feature selection, and the selected feature subset was used to build a random forest model for identifying the phlegm-dampness constitution. Model performance was evaluated with six metrics (accuracy, precision, sensitivity, specificity, F1-score, and AUC) and compared with that of support vector machine (SVM) and logistic regression models.
Results: RFECV selected 16 features for building the prediction model. In the modeling group and the test group, respectively, the random forest model achieved an accuracy of 0.907 and 0.814, a precision of 0.936 and 0.827, a sensitivity of 0.885 and 0.806, a specificity of 0.932 and 0.822, an F1-score of 0.910 and 0.816, and an AUC of 0.970 and 0.901, all higher than those of the other two models.
Conclusion: The random forest model predicts the TCM phlegm-dampness constitution well. This study provides a methodological reference for building objective models that predict TCM constitution types.
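The workflow described in the abstract (RFECV feature selection, a random forest classifier, six evaluation metrics, and a comparison with SVM and logistic regression) can be sketched roughly as follows. This is not the authors' code: it assumes scikit-learn, uses synthetic balanced data from `make_classification` in place of the study's questionnaire data, and every hyperparameter (tree counts, fold count, elimination step, 70/30 split) is an illustrative guess. The helper names `six_metrics` and `run_sketch` are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFECV
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, confusion_matrix, f1_score,
                             precision_score, recall_score, roc_auc_score)
from sklearn.model_selection import StratifiedKFold, train_test_split
from sklearn.svm import SVC


def six_metrics(y_true, y_pred, y_score):
    """Return the six metrics named in the abstract for a binary task."""
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
    return {
        "accuracy": accuracy_score(y_true, y_pred),
        "precision": precision_score(y_true, y_pred),
        "sensitivity": recall_score(y_true, y_pred),  # true positive rate
        "specificity": tn / (tn + fp),                # true negative rate
        "f1": f1_score(y_true, y_pred),
        "auc": roc_auc_score(y_true, y_score),
    }


def run_sketch(n_samples=2710, n_features=20, seed=0):
    """Hypothetical end-to-end run on synthetic data (assumed settings)."""
    # Synthetic stand-in: balanced classes, mirroring the 50% prevalence.
    X, y = make_classification(
        n_samples=n_samples, n_features=n_features, n_informative=8,
        n_redundant=4, weights=[0.5, 0.5], random_state=seed)
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=seed)

    # RFECV: recursively drop the least important features and keep the
    # subset size with the best cross-validated accuracy.
    selector = RFECV(
        estimator=RandomForestClassifier(n_estimators=50, random_state=seed),
        step=2, cv=StratifiedKFold(3), scoring="accuracy")
    selector.fit(X_tr, y_tr)
    X_tr_sel = selector.transform(X_tr)
    X_te_sel = selector.transform(X_te)

    # Train the three model families on the same selected features.
    models = {
        "random_forest": RandomForestClassifier(n_estimators=100,
                                                random_state=seed),
        "svm": SVC(probability=True, random_state=seed),
        "logistic_regression": LogisticRegression(max_iter=1000),
    }
    results = {}
    for name, model in models.items():
        model.fit(X_tr_sel, y_tr)
        y_score = model.predict_proba(X_te_sel)[:, 1]
        results[name] = six_metrics(y_te, model.predict(X_te_sel), y_score)
    return int(selector.n_features_), results
```

Calling `run_sketch()` returns the number of features RFECV kept and a dictionary of held-out metrics per model. The numbers it produces come from synthetic data and are not comparable to the values reported in the abstract.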
Authors
罗悦
周娟
LUO Yue; ZHOU Juan (Chengdu University of Traditional Chinese Medicine, Chengdu 611137, China)
Source
《世界科学技术-中医药现代化》
CSCD
Peking University Core Journal
2024, Issue 7, pp. 1906-1915 (10 pages)
Modernization of Traditional Chinese Medicine and Materia Medica-World Science and Technology
Funding
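National Natural Science Foundation of China, Young Scientists Fund (81904324): Constructing a knowledge graph of the dynamic change patterns of TCM constitution using grey relational analysis based on dynamic data. Principal investigator: LUO Yue.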
Keywords
Random Forest
Phlegm-dampness Constitution
Predictive Models
Feature Selection