Abstract
Objective: To implement and compare five common machine learning algorithms for the classification and prediction of fatty liver. Methods: Principal component analysis was first used to reduce the dimensionality of the physical examination indicators. Decision tree, neural network, support vector machine, Bayesian network and random forest algorithms were then each used to build a fatty liver classification model, and the models were applied to 1956 physical examination records. Results: The decision tree model achieved the highest accuracy, 70.14%, followed by the support vector machine and neural network models at about 68%. Conclusion: The typical algorithms studied here show fairly reliable ability to classify and predict fatty liver, while the decision tree model showed an advantage on small-sample data. In addition, hip circumference (HIP) and triglyceride (TG) may be closely related to the classification of fatty liver.
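The workflow described in the Methods (dimensionality reduction by principal component analysis, then one classifier per algorithm, compared by accuracy) can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the data are synthetic, the number of components and the 5-fold cross-validation are arbitrary choices, and scikit-learn has no Bayesian network classifier, so Gaussian naive Bayes (the simplest Bayesian network structure) is used as a stand-in.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the examination indicators (assumption: the real
# records are not available here; sample and feature counts are arbitrary).
X, y = make_classification(
    n_samples=500, n_features=20, n_informative=6, random_state=0
)

# One estimator per algorithm named in the abstract. GaussianNB is a
# stand-in for the Bayesian network, which scikit-learn does not provide.
MODELS = {
    "decision tree": DecisionTreeClassifier(random_state=0),
    "neural network": MLPClassifier(max_iter=1000, random_state=0),
    "support vector machine": SVC(),
    "naive Bayes (stand-in)": GaussianNB(),
    "random forest": RandomForestClassifier(random_state=0),
}


def compare_models(X, y, n_components=5, cv=5):
    """Return the mean cross-validated accuracy of each model after PCA."""
    scores = {}
    for name, clf in MODELS.items():
        # Standardize, project onto the leading principal components, classify.
        pipe = make_pipeline(StandardScaler(), PCA(n_components=n_components), clf)
        fold_scores = cross_val_score(pipe, X, y, cv=cv, scoring="accuracy")
        scores[name] = fold_scores.mean()
    return scores


scores = compare_models(X, y)
for name, acc in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {acc:.3f}")
```

Placing the scaler and PCA inside the pipeline means they are refitted on the training folds only, so the accuracy estimate is not inflated by information from the held-out fold. The accuracies printed for synthetic data say nothing about the figures reported in the abstract.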
Authors
Yu Qiuyan (余秋燕), Science Teaching and Research Section of Pharmacy School, Shanghai University of Traditional Chinese Medicine, Shanghai 201203
Zhao Ying (赵莹)
Sun Jijia (孙继佳)
Shao Jianhua (邵建华)
Source
Journal of Mathematical Medicine (《数理医药学杂志》)
2019, Issue 1, pp. 1-3 (3 pages)
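Funding
Shanghai University of Traditional Chinese Medicine, 16th Round Course Construction General Project: Research and Practice on Integrating Mathematical Modeling Teaching into the Advanced Mathematics Course (SHUTCMKCJSYB2017005)
National Natural Science Foundation of China, General Program: Exploring the Syndrome-Related Substance Spectrum from Urinary Metabolites of Different Typical Syndromes of Non-Alcoholic Fatty Liver (No. 81473475)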
Keywords
fatty liver
principal component analysis
decision tree
neural network
support vector machine
Bayesian network
random forest