Abstract
To compare the strengths and weaknesses of different machine learning models on classification problems, this study takes the Wisconsin breast cancer dataset from the UCI Machine Learning Repository as its subject and builds breast cancer prediction models with a gradient boosting decision tree (GBDT), a multi-layer perceptron (MLP), and a support vector machine (SVM). The results show that all three models perform well on the cancer classification task. The MLP model achieves the best prediction accuracy and the strongest generalization, and its parameters are simpler to set. The GBDT model has more parameters and requires tuning. In future work, grid search can be used to tune the GBDT and MLP parameters, and these models can be applied to a wider range of classification problems.
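The comparison described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' code: it assumes scikit-learn is available and uses its bundled `load_breast_cancer` data (the Wisconsin Diagnostic variant), which may differ from the exact UCI file used in the paper. The 70/30 split, the feature standardization, the hidden-layer size, and the helper name `compare_models` are all assumptions made for the example.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC


def compare_models(random_state=0):
    """Fit GBDT, MLP and SVM on one train/test split; return test accuracies."""
    # Assumption: scikit-learn's bundled copy stands in for the UCI dataset.
    X, y = load_breast_cancer(return_X_y=True)

    # Assumption: a stratified 70/30 split; the paper's protocol is not given here.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=random_state
    )

    models = {
        # Tree ensembles do not need feature scaling.
        "GBDT": GradientBoostingClassifier(random_state=random_state),
        # MLP and SVM are scale-sensitive, so standardize inside a pipeline.
        "MLP": make_pipeline(
            StandardScaler(),
            MLPClassifier(
                hidden_layer_sizes=(32,),  # hypothetical architecture
                max_iter=1000,
                random_state=random_state,
            ),
        ),
        "SVM": make_pipeline(StandardScaler(), SVC(kernel="rbf")),
    }

    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        scores[name] = model.score(X_test, y_test)  # mean accuracy on test set
    return scores


if __name__ == "__main__":
    for name, acc in compare_models().items():
        print(f"{name}: test accuracy = {acc:.3f}")
```

A single split gives only a rough ranking; the grid search proposed for future work could be approximated by wrapping each estimator in scikit-learn's `GridSearchCV` with a small parameter grid and cross-validation.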
Authors
Wang Yalin (王亚林)
Chen Renren (陈忍忍)
Jiangsu Second Geological Engineering Survey Institute, Xuzhou 221000, China
Source
Heilongjiang Science (《黑龙江科学》)
2021, Issue 4, pp. 16-18, 22 (4 pages)
Keywords
Gradient boosting decision tree
Multi-layer perceptron
Support vector machine
Machine learning
Classification problem
Cancer