Abstract
To overcome the shortcomings of traditional machine learning algorithms and their frameworks, the K-means algorithm and the random forest classification algorithm are analyzed in depth, improved AKM and ARF algorithms are proposed, and AMLF, a machine learning application framework based on the Spark platform, is established. Verification results show that the classification accuracy of the AKM algorithm is close to 100% on every data set, indicating strong data clustering ability. In addition, the speedup ratio of the AKM algorithm is high on every data set, so it also scales well. The ARF verification results show that it likewise achieves high classification accuracy and strong scalability.
Author
查道贵
ZHA Daogui (Computer Information Department, Suzhou Vocational Technical College, Suzhou, Anhui 234101, China)
Source
《佳木斯大学学报(自然科学版)》
CAS
2022, No. 1, pp. 56-59 (4 pages)
Journal of Jiamusi University (Natural Science Edition)
Funding
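Key Natural Science Project of the Anhui Provincial Department of Education (KJ2019A1058)
Anhui Provincial Quality Engineering Teaching and Research Project (2018jxtd051)
Ministry of Education "Action Plan for Improving Quality and Cultivating Excellence in Vocational Education (2020-2023)": construction of the high-level specialty group "Big Data (Electronic Information) Specialty Group"
Ministry of Education "Action Plan for Improving Quality and Cultivating Excellence in Vocational Education (2020-2023)": quality online open course "Python Programming"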