
Design of Data Mining System Based on Cloud Computing (cited by: 28)
Abstract: To handle exponentially growing data volumes efficiently and to improve data storage and computing capacity, this paper proposes a design for a data mining system based on cloud computing. The paper first analyzes the components and operating mechanism of Spark, a mainstream cloud computing platform, and studies the programming principles of its computing architecture in depth. It then uses Spark to parallelize the C4.5 algorithm and the K-medoids clustering algorithm, which improves the running speed, convergence speed, and result stability of the algorithms. Tests show that, when analyzing and processing massive data sets, the proposed cloud computing platform improves the overall computing speed of the system while staying within the classification error, and that classification efficiency also improves substantially.
Author: LAN Jiman (蓝机满), Huizhou Engineering Vocational College, Huizhou 516001, China
Source: Electronic Science and Technology (《电子科技》), 2019, No. 8, pp. 70-74 (5 pages)
Keywords: cloud computing; data mining; Spark; C4.5 algorithm; K-medoids clustering algorithm
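
The abstract does not say how the C4.5 algorithm was parallelized. As a rough illustration only, the sketch below shows one common way the C4.5 split criterion (gain ratio) can be computed from per-partition counts that are merged afterwards, which is the map-then-reduce pattern Spark supports. It uses plain Python rather than Spark, and the function names `count_partition` and `gain_ratio` and the toy data are assumptions made for this example, not the author's implementation.

```python
from collections import Counter
from functools import reduce
from math import log2


def entropy(counts):
    """Shannon entropy (in bits) of a collection of non-negative counts."""
    counts = list(counts)
    total = sum(counts)
    return -sum(c / total * log2(c / total) for c in counts if c > 0)


def count_partition(rows):
    """Map step (hypothetical helper): count (attribute value, class label)
    pairs within a single data partition."""
    return Counter(rows)


def gain_ratio(partitions):
    """Reduce step plus the C4.5 gain ratio for one categorical attribute.

    `partitions` is a list of lists of (attribute value, class label) pairs,
    standing in for the partitions of a distributed data set.
    """
    # Merge per-partition counters; Counter addition sums matching keys.
    joint = reduce(lambda a, b: a + b, map(count_partition, partitions), Counter())
    total = sum(joint.values())

    by_value = Counter()  # number of rows per attribute value
    by_label = Counter()  # number of rows per class label
    for (value, label), n in joint.items():
        by_value[value] += n
        by_label[label] += n

    # Entropy of the class label that remains after splitting on the attribute.
    conditional = 0.0
    for value, n_value in by_value.items():
        label_counts = [n for (v, _), n in joint.items() if v == value]
        conditional += n_value / total * entropy(label_counts)

    info_gain = entropy(by_label.values()) - conditional
    split_info = entropy(by_value.values())
    # An attribute with a single value carries no split information.
    return info_gain / split_info if split_info > 0 else 0.0


# Toy data: the attribute separates the two classes perfectly.
partitions = [
    [("sunny", "no"), ("sunny", "no"), ("rain", "yes")],
    [("rain", "yes")],
]
ratio = gain_ratio(partitions)
```

Because only counts cross partition boundaries, the result does not depend on how the rows are distributed, which is what makes this criterion straightforward to evaluate in parallel.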