Abstract
This paper describes the design and implementation of a high-performance cloud for archiving, analyzing, and mining large distributed data sets. By a cloud, we mean an infrastructure that provides resources and/or services over the Internet. A storage cloud provides storage services, while a compute cloud provides compute services. High-performance cloud services can reasonably be regarded as an intermediate step in high-performance data mining over large-scale data, while the effectiveness and efficiency of these services themselves remain a primary, self-contained goal. The paper proposes an algorithm for mining data in the cloud using the Sector/Sphere framework and association rules, and describes the programming paradigm supported by the Sphere compute cloud and association rules.
Authors
昂朝群
胡炜
胡冉
ANG Chaoqun, HU Wei, HU Ran (Department of Management and Engineering, Naval University of Engineering, Wuhan 430033; No. 91919 Troops of PLA, Huanggang 438000)
Source
《计算机与数字工程》
2017, No. 9, pp. 1724-1730 (7 pages)
Computer & Digital Engineering