Abstract
Data mining is the discovery of relationships and characteristics that exist implicitly in databases, and cluster analysis is one of its tasks. In this paper, we select three sequential clustering methods and develop the corresponding parallel algorithms. We then discuss the performance of these parallel algorithms and present some experimental results. Finally, a new parallel algorithm is proposed and is shown to be the most efficient of the parallel clustering algorithms compared.
Source
《南开大学学报(自然科学版)》
CAS
CSCD
Peking University Core Journals
2008, Issue 4, pp. 106-112 (7 pages)
Acta Scientiarum Naturalium Universitatis Nankaiensis
Funding
Tianjin funds (033800711, 04310761R)
Tianjin Municipal Information Office: High Performance Computation Project (051027014)
Keywords
data mining
clustering algorithms
parallel clustering algorithms
partitioning methods