xk-split: A Split Clustering Algorithm Based on k-medoids (cited by: 2)
Abstract In recent years the volume of Internet data has grown explosively, making big data analysis a popular research topic. However, collected data can rarely be analyzed directly; it must first be preprocessed to improve its quality. This work uses a split (divisive) iterative process that progressively partitions the data set into subsets. This removes the requirement of traditional clustering algorithms that the number of clusters be fixed before clustering starts, and it reduces the time complexity of the algorithm. In addition, threshold-based filtering removes noisy data during the iterations, which improves the clustering algorithm's tolerance of dirty data.
Source Journal of East China University of Science and Technology (Natural Science Edition), CSCD, Peking University Core Journal, 2017, No. 6, pp. 849-854, 862 (7 pages)
Keywords data mining; clustering; k-means; k-medoids; split
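
The abstract describes the method only at a high level, so the following is a minimal sketch of a divisive k-medoids clustering with threshold-based noise filtering, not the authors' implementation. Everything specific is an assumption made for illustration: each subset is bisected with a simple 2-medoids step, splitting stops once every point lies within `max_radius` of the subset's medoid, and any subset with fewer than `min_size` points is treated as noise. The names `medoid`, `split_in_two`, `xk_split_sketch`, `max_radius`, and `min_size` are hypothetical and do not come from the paper.

```python
import math


def medoid(points):
    """Return the point with the smallest total distance to all others."""
    return min(points, key=lambda c: sum(math.dist(c, q) for q in points))


def split_in_two(points, max_iter=20):
    """Bisect points with an alternating 2-medoids step (assumed split rule)."""
    first = medoid(points)
    # Seed the second medoid with the point farthest from the first.
    second = max(points, key=lambda p: math.dist(p, first))
    medoids = [first, second]
    groups = [list(points), []]
    for _ in range(max_iter):
        groups = [[], []]
        for p in points:
            d0 = math.dist(p, medoids[0])
            d1 = math.dist(p, medoids[1])
            groups[0 if d0 <= d1 else 1].append(p)
        if not groups[0] or not groups[1]:
            break  # degenerate split, e.g. all points identical
        updated = [medoid(g) for g in groups]
        if updated == medoids:
            break  # assignments have converged
        medoids = updated
    return groups


def xk_split_sketch(points, max_radius, min_size):
    """Split subsets until they are compact; small subsets become noise."""
    clusters, noise = [], []
    stack = [list(points)]
    while stack:
        subset = stack.pop()
        if len(subset) < min_size:
            noise.extend(subset)  # assumed threshold-based noise filter
            continue
        center = medoid(subset)
        radius = max(math.dist(center, p) for p in subset)
        if radius <= max_radius:
            clusters.append(subset)  # compact enough: stop splitting
            continue
        left, right = split_in_two(subset)
        if not left or not right:
            clusters.append(subset)  # cannot be split further
            continue
        stack.extend([left, right])
    return clusters, noise


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (1, 1),
            (10, 10), (10, 11), (11, 10), (11, 11),
            (50, 50)]
    found, dropped = xk_split_sketch(data, max_radius=2.0, min_size=2)
    print(len(found), "clusters; noise:", dropped)
```

Note that the number of clusters is never passed in: it falls out of the stopping rule, which is the property the abstract highlights. The cost of this sketch is dominated by the quadratic medoid search, so it illustrates the control flow rather than the complexity reduction reported in the paper.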