Clustering Improved Algorithm Based on Distance-Relatedness Dynamic Model (基于距离关联性动态模型的聚类改进算法)

Abstract: Most clustering algorithms cannot efficiently discover clusters of arbitrary shape and varying density. To address this, the paper proposes an efficient improved clustering algorithm based on the distance-relatedness dynamic model. First, to improve clustering efficiency, a hierarchical clustering algorithm produces an initial clustering of the data set, and clusters containing too few sample points are discarded. Second, to discover clusters of arbitrary shape and varying density, the centroid of each initial cluster serves as the representative point for all points in that cluster; the distance-relatedness dynamic model is then run on these representatives, and the tree structure of the hierarchical clustering is used to prune the computation. Finally, the effectiveness of the algorithm is verified. Experiments on the Chameleon dataset show that the algorithm identifies clusters of arbitrary shape and varying density, and that its time efficiency is significantly better than that of comparable algorithms.
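The two-stage pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not specify the linkage used for the initial clustering, the threshold for discarding small clusters, or the details of the distance-relatedness dynamic model, so everything here is an assumption: the function names, the parameters `merge_radius`, `min_size`, and `link_radius`, the naive centroid-linkage first stage, and above all the second stage, which is a plain distance threshold on centroids standing in for the distance-relatedness dynamic model. The dendrogram-based pruning is omitted.

```python
from math import dist  # Python 3.8+


def centroid(points):
    """Coordinate-wise mean of a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(axis) / n for axis in zip(*points))


def initial_clusters(points, merge_radius):
    """Stage 1 (assumed form): naive centroid-linkage agglomeration.

    Merges the two clusters with the closest centroids until the
    closest pair is farther apart than merge_radius.
    """
    clusters = [[p] for p in points]
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = dist(centroid(clusters[i]), centroid(clusters[j]))
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        if d > merge_radius:
            break
        clusters[i].extend(clusters.pop(j))  # j > i, so index i stays valid
    return clusters


def drop_small(clusters, min_size):
    """Discard initial clusters that contain too few sample points."""
    return [c for c in clusters if len(c) >= min_size]


def merge_representatives(clusters, link_radius):
    """Stage 2 stand-in: NOT the distance-relatedness dynamic model.

    Each cluster is represented by its centroid; clusters whose
    centroids lie within link_radius are joined (connected components).
    """
    reps = [centroid(c) for c in clusters]
    parent = list(range(len(reps)))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i in range(len(reps)):
        for j in range(i + 1, len(reps)):
            if dist(reps[i], reps[j]) <= link_radius:
                parent[find(j)] = find(i)

    merged = {}
    for idx, members in enumerate(clusters):
        merged.setdefault(find(idx), []).extend(members)
    return list(merged.values())


def two_stage_clustering(points, merge_radius, min_size, link_radius):
    """Initial clustering, small-cluster removal, then representative merging."""
    kept = drop_small(initial_clusters(points, merge_radius), min_size)
    return merge_representatives(kept, link_radius)
```

Because the second stage works on one centroid per initial cluster rather than on every sample point, the pairwise comparisons in that stage scale with the number of initial clusters, which is the source of the efficiency gain the abstract refers to.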

Authors: Chen Xiongtao (陈雄韬), Yan Qiuyan (闫秋艳)

Affiliation: School of Computer Science and Technology, China University of Mining and Technology

Source: Journal of Frontiers of Computer Science and Technology (《计算机科学与探索》), indexed in CSCD and the Peking University Core Journals list, 2016, No. 2, pp. 248-256 (9 pages)

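Funding: Natural Science Foundation of Jiangsu Province (No. BK20140192); Youth Science and Technology Fund of China University of Mining and Technology (No. 2013QNB16)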

Keywords: clustering; arbitrary shaped clusters; different density clusters; distance-relatedness dynamic model

Classification code: TP301.6 [Automation and Computer Technology: Computer System Architecture]
