A Dynamic Cluster Core Algorithm for Automatic Clustering Based on tNN-MEANS

Original Chinese title: 一种动态获取簇核心的自动聚类tNN-MEANS算法


Abstract: K-means is an important clustering algorithm, but its clustering quality depends heavily on the number of clusters and on the positions of the initial centers [1]. This paper proposes tNN-MEANS, an algorithm based on optimizing the initial center set and moving the cluster centers. The algorithm addresses three problems: 1) accurately determining the number of clusters in large-scale data sets; 2) precisely locating the globally high-density core regions; 3) handling clusters that contain multiple high-density regions. Comparative experiments on UCI data sets against X-means and DBSCAN show that tNN-MEANS outperforms both in clustering accuracy, in determining the number of clusters, and in the correctness of cluster partitioning.
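The sensitivity to initial centers that motivates the paper can be illustrated with a minimal sketch of plain K-means (Lloyd's iteration) on one-dimensional data. This is not the tNN-MEANS algorithm, whose details are not given in this record; the function name `kmeans_1d`, the toy data, and both sets of initial centers are assumptions made for illustration only.

```python
def kmeans_1d(points, init_centers, max_iter=100):
    """Plain Lloyd-style K-means on 1-D data (illustrative, not tNN-MEANS)."""
    centers = list(init_centers)
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its points;
        # an empty cluster keeps its previous center.
        new_centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
        if new_centers == centers:  # converged
            break
        centers = new_centers
    # Sum of squared distances from each point to its nearest center.
    sse = sum(min((p - c) ** 2 for c in centers) for p in points)
    return centers, sse


# Hypothetical toy data: three well-separated groups.
points = [0.0, 1.0, 10.0, 11.0, 20.0, 21.0]

# One initial center per group: converges to the natural partition.
good_centers, good_sse = kmeans_1d(points, [0.0, 10.0, 20.0])

# Two initial centers inside the first group: converges to a poor local optimum
# in which the second and third groups are merged.
bad_centers, bad_sse = kmeans_1d(points, [0.0, 1.0, 10.0])

print(good_centers, good_sse)  # [0.5, 10.5, 20.5] 1.5
print(bad_centers, bad_sse)    # [0.0, 1.0, 15.5] 101.0
```

With the same data and the same number of clusters, the two runs end at different partitions with very different errors, which is the dependence on initial centers that the abstract describes.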

Authors: Guo Song (郭颂), Huang Jun (黄俊), Guo Lixin (郭立新)

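Affiliations: College of Computer and Information Technology, Xinyang Normal University (信阳师范学院); General Archives Office, Xinyang Normal University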

Source: Mathematics in Practice and Theory (《数学的实践与认识》), indexed in CSCD and the Peking University Core Journals list, 2013, No. 13, pp. 174-181 (8 pages)

Keywords: automatic clustering; cluster core; tNN-MEANS algorithm

Classification number: TP311.13 [Automation and Computer Technology: Computer Software and Theory]


References (6)

[1] Huang Z, Ng M, Lin T, Cheung D. An interactive approach to building classification models by clustering and cluster validation[J]. Intelligent Data Engineering and Automated Learning (IDEAL), 2000: 23-28.
[2] Li M, Ng M, Cheung Y, Huang J. Agglomerative fuzzy k-means clustering algorithm with selection of number of clusters[J]. IEEE Transactions on Knowledge and Data Engineering (TKDE), 2008, 20(11): 1519-1534.
[3] Li X, Ye Y, Li M, Ng M. On cluster tree for nested and multi-density data clustering[J]. Pattern Recognition, 2010, 43(9): 3130-3143.
[4] Li Y, Hung E, Chung K, Huang J. Building a decision cluster classification model for high dimensional data by a variable weighting k-means method[C]// AI 2008: Advances in Artificial Intelligence, 2008: 337-347.
[5] Yin Guang, Zhu Yuquan, Chen Geng. A new classifier selection ensemble algorithm[J]. Computer Engineering (计算机工程), 2012, 38(8): 167-169.
[6] Sun H, Wang S, Jiang Q. FCM-based model selection algorithms for determining the number of clusters[J]. Pattern Recognition, 2004, 37(10): 2027-2037.


