Approach to selecting initial centers for K-Means with variable threshold (可变阈值的K-Means初始中心选择方法)

Cited by: 8

Abstract: The K-Means algorithm selects its initial cluster centers at random, which makes the performance of the resulting clusterer unstable. To address this limitation, this paper proposes VTK-Means, a method that selects initial cluster centers based on a variable threshold. The algorithm chooses as initial cluster centers the samples whose distances to the already selected initial points are greater than a threshold, and then adjusts the threshold according to the number of initial centers that satisfy this condition. Experimental results on 10 UCI machine learning data sets show that the algorithm clearly outperforms the standard K-Means algorithm and yields better stability.
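The selection step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the first point is chosen or how the threshold is adjusted, so the sketch assumes that the first sample seeds the set and that the threshold is multiplied by a hypothetical `shrink` factor whenever a full pass finds fewer than `k` centers. The names `select_initial_centers`, `shrink`, and `max_rounds` are illustrative only.

```python
import math


def select_initial_centers(points, k, threshold, shrink=0.8, max_rounds=50):
    """Pick k initial centers that lie farther apart than a threshold.

    Assumptions (not taken from the paper): the first sample seeds the
    set of centers, and the threshold is multiplied by `shrink` whenever
    a full pass over the data finds fewer than k centers.
    Returns the chosen centers and the threshold that produced them.
    """
    if k < 1 or k > len(points):
        raise ValueError("k must be between 1 and len(points)")
    for _ in range(max_rounds):
        centers = [points[0]]
        for p in points[1:]:
            if len(centers) == k:
                break
            # Accept p only if it is farther than the threshold
            # from every center selected so far.
            if all(math.dist(p, c) > threshold for c in centers):
                centers.append(p)
        if len(centers) == k:
            return centers, threshold
        # Too few candidates met the condition: relax the threshold.
        threshold *= shrink
    raise RuntimeError("could not find k sufficiently separated centers")


if __name__ == "__main__":
    data = [(0, 0), (0.1, 0), (5, 5), (5.1, 5), (10, 0)]
    centers, used = select_initial_centers(data, k=3, threshold=20.0)
    print(centers, used)
```

The returned centers would then be passed to an ordinary K-Means run in place of random initialization. Because the selection is deterministic for a given data order and starting threshold, repeated runs start from the same centers, which is the source of the stability the abstract refers to.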

Authors: Liu Yiming (刘一鸣), Zhang Huaxiang (张化祥)

Affiliation: School of Information Science and Engineering, Shandong Normal University

Source: Computer Engineering and Applications (《计算机工程与应用》), CSCD, Peking University Core Journal, 2011, No. 32, pp. 56-58 (3 pages)

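Funding: Shandong Province Science and Technology Research Program (No. 2007ZZ17, No. 2008GG10001015, No. 2008B0026); Shandong Provincial Department of Education Research Project (No. J09LG02)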

Keywords: K-Means; clustering; variable threshold; initial cluster center

Classification code: TP391.4 [Automation and Computer Technology: Computer Application Technology]


