
Research of Clustering Algorithm Based on K-means (基于k-means聚类算法的研究) Cited by: 87
Abstract: This paper surveys cluster analysis methods, compares several clustering algorithms, and discusses their respective strengths and weaknesses. The original k-means algorithm has a well-known drawback: its clustering result depends heavily on the randomly selected initial cluster centers. To address this, an improved algorithm is proposed. The data set is sampled multiple times and the best of the resulting initial cluster centers is selected, which greatly reduces the algorithm's sensitivity to the choice of initial centers. In addition, once the initial cluster centers have been selected, the initial values are standardized, which further improves clustering quality. The new algorithm, Hk-means, is evaluated on data sets from the UCI repository. The results show that Hk-means achieves significantly better clustering quality than the original k-means algorithm, and the approach may serve as a reference for related fields.
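The abstract gives no pseudocode, so the following is only a minimal sketch of the general idea it describes (sampling several times, keeping the best initial centers, and standardizing the data), not the paper's Hk-means. All function names, the parameters `n_samples` and `sample_frac`, the use of the sum of squared errors to rank candidates, and the choice to z-score the features before sampling are assumptions made for illustration.

```python
import random
import statistics


def standardize(points):
    """Z-score each feature column; constant columns map to 0 (assumed convention)."""
    cols = list(zip(*points))
    means = [statistics.fmean(c) for c in cols]
    stds = [statistics.pstdev(c) or 1.0 for c in cols]
    return [tuple((x - m) / s for x, m, s in zip(p, means, stds)) for p in points]


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(points, centers):
    """Index of the nearest center for every point."""
    return [min(range(len(centers)), key=lambda j: sq_dist(p, centers[j]))
            for p in points]


def sse(points, centers):
    """Sum of squared distances from each point to its nearest center."""
    return sum(min(sq_dist(p, c) for c in centers) for p in points)


def lloyd(points, centers, iters=100):
    """Plain k-means (Lloyd) iterations starting from the given centers."""
    centers = list(centers)
    for _ in range(iters):
        labels = assign(points, centers)
        new_centers = []
        for j, old in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == j]
            # Keep the old center if a cluster becomes empty.
            new_centers.append(
                tuple(statistics.fmean(c) for c in zip(*members)) if members else old
            )
        if new_centers == centers:
            break
        centers = new_centers
    return centers


def sampled_init_kmeans(points, k, n_samples=5, sample_frac=0.5, seed=0):
    """Hypothetical helper: pick initial centers from several subsamples."""
    rng = random.Random(seed)
    data = standardize(points)
    size = max(k, int(len(data) * sample_frac))
    best = None
    for _ in range(n_samples):
        sample = rng.sample(data, size)
        candidate = lloyd(sample, rng.sample(sample, k))
        score = sse(data, candidate)  # rank each candidate on the full data set
        if best is None or score < best[0]:
            best = (score, candidate)
    centers = lloyd(data, best[1])  # final run starts from the best candidate
    return centers, assign(data, centers)


if __name__ == "__main__":
    pts = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
    centers, labels = sampled_init_kmeans(pts, k=2)
    print(labels)
```

Ranking candidates on the full data set rather than on the subsample is one possible design choice: it costs one extra pass over the data per candidate but avoids favoring centers that only fit a lucky subsample.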
Source: Computer Technology and Development (《计算机技术与发展》), 2011, No. 7, pp. 54-57, 62 (5 pages)
Funding: Harbin Reserve Leader Fund Project (2004AFXXJ039)
Keywords: data mining; clustering algorithm; k-means algorithm

