
Improved clustering algorithm for mixture data sets (cited by: 8)
Abstract: k-prototypes is currently the main clustering algorithm for data with mixed numeric and categorical attributes, and it is widely used across application domains. However, it selects the initial points of its clusters at random, so its output depends strongly on the initial values: different initial points often lead to considerably different clustering results. This paper analyzes the random initial-value selection of k-prototypes and proposes an improved method that searches for initial starting points by grouping the data set. The method is more stable and scalable, and reduces randomness to some extent. Experiments on real data sets show that the improved algorithm is correct and effective.
Source: Computer Engineering and Design (《计算机工程与设计》), indexed in CSCD and the Peking University Core Journals list, 2007, No. 20, pp. 4850-4852 (3 pages)
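Funding: National Natural Science Foundation of China (70171033); Jiangsu Provincial Natural Science Basic Research Fund for Universities (07KJ520216); Jiangsu Provincial Key Laboratory of Computer Processing Technology Fund (X2100112049811); Xuzhou Normal University Youth Research Fund (03X1B18)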
Keywords: data mining; clustering; k-prototypes; mixture data; dissimilarity
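
The initialization sensitivity described in the abstract can be illustrated with a minimal sketch. It assumes the commonly cited k-prototypes dissimilarity (squared Euclidean distance on numeric attributes plus a weight `gamma` times the number of mismatched categorical attributes). The function names, the toy data, and the value of `gamma` are hypothetical, and only a single assignment step is shown. This is not the improved initialization method proposed in the paper, which the abstract does not specify in detail.

```python
def mixed_dissimilarity(x_num, x_cat, p_num, p_cat, gamma=1.0):
    """Dissimilarity between an object and a prototype on mixed attributes.

    Numeric part: squared Euclidean distance.
    Categorical part: number of mismatched attributes, weighted by gamma
    (gamma is an assumed tuning parameter).
    """
    numeric_part = sum((a - b) ** 2 for a, b in zip(x_num, p_num))
    categorical_part = sum(1 for a, b in zip(x_cat, p_cat) if a != b)
    return numeric_part + gamma * categorical_part


def assign(data, prototypes, gamma=1.0):
    """Assign each (numeric, categorical) object to its nearest prototype."""
    labels = []
    for x_num, x_cat in data:
        dists = [
            mixed_dissimilarity(x_num, x_cat, p_num, p_cat, gamma)
            for p_num, p_cat in prototypes
        ]
        labels.append(dists.index(min(dists)))
    return labels


# Toy data set (hypothetical): one numeric and one categorical attribute.
data = [
    ((1.0,), ("a",)),
    ((1.2,), ("a",)),
    ((5.0,), ("b",)),
    ((5.3,), ("b",)),
]

# Two different choices of initial prototypes taken from the data.
init_spread = [data[0], data[2]]   # one prototype in each natural group
init_close = [data[0], data[1]]    # both prototypes in the same group

labels_spread = assign(data, init_spread)
labels_close = assign(data, init_close)
print(labels_spread, labels_close)
```

In this toy case the two initializations produce different first-step partitions, which is the kind of dependence on initial points that motivates a more careful choice of starting prototypes.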

References (8)

  • 1 Han Jiawei, Kamber M. Data Mining: Concepts and Techniques[M]. Beijing: China Machine Press, 2001.
  • 2 Huang Zhexue. Extensions to the k-means algorithm for clustering large data sets with categorical values[J]. Data Mining and Knowledge Discovery, 1998(2): 283-304.
  • 3 Daniel Barbara. Using self-similarity to cluster large data sets[J]. Data Mining and Knowledge Discovery, 2003(7): 123-152.
  • 4 Dharmendra S Modha, Scott Spangler W. Feature weighting k-means clustering[J]. Machine Learning, 2003, 52(3): 217-237.
  • 5 Sun Y, Zhu Q, Chen Z. An iterative initial-points refinement algorithm for categorical data clustering[J]. Pattern Recognition Letters, 2002, 23(7): 875-884.
  • 6 Gan G, Yang Z, Wu J. A genetic k-modes algorithm for clustering categorical data[C]. Wuhan: Proc of ADMA'05, 2005: 195-202.
  • 7 Zhao Lijiang. Research on clustering algorithms for data sets with mixed numeric and categorical attributes[D]. Hangzhou: Zhejiang University, Master's thesis, 2005.
  • 8 Blake C, Merz J. UCI repository of machine learning databases[EB/OL]. http://www.ics.uci.edu/~mlearn/MLRepository.html.
