
Weighted Cluster Ensemble Based on Significance of Attribute (基于属性重要性的加权聚类融合)

Cited by: 12
Abstract: Cluster ensemble is an active topic in data mining research. Most existing work does not consider the quality of the cluster members being combined, so poor members and noise can degrade the ensemble result. This paper proposes an ensemble method that weights the cluster members. The method introduces the attribute-significance measure from rough set theory, assigns each cluster member a weight according to its significance to the ensemble, builds a weighted co-association matrix, and derives the ensemble result from it. Experimental results show that the proposed method handles quality differences among cluster members well and effectively reduces the influence of noise, yielding better ensemble results than general cluster ensemble methods.
Source: Computer Science (《计算机科学》), CSCD, Peking University Core Journal (北大核心), 2009, No. 4, pp. 243-245, 249 (4 pages)
Keywords: cluster ensemble; co-association matrix; measure of significance of attribute
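
The weighted co-association matrix named in the abstract can be illustrated with a minimal sketch. The abstract does not give the formula that turns rough-set attribute significance into member weights, so the weights below are simply supplied by the caller and are hypothetical; the function name `weighted_co_association` and the toy partitions are likewise assumptions made for illustration, not the authors' implementation. Each entry (i, j) is the weighted fraction of cluster members that place objects i and j in the same cluster.

```python
import numpy as np

def weighted_co_association(partitions, weights):
    """Build a weighted co-association matrix.

    partitions: one label vector per cluster member, shape (members, objects).
    weights:    one non-negative weight per member (hypothetical values here;
                the paper derives them from attribute significance).
    """
    partitions = np.asarray(partitions)
    weights = np.asarray(weights, dtype=float)
    if partitions.ndim != 2 or weights.shape != (partitions.shape[0],):
        raise ValueError("need exactly one weight per cluster member")
    if np.any(weights < 0) or weights.sum() == 0:
        raise ValueError("weights must be non-negative and not all zero")

    weights = weights / weights.sum()  # normalise so entries lie in [0, 1]
    n_objects = partitions.shape[1]
    matrix = np.zeros((n_objects, n_objects))
    for labels, w in zip(partitions, weights):
        # True where two objects share a cluster in this member.
        same_cluster = labels[:, None] == labels[None, :]
        matrix += w * same_cluster
    return matrix

# Toy ensemble: two members agree (up to label names), one disagrees.
members = [
    [0, 0, 1, 1],
    [1, 1, 0, 0],
    [0, 1, 0, 1],
]
weights = [0.45, 0.45, 0.10]  # assumed weights; the third member is down-weighted
co = weighted_co_association(members, weights)
print(np.round(co, 2))
```

With these assumed weights, pairs grouped together by the two agreeing members keep a high co-association value, while pairs grouped only by the down-weighted member stay close to zero. A final partition could then be obtained by clustering on this matrix, for example with hierarchical clustering on `1 - co`; the abstract does not say which consensus step the paper uses.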


