
Clustering Ensemble with High Diversity Based on Adding Artificial Data
Abstract: Ensemble diversity is regarded as a key factor in ensemble learning. Many methods exist for generating clustering ensembles, but few focus specifically on producing ensembles with high diversity. This paper proposes two such methods, CEAN (constructing clustering ensemble by adding noise) and ICEAN (improved CEAN), which increase ensemble diversity by introducing artificial data into the algorithm. Experiments compare CEAN and ICEAN with commonly used ensemble generation methods from the literature. The results show that CEAN and ICEAN do increase the diversity of the generated ensembles and therefore yield better clustering ensemble results at a comparable average ensemble member accuracy.
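Source: Pattern Recognition and Artificial Intelligence (《模式识别与人工智能》), indexed in EI, CSCD, and the Peking University Core Journals list, 2008, No. 5, pp. 682-688 (7 pages)
Funding: Science and Technology Program of the Education Department of Jiangxi Province (No. 教技字[2007]208号 GJJ08285)
Keywords: Clustering Ensemble, Ensemble Diversity, Artificial Data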
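
The abstract does not give the steps of CEAN or ICEAN, so the following is only a minimal sketch of the general idea it describes: cluster the real data together with some artificial points, keep the labels of the real points, and repeat to build an ensemble. Everything specific here is an assumption made for illustration, including the function names, the uniform sampling of artificial points inside the bounding box of the data, the use of k-means as the base clusterer, and the diversity score (one minus the mean pairwise Rand index). None of it should be read as the authors' algorithm or their diversity measure.

```python
import random
from itertools import combinations


def kmeans(points, k, rng, iters=20):
    """Plain Lloyd's k-means; returns one cluster label per point."""
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster went empty
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels


def member_with_artificial_data(data, k, noise_ratio, rng):
    """Hypothetical helper: cluster real plus artificial points,
    then return labels for the real points only."""
    dims = list(zip(*data))
    lo = [min(d) for d in dims]
    hi = [max(d) for d in dims]
    n_fake = int(noise_ratio * len(data))
    # Assumption: artificial points are uniform in the data's bounding box.
    fake = [tuple(rng.uniform(a, b) for a, b in zip(lo, hi))
            for _ in range(n_fake)]
    labels = kmeans(data + fake, k, rng)
    return labels[:len(data)]


def rand_index(a, b):
    """Fraction of point pairs on which two partitions agree."""
    pairs = list(combinations(range(len(a)), 2))
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)


def ensemble_diversity(ensemble):
    """Assumed diversity score: 1 minus the mean pairwise Rand index."""
    scores = [rand_index(a, b) for a, b in combinations(ensemble, 2)]
    return 1.0 - sum(scores) / len(scores)


# Toy usage: two Gaussian blobs, five ensemble members.
rng = random.Random(0)
data = [(rng.gauss(cx, 0.5), rng.gauss(cy, 0.5))
        for cx, cy in [(0, 0), (4, 4)] for _ in range(30)]
ensemble = [member_with_artificial_data(data, 2, 0.3, rng) for _ in range(5)]
print(ensemble_diversity(ensemble))
```

The score lies between 0 and 1, with 0 meaning all members induce the same partition. Whether artificial points actually raise diversity depends on the data and on how the points are drawn, which is the question the paper's experiments address.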

References (16)

  • 1. Topchy A P, Jain A K, Punch W F. Combining Multiple Weak Clusterings//Proc of the 3rd IEEE International Conference on Data Mining. Melbourne, USA, 2003: 331-338
  • 2. Fred A L N. Finding Consistent Clusters in Data Partitions//Proc of the 2nd International Workshop on Multiple Classifier Systems. Cambridge, UK, 2001: 309-318
  • 3. Fred A L N, Jain A K. Data Clustering Using Evidence Accumulation//Proc of the 16th International Conference on Pattern Recognition. Quebec, Canada, 2002, IV: 276-280
  • 4. Strehl A, Ghosh J. Cluster Ensembles - A Knowledge Reuse Framework for Combining Multiple Partitions. Journal of Machine Learning Research, 2002, 3(3): 583-617
  • 5. Dudoit S, Fridlyand J. Bagging to Improve the Accuracy of a Clustering Procedure. Bioinformatics, 2003, 19(9): 1090-1099
  • 6. Law M H C, Topchy A P, Jain A K. Multiobjective Data Clustering//Proc of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Washington, USA, 2004, II: 424-430
  • 7. Fern X Z, Brodley C E. Random Projection for High Dimensional Data Clustering: A Cluster Ensemble Approach//Proc of the 20th International Conference on Machine Learning. Washington, USA, 2003: 186-193
  • 8. Topchy A, Jain A K, Punch W. Clustering Ensembles: Models of Consensus and Weak Partitions. IEEE Trans on Pattern Analysis and Machine Intelligence, 2005, 27(12): 1866-1881
  • 9. Frossyniotis D, Likas A, Stafylopatis A. A Clustering Method Based on Boosting. Pattern Recognition Letters, 2004, 25(6): 641-654
  • 10. Tang W, Zhou Z H. Bagging-Based Selective Clusterer Ensemble. Journal of Software, 2005, 16(4): 496-502

