Semi-supervised constraint ensemble clustering by fast search and find of density peaks

Cited by: 23
Abstract: Aiming at the weaknesses of clustering by fast search and find of density peaks (CFDP), published in Science in 2014, namely the mis-selection and omission of cluster centers during automatic center selection, the subjective prior judgment required for the number of clusters, and the limited applicability in some scenarios, a semi-supervised constraint ensemble clustering by fast search and find of density peaks (SiCE-CFDP) algorithm was proposed. SiCE-CFDP measures node density with a relative density, analyzes the decision graph from multiple perspectives to select candidate cluster centers, and ultimately determines the number of clusters automatically. Given only a limited amount of labeled constraint information, the algorithm enlarges the constraint set under the guidance of ensemble learning to improve clustering performance. Experiments were conducted on three synthetic datasets, four public datasets and one air-conditioning-system simulation dataset. The results show that, under the same amount of constraints, SiCE-CFDP achieves higher clustering accuracy on large-scale datasets than other well-known semi-supervised clustering algorithms.
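The decision-graph step that the abstract builds on can be sketched as follows. This is a generic illustration of the original CFDP quantities from Rodriguez and Laio's 2014 Science paper: a local density rho for each point and the distance delta to the nearest denser point, with centers read off as points where both are large. It is not the paper's SiCE-CFDP relative-density variant; the Gaussian kernel, the cutoff `dc`, and the toy data are assumptions for illustration.

```python
import numpy as np

def cfdp_decision_graph(X, dc):
    """Compute the two CFDP decision-graph quantities:
    rho  - local density (Gaussian-kernel variant of the cutoff count)
    delta - distance to the nearest point of strictly higher density
            (the densest point gets its maximum distance instead)."""
    n = len(X)
    # pairwise Euclidean distance matrix
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    # Gaussian-kernel density; subtract 1 to drop each point's self-term
    rho = np.exp(-(d / dc) ** 2).sum(axis=1) - 1.0
    delta = np.zeros(n)
    for i in range(n):
        denser = np.where(rho > rho[i])[0]
        if denser.size:
            delta[i] = d[i, denser].min()
        else:
            delta[i] = d[i].max()
    return rho, delta

# Two well-separated toy clusters of three points each.
X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
              [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]])
rho, delta = cfdp_decision_graph(X, dc=1.0)
# Candidate centers: large rho AND large delta; a common heuristic
# ranks points by gamma = rho * delta and takes the top-k.
centers = np.argsort(rho * delta)[-2:]
```

Here the two selected centers are the densest point of each cluster: cluster members have small delta (a denser neighbor sits nearby), while the density peak of each cluster is far from any denser point. SiCE-CFDP, per the abstract, replaces manual top-k selection with automatic analysis of this decision graph.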
Authors: LIU Ru-hui, HUANG Wei-ping, WANG Kai, LIU Chuang, LIANG Jun (College of Control Science and Engineering, Zhejiang University, Hangzhou 310027, China)
Source: Journal of Zhejiang University: Engineering Science (indexed by EI, CAS, CSCD, Peking University Core), 2018, No. 11, pp. 2191-2200, 2242 (11 pages)
Funding: National Natural Science Foundation of China (U1664264, U1509203)
Keywords: clustering; semi-supervised constraint; ensemble learning; clustering by fast search and find of density peaks (CFDP); decision graph
