Global Optimising K Value for Semi-Supervised K-means Algorithm

Cited by: 11
Abstract: This paper proposes a global K-value optimization algorithm based on semi-supervised K-means. Unlike traditional methods, which fix K to the number of sample classes, the algorithm uses a small amount of labeled data to guide the clustering of a large amount of unlabeled data. It combines the distribution of the dataset with the supervision information available inside each cluster after clustering, and uses a voting rule to assign class labels to the data in each cluster. Experiments show that the method effectively finds a suitable K value and cluster centers for a dataset and improves clustering performance.
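The abstract does not give the search procedure itself, so the following is only a minimal sketch of the general idea, under stated assumptions: K is increased from 1, plain K-means is run for each K, each cluster takes the majority label of the labeled points it contains (the voting step), and the search stops once the share of labeled points that agree with their cluster's vote reaches a threshold. The function names `kmeans`, `vote`, and `search_k`, the farthest-point seeding, the agreement score, and the default threshold of 0.9 are all illustrative assumptions, not the authors' method.

```python
import math
from collections import Counter


def kmeans(points, k, iters=50):
    """Plain Lloyd's K-means with deterministic farthest-point seeding (an assumption)."""
    centers = [points[0]]
    while len(centers) < k:
        # Next seed: the point farthest from its nearest existing center.
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    assign = []
    for _ in range(iters):
        new_assign = [min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points]
        if new_assign == assign:
            break  # assignments stable: converged
        assign = new_assign
        for j in range(k):
            members = [p for p, a in zip(points, assign) if a == j]
            if members:  # keep the old center if a cluster is empty
                centers[j] = tuple(sum(x) / len(members) for x in zip(*members))
    return centers, assign


def vote(assign, labels, k):
    """Majority label per cluster, using only the labeled points (None if a cluster has none)."""
    ballots = [Counter() for _ in range(k)]
    for i, y in labels.items():
        ballots[assign[i]][y] += 1
    return [c.most_common(1)[0][0] if c else None for c in ballots]


def search_k(points, labels, k_max, threshold=0.9):
    """Try K = 1..k_max; stop at the first K whose vote agreement reaches the threshold.

    labels maps point index -> class label for the small labeled subset.
    Falls back to the best K seen if no K reaches the threshold.
    """
    best = None
    for k in range(1, k_max + 1):
        centers, assign = kmeans(points, k)
        majority = vote(assign, labels, k)
        agree = sum(majority[assign[i]] == y for i, y in labels.items()) / len(labels)
        if best is None or agree > best[0]:
            best = (agree, k, centers, [majority[a] for a in assign])
        if agree >= threshold:
            break
    return best[1:]


# Toy data: class "A" forms two separate groups and class "B" one group,
# so the suitable K (3) differs from the number of classes (2).
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 0), (10, 1), (11, 0), (11, 1),
          (20, 0), (20, 1), (21, 0), (21, 1)]
labels = {0: "A", 1: "A", 4: "B", 5: "B", 8: "A", 9: "A"}  # only 6 of 12 points are labeled

k, centers, predicted = search_k(points, labels, k_max=5)
print(k, predicted)
```

Because agreement can only be judged on the labeled points, a very high threshold tends to push K upward; in this sketch the threshold is the only guard against over-splitting, which is one reason it is exposed as a parameter.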
Source: Journal of Beijing Jiaotong University (CAS, CSCD, PKU Core), 2009, No. 6, pp. 106-109 (4 pages)
Funding: National Natural Science Foundation of China (60773062, 60873100); Hebei Province Science and Technology Support Program (072135188); Hebei Provincial Department of Education Research Program (2008312)
Keywords: semi-supervised clustering; constrained K-means; K-means; voting; threshold

