
Semi-supervised Clustering Method with Constraints (Cited by: 2)
Abstract: In many practical data mining applications, unlabeled samples are easy to obtain in large quantities, whereas labeled samples are usually costly to obtain, and it is sometimes impossible to label all of the data. Semi-supervised clustering uses a small amount of labeled data to guide the clustering of unlabeled data. This paper proposes a new semi-supervised clustering algorithm that uses the information in the labeled data to make an initial estimate of the similarity and dissimilarity criteria for the data, adjusts these criteria automatically during clustering, and uses them to constrain and guide the clustering process. In tests on a standard Gaussian dataset, the algorithm achieves higher accuracy and runs faster than unsupervised clustering when given a relatively small amount of supervision.
Author: Liu Yingdong (刘应东)
Source: Computer Engineering and Applications (《计算机工程与应用》), indexed in CSCD and the Peking University Core Journals list, 2009, No. 22, pp. 100-102 (3 pages)
Keywords: data mining; labeled data; constraints; semi-supervised clustering
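The abstract does not give the algorithm's details, so the following is only a minimal sketch of the general idea of letting a few labeled points constrain a clustering run. It substitutes a simple seeded, constrained k-means for the paper's method: labeled points initialise the centroids and are never reassigned. The function name `seeded_kmeans`, its parameters, and the synthetic two-cluster Gaussian data are all assumptions made for illustration, and the sketch does not reproduce the adaptive similarity and dissimilarity criteria described in the abstract.

```python
import random


def seeded_kmeans(points, labels, k, iters=50):
    """Cluster `points` into k groups (hypothetical helper, not the paper's algorithm).

    labels[i] is a cluster id in range(k) for a labeled point, or None if unlabeled.
    Labeled points seed the initial centroids and always keep their given cluster.
    """
    dim = len(points[0])

    def mean(rows):
        return [sum(r[d] for r in rows) / len(rows) for d in range(dim)]

    def sqdist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    # Initialise each centroid from the labeled points of that cluster.
    centroids = []
    for c in range(k):
        seeds = [p for p, l in zip(points, labels) if l == c]
        if not seeds:
            raise ValueError(f"cluster {c} has no labeled seed point")
        centroids.append(mean(seeds))

    assign = None
    for _ in range(iters):
        # Constraint: a labeled point keeps its label; others go to the nearest centroid.
        new_assign = [
            l if l is not None
            else min(range(k), key=lambda c: sqdist(p, centroids[c]))
            for p, l in zip(points, labels)
        ]
        if new_assign == assign:
            break  # assignments stopped changing
        assign = new_assign
        # Every cluster contains at least its seed points, so it is never empty.
        centroids = [
            mean([p for p, a in zip(points, assign) if a == c]) for c in range(k)
        ]
    return assign, centroids


# Synthetic Gaussian data (an assumption, not the paper's dataset):
# two well-separated 2-D clusters of 50 points each, one labeled point per cluster.
rng = random.Random(0)
points = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(50)]
points += [[rng.gauss(10.0, 1.0), rng.gauss(10.0, 1.0)] for _ in range(50)]
labels = [None] * 100
labels[0], labels[50] = 0, 1

assign, centroids = seeded_kmeans(points, labels, k=2)
```

Seeding the centroids from labeled points removes the random initialisation of plain k-means, which is one plausible reason a small amount of supervision could help both accuracy and convergence speed; whether the paper's algorithm gains its speed this way is not stated in the abstract.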