Convex Clustering Combined with Weakly-Supervised Information
Abstract: Objective function-based clustering is an important class of clustering techniques. Almost all such algorithms are built by optimizing a non-convex objective, so they can hardly guarantee a globally optimal solution and are sensitive to initialization. The recently proposed convex clustering optimizes a convex objective function instead, which overcomes these shortcomings and yields relatively more stable solutions. When auxiliary information is available in practice (typically must-link and/or cannot-link constraints), incorporating it into the clustering objective has been shown to improve clustering performance effectively. To the best of our knowledge, however, all existing semi-supervised objective function-based clustering algorithms rest on non-convex objectives, and semi-supervised convex clustering has not yet been proposed. We therefore attempt to combine pairwise constraints with convex clustering. The usual way of incorporating constraints, adding constraint penalty terms to the objective function, often destroys the convexity of the original convex objective. To address this problem, we propose a novel convex clustering algorithm that incorporates such weakly-supervised information. The key idea is to modify the distance metric in the objective function rather than add constraint penalty terms, so that convexity is preserved. As a result, the proposed method retains the advantages of convex clustering while effectively improving clustering performance.
Source: Journal of Computer Research and Development (《计算机研究与发展》; indexed in EI, CSCD, and the Peking University Core Journals list), 2017, No. 8, pp. 1763-1771 (9 pages)
Funding: National Natural Science Foundation of China (61672281)
Keywords: objective function-based clustering; convex clustering; weakly-supervised information; constraints; distance metric; semi-supervised clustering
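The abstract's key idea, encoding pairwise constraints in the distance metric rather than in penalty terms, can be illustrated with a minimal Python sketch. This is not the paper's algorithm: the function name `adjust_distances`, the rule of setting must-link distances to zero, and the rule of raising cannot-link distances to twice the largest observed distance are all assumptions made for illustration. The point it shows is only that the constraints change the input distances, which a clustering objective treats as constants, so the form of the objective itself is left untouched.

```python
def adjust_distances(dist, must_links, cannot_links):
    """Return a copy of a symmetric distance matrix with pairwise constraints
    encoded in the distances (hypothetical rule, not the paper's)."""
    adjusted = [row[:] for row in dist]          # leave the input untouched
    largest = max(max(row) for row in dist)      # largest observed distance
    for i, j in must_links:
        # Assumption: must-link pairs are pulled together (distance 0).
        adjusted[i][j] = adjusted[j][i] = 0.0
    for i, j in cannot_links:
        # Assumption: cannot-link pairs are pushed beyond every observed distance.
        adjusted[i][j] = adjusted[j][i] = 2.0 * largest
    return adjusted


# Four points on a line; indices 0 and 2 must link, indices 0 and 1 cannot link.
points = [0.0, 1.0, 5.0, 6.0]
dist = [[abs(a - b) for b in points] for a in points]
adjusted = adjust_distances(dist, must_links=[(0, 2)], cannot_links=[(0, 1)])
```

The adjusted matrix would then be passed to a distance-based clustering routine in place of the original one; which routine, and how the paper actually modifies its metric, is not specified in this record.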