
Research on Consistent Ensemble of Star-Structure High-Order Co-Clustering Based on Ideal Point

Cited by: 2
Abstract: High-order co-clustering is generally converted into the problem of consistently fusing multiple pairwise two-order co-clustering results: a weighted linear combination of the two-order objective functions serves as the high-order objective function, and the clustering result is obtained by alternating iteration. Existing algorithms, however, still preset the weights from expert experience, and automatically determining the optimal weights of the linear combination remains a classic open problem. For star-structured high-order heterogeneous data, this paper proposes an ideal-point-based consistent ensemble strategy that determines the weights automatically, where the ideal point is the point in objective space formed by the optimal values of the individual two-order co-clustering objective functions. Using the relative distance between each two-order co-clustering result and its ideal result as the measure of clustering quality resolves the incommensurability of the two-order clustering qualities, and the strategy ultimately minimizes the relative distance between the high-order objective function value and the ideal point. Because the ideal-point approach can solve the consistent ensemble problem for a variety of star-structured high-order co-clustering algorithms, it has a degree of generality. Experimental results show that the method effectively improves the clustering performance of five classic high-order co-clustering algorithms.
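The abstract does not give the paper's formulas, so the following is only a minimal sketch of the general ideal-point idea from multi-objective optimization, not the authors' algorithm. It assumes every objective is minimized and has a nonzero ideal value; the function names and the sample numbers are hypothetical. It shows why relative distance matters: dividing each gap by its ideal value puts objectives on different scales on a common footing.

```python
import math


def relative_distance(values, ideal):
    """Euclidean norm of the per-objective relative gaps to the ideal point.

    Assumes each ideal value is nonzero (illustrative assumption).
    """
    return math.sqrt(sum(((v - s) / s) ** 2 for v, s in zip(values, ideal)))


def pick_closest_to_ideal(candidates, ideal):
    """Return the index of the candidate with the smallest relative distance."""
    return min(
        range(len(candidates)),
        key=lambda i: relative_distance(candidates[i], ideal),
    )


# Hypothetical objective values for two pairwise co-clustering objectives
# (lower is better). The second objective lives on a much larger scale.
ideal = [10.0, 200.0]
candidates = [
    [12.0, 200.0],  # ideal on objective 2, 20% off on objective 1
    [10.0, 260.0],  # ideal on objective 1, 30% off on objective 2
    [11.0, 210.0],  # a compromise: 10% and 5% off
]

best = pick_closest_to_ideal(candidates, ideal)
print(best, [round(relative_distance(c, ideal), 4) for c in candidates])
```

With raw (unnormalized) distances the first candidate would win, because the gap on the large-scale objective dominates; with relative distances the compromise candidate is closest to the ideal point.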
Source: Chinese Journal of Computers (《计算机学报》), indexed in EI, CSCD, and the Peking University Core Journals list; 2015, No. 7, pp. 1460-1472 (13 pages).
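Funding: National Natural Science Foundation of China (71272216, 60903080, 60093009); National Key Technology R&D Program of China (2009BAH42B02, 2012BAH08B02); China Postdoctoral Science Foundation (2012M510480); Fundamental Research Funds for the Central Universities (HEUCFZ1212, HEUCFT1208).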
Keywords: heterogeneous data; high-order co-clustering; ideal point; consistent ensemble

