Research on the Initialization of Subspace Clustering Algorithms for Binary Data (cited by: 2)

Abstract: The finite mixture of Bernoulli distributions model was proposed to handle the high dimensionality and sparsity of binary data, and it provides a model framework for finding projected clusters quickly. The EM algorithm is an important method for iteratively estimating the parameters of such a model, and the quality of its result depends largely on the initial parameters. Many studies have used EM to implement clustering with the finite mixture of Bernoulli distributions model, and in all of them the choice of initial parameters directly affects clustering performance. This paper therefore initializes EM by introducing the binning method and by changing both the similarity measure between data points and the way center points are selected, which greatly reduces the dependence of the clustering result on the initial parameters. Experiments show that the algorithm is efficient and accurate.
Source: Application Research of Computers (《计算机应用研究》), indexed in CSCD and the Peking University Core Journals list, 2009, No. 1, pp. 47-49 (3 pages)
Funding: National "863" Program of China (2007AA12Z238)
Keywords: subspace clustering; binary data; finite mixture of Bernoulli distributions model; EM algorithm
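
The abstract's central claim, that EM for a finite mixture of Bernoulli distributions depends heavily on its starting parameters, can be illustrated with a small sketch. The code below is a plain textbook EM loop written for this note, not the paper's algorithm: the function name `em_bernoulli_mixture`, the `eps` clipping, and the toy data are all assumptions, and the paper's binning-based initialization is not reproduced because the abstract does not describe it in detail.

```python
import math


def em_bernoulli_mixture(data, init_means, n_iter=50, eps=1e-6):
    """Textbook EM for a finite mixture of Bernoulli distributions.

    data: list of equal-length 0/1 lists.
    init_means: one list of per-dimension probabilities per component;
        these are the initial parameters the result depends on.
    Returns (weights, means, log_lik), where log_lik is evaluated at the
    parameters that entered the final iteration.
    """
    k, d = len(init_means), len(data[0])

    def clip(p):
        # Keep probabilities strictly inside (0, 1) so log() is defined.
        return min(max(p, eps), 1.0 - eps)

    means = [[clip(p) for p in row] for row in init_means]
    weights = [1.0 / k] * k
    log_lik = float("-inf")
    for _ in range(n_iter):
        # E-step: responsibilities, computed in log space for stability.
        resp, log_lik = [], 0.0
        for x in data:
            logs = []
            for j in range(k):
                lp = math.log(weights[j])
                for t in range(d):
                    lp += math.log(means[j][t] if x[t] else 1.0 - means[j][t])
                logs.append(lp)
            top = max(logs)
            norm = top + math.log(sum(math.exp(v - top) for v in logs))
            log_lik += norm
            resp.append([math.exp(v - norm) for v in logs])
        # M-step: re-estimate mixing weights and Bernoulli means.
        sizes = [max(sum(r[j] for r in resp), eps) for j in range(k)]
        total = sum(sizes)
        weights = [s / total for s in sizes]
        means = [
            [
                clip(sum(r[j] * x[t] for r, x in zip(resp, data)) / sizes[j])
                for t in range(d)
            ]
            for j in range(k)
        ]
    return weights, means, log_lik


# Toy data (hypothetical): two clearly separated groups of binary vectors.
data = [[1, 1, 0, 0]] * 5 + [[0, 0, 1, 1]] * 5

# A start that leans slightly toward each group lets EM separate them.
separated = em_bernoulli_mixture(data, [[0.6, 0.6, 0.4, 0.4], [0.4, 0.4, 0.6, 0.6]])
# Identical starting components give identical responsibilities, so EM never
# moves away from the single-cluster solution.
identical = em_bernoulli_mixture(data, [[0.5] * 4, [0.5] * 4])
print(separated[2], identical[2])
```

Running both starts on the same data shows why initialization matters: the first start recovers the two groups, while the second stays at a solution with a lower log-likelihood however many iterations are run. An initialization scheme such as the one the abstract proposes aims to avoid poor starts of the second kind.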