
Study of EM Algorithm Based on Dual Gaussian-Mixture-Model

Cited by: 14
Abstract: The EM algorithm is an important method for parameter estimation. Its core idea is to iteratively compute the likelihood function from the available data until it converges to some optimal value. Semi-supervised clustering uses a small amount of labeled data to assist the clustering of a large amount of unlabeled data. This paper presents an EM algorithm based on a dual Gaussian mixture model: a few labeled samples are added to the unsupervised learning setting and used to obtain the initial parameters, and the resulting EM clustering algorithm is studied under semi-supervised conditions. Experiments show that, compared with unsupervised learning, the algorithm improves the sample recognition rate and has good clustering performance. The model can also serve as a base model in other application fields.
Authors: 岳佳 (Yue Jia), 王士同 (Wang Shitong)
Source: Computer Simulation (《计算机仿真》), CSCD, 2007, No. 11, pp. 110-113 (4 pages)
Keywords: dual Gaussian mixture model; expectation-maximization algorithm; semi-supervised clustering
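
The idea summarized in the abstract, using a few labeled samples to obtain the initial parameters and then running EM over all the data, can be illustrated with a minimal sketch. This is not the paper's dual Gaussian mixture model, whose details are not given in this record; it is an ordinary Gaussian mixture with one component per class, and every name in it (`gaussian_pdf`, `semi_supervised_em`, `reg`, the synthetic data) is an assumption made for illustration. It also assumes each class has at least two labeled samples and that the labeled samples keep their known class throughout.

```python
import numpy as np

def gaussian_pdf(X, mean, cov):
    """Multivariate normal density evaluated at each row of X."""
    d = X.shape[1]
    diff = X - mean
    inv = np.linalg.inv(cov)
    norm = np.sqrt((2 * np.pi) ** d * np.linalg.det(cov))
    return np.exp(-0.5 * np.einsum("ij,jk,ik->i", diff, inv, diff)) / norm

def semi_supervised_em(X_lab, y_lab, X_unl, n_iter=50, reg=1e-6):
    """EM for a Gaussian mixture, initialized from labeled samples (illustrative)."""
    classes = np.unique(y_lab)
    k, d = len(classes), X_lab.shape[1]

    # Initial parameters are estimated from the labeled samples only.
    weights = np.array([np.mean(y_lab == c) for c in classes])
    means = np.array([X_lab[y_lab == c].mean(axis=0) for c in classes])
    covs = np.array([np.cov(X_lab[y_lab == c], rowvar=False) + reg * np.eye(d)
                     for c in classes])

    X = np.vstack([X_lab, X_unl])
    # Labeled samples keep fixed one-hot responsibilities (an assumption).
    R_lab = (y_lab[:, None] == classes[None, :]).astype(float)

    for _ in range(n_iter):
        # E-step: posterior responsibility of each component for unlabeled points.
        dens = np.column_stack([weights[j] * gaussian_pdf(X_unl, means[j], covs[j])
                                for j in range(k)])
        R_unl = dens / (dens.sum(axis=1, keepdims=True) + 1e-300)
        R = np.vstack([R_lab, R_unl])

        # M-step: re-estimate weights, means and covariances from all data.
        Nk = R.sum(axis=0)
        weights = Nk / len(X)
        means = (R.T @ X) / Nk[:, None]
        for j in range(k):
            diff = X - means[j]
            covs[j] = (R[:, j, None] * diff).T @ diff / Nk[j] + reg * np.eye(d)

    return weights, means, covs, classes[R_unl.argmax(axis=1)]

# Synthetic two-cluster data: 5 labeled and 200 unlabeled points per class.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [5.0, 5.0]])
X_lab = np.vstack([rng.normal(c, 1.0, size=(5, 2)) for c in centers])
y_lab = np.repeat([0, 1], 5)
X_unl = np.vstack([rng.normal(c, 1.0, size=(200, 2)) for c in centers])
y_true = np.repeat([0, 1], 200)

weights, means, covs, y_pred = semi_supervised_em(X_lab, y_lab, X_unl)
print("accuracy on unlabeled points:", np.mean(y_pred == y_true))
```

Seeding from labeled data ties each mixture component to a known class, so the cluster assignments can be compared with the true labels directly, without the label-matching step that a purely unsupervised run would need.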

