Abstract
The EM algorithm is an important method for parameter estimation. Its core idea is to iteratively compute the likelihood function from the available data until it converges to an optimal value. Semi-supervised clustering uses a small amount of labeled data to assist the cluster analysis of a large amount of unlabeled data. This paper introduces an EM algorithm based on a dual Gaussian mixture model: a few labeled samples are added to the unsupervised learning setting and used to obtain the initial parameters, and the resulting EM clustering algorithm for the dual Gaussian mixture model is studied under semi-supervised conditions. Experiments show that, compared with unsupervised learning, the algorithm improves the sample recognition rate and has good clustering performance. The model can also serve as a basic model in other application areas.
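The idea of initializing EM from a few labeled samples can be illustrated with a minimal sketch. The abstract does not specify the structure of the dual Gaussian mixture model, so the code below assumes a plain single-level mixture with one Gaussian per cluster. The function names (`log_gaussian`, `semi_supervised_em`), the regularization term `reg`, the choice to keep labeled samples fixed to their own cluster during the iterations, and the synthetic data are all assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def log_gaussian(X, mean, cov):
    """Log density of a multivariate normal at each row of X."""
    d = X.shape[1]
    diff = X - mean
    maha = np.einsum("ij,ij->i", diff @ np.linalg.inv(cov), diff)
    _, logdet = np.linalg.slogdet(cov)
    return -0.5 * (maha + logdet + d * np.log(2 * np.pi))

def semi_supervised_em(X_lab, y_lab, X_unl, n_iter=50, reg=1e-6):
    """EM for a Gaussian mixture whose initial parameters come from labeled data."""
    classes = np.unique(y_lab)
    k, d = len(classes), X_lab.shape[1]
    # Initialization: estimate weights, means, covariances from labeled samples only.
    weights = np.array([np.mean(y_lab == c) for c in classes])
    means = np.array([X_lab[y_lab == c].mean(axis=0) for c in classes])
    covs = np.array([np.cov(X_lab[y_lab == c].T) + reg * np.eye(d) for c in classes])
    # Assumption: labeled samples keep fixed one-hot responsibilities.
    R_lab = (y_lab[:, None] == classes[None, :]).astype(float)
    X_all = np.vstack([X_lab, X_unl])
    for _ in range(n_iter):
        # E-step: posterior cluster probabilities for the unlabeled samples,
        # computed in the log domain to avoid underflow.
        log_p = np.column_stack([
            np.log(w) + log_gaussian(X_unl, m, c)
            for w, m, c in zip(weights, means, covs)
        ])
        log_p -= log_p.max(axis=1, keepdims=True)
        R_unl = np.exp(log_p)
        R_unl /= R_unl.sum(axis=1, keepdims=True)
        R = np.vstack([R_lab, R_unl])
        # M-step: re-estimate parameters from all samples, weighted by responsibility.
        Nk = R.sum(axis=0)
        weights = Nk / Nk.sum()
        means = (R.T @ X_all) / Nk[:, None]
        for j in range(k):
            diff = X_all - means[j]
            covs[j] = (R[:, j, None] * diff).T @ diff / Nk[j] + reg * np.eye(d)
    return classes, weights, means, covs, R_unl

# Synthetic example: two well-separated clusters, 5 labeled samples each.
rng = np.random.default_rng(0)
A = rng.normal([0.0, 0.0], 1.0, size=(100, 2))
B = rng.normal([6.0, 6.0], 1.0, size=(100, 2))
X_lab = np.vstack([A[:5], B[:5]])
y_lab = np.array([0] * 5 + [1] * 5)
X_unl = np.vstack([A[5:], B[5:]])

classes, weights, means, covs, R_unl = semi_supervised_em(X_lab, y_lab, X_unl)
pred = classes[R_unl.argmax(axis=1)]  # hard cluster assignment for unlabeled samples
```

Because each cluster is tied to a label from the start, the predicted cluster indices can be compared directly with class labels, which is one way a recognition rate could be measured against a purely unsupervised run.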
Source
Computer Simulation (《计算机仿真》)
CSCD
2007, No. 11, pp. 110-113 (4 pages)
Keywords
Dual Gaussian mixture model
Expectation-maximization (EM) algorithm
Semi-supervised clustering