Abstract
Clustering analysis is widely applied in engineering fields such as biological sequence analysis, image segmentation, and text analysis. Among the many clustering methods available, those based on probability and statistical theory form a major class. Starting from the basic FCM model, this paper describes the potential-function and mountain-function clustering methods and the information-entropy method, and compares their applicability, strengths, and weaknesses. It then introduces the currently popular kernel clustering, spectral clustering, and Gaussian mixture model clustering methods together with their solution procedures, and analyzes their strengths, weaknesses, and computational complexity. Finally, it outlines several research directions for new clustering models.
Source
Computer Science (《计算机科学》)
CSCD
PKU Core Journal
2012, Issue 7, pp. 18-24 (7 pages)
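Funding
National Natural Science Foundation of China (41171341)
Program for New Century Excellent Talents in University, Ministry of Education
Henan Province Science and Technology Innovation Program for Outstanding Young Scholars (114100510006)
Aeronautical Science Foundation, National Defense Key Laboratory of Electro-Optical Control Technology (20095155008)
Key Science and Technology Project of the Henan Provincial Department of Science and Technology (122102210227)
Basic and Frontier Technology Research Program of the Henan Provincial Department of Science and Technology (092300410140)
Henan Provincial Department of Education projects (2011B520038, 2010B520032)
Zhengzhou Science and Technology Bureau project (112PPTGY248-6)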
Keywords
Clustering analysis
Statistical machine learning
Gaussian mixture models
Spectral clustering
Kernel clustering