Abstract
To handle the randomness and complexity of the statistical distribution of data, this paper takes a statistical clustering view, describes the probability density function of the whole data set with a Gaussian mixture model, and proposes a rough clustering analysis method based on that model. First, the initial approximate parameters of the EM algorithm are set using the indiscernibility relation of rough set theory and the logic rules it generates. Next, the Expectation-Maximization (EM) algorithm estimates the maximum-likelihood parameter set of each component's probability density distribution. Finally, each sample's class is determined by the magnitude of its density distribution probability. An experimental comparison with the conventional k-means clustering algorithm shows that the method is effective and achieves higher clustering precision, and that describing the clustering results with rule sets makes them interpretable and rational.
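The three steps in the abstract (initialisation, EM estimation, assignment by density) can be sketched as follows. This is a minimal illustration and not the paper's implementation: the abstract does not specify how the indiscernibility relation or the rules produce initial parameters, so `indiscernibility_init` is a hypothetical stand-in that discretises each attribute into equal-width bins, treats samples with identical discretised values as indiscernible, and seeds the components from the k largest equivalence classes. The function names, the `bins` and `reg` parameters, and the shared starting covariance are all assumptions.

```python
import numpy as np

def indiscernibility_init(X, k, bins=3):
    """Hypothetical initialisation (assumption, not from the paper)."""
    lo, hi = X.min(axis=0), X.max(axis=0)
    width = np.where(hi > lo, hi - lo, 1.0)
    # Equal-width discretisation of every attribute into `bins` codes.
    codes = np.minimum((bins * (X - lo) / width).astype(int), bins - 1)
    # Samples with identical code vectors are indiscernible.
    _, labels, counts = np.unique(codes, axis=0, return_inverse=True,
                                  return_counts=True)
    labels = labels.reshape(-1)
    if len(counts) < k:
        raise ValueError("fewer equivalence classes than components")
    top = np.argsort(counts)[::-1][:k]  # the k largest classes
    means = np.array([X[labels == c].mean(axis=0) for c in top])
    weights = counts[top] / counts[top].sum()
    # Every component starts from the global covariance (assumption).
    cov = np.atleast_2d(np.cov(X.T)) + 1e-6 * np.eye(X.shape[1])
    covs = np.array([cov.copy() for _ in top])
    return weights, means, covs

def log_gauss(X, mean, cov):
    """Log density of a multivariate normal, computed via Cholesky."""
    d = X.shape[1]
    L = np.linalg.cholesky(cov)
    z = np.linalg.solve(L, (X - mean).T)  # shape (d, n)
    log_det = 2.0 * np.log(np.diag(L)).sum()
    return -0.5 * (d * np.log(2 * np.pi) + log_det + (z ** 2).sum(axis=0))

def weighted_log_density(X, weights, means, covs):
    """Column j holds log(w_j) + log N(x | mean_j, cov_j)."""
    return np.stack([np.log(w) + log_gauss(X, m, c)
                     for w, m, c in zip(weights, means, covs)], axis=1)

def fit_gmm(X, weights, means, covs, n_iter=100, tol=1e-6, reg=1e-6):
    """Standard EM iterations for a Gaussian mixture."""
    n, d = X.shape
    prev = -np.inf
    for _ in range(n_iter):
        # E-step: responsibilities of each component for each sample.
        log_p = weighted_log_density(X, weights, means, covs)
        log_norm = np.logaddexp.reduce(log_p, axis=1)
        resp = np.exp(log_p - log_norm[:, None])
        # M-step: maximum-likelihood updates of weights, means, covariances.
        nk = resp.sum(axis=0) + 1e-12
        weights = nk / nk.sum()
        means = (resp.T @ X) / nk[:, None]
        covs = np.array([
            ((resp[:, j, None] * (X - means[j])).T @ (X - means[j])) / nk[j]
            + reg * np.eye(d)
            for j in range(len(nk))
        ])
        ll = log_norm.sum()  # log-likelihood under the previous parameters
        if ll - prev < tol:
            break
        prev = ll
    return weights, means, covs

def assign(X, weights, means, covs):
    """Assign each sample to the component with the largest weighted density."""
    return weighted_log_density(X, weights, means, covs).argmax(axis=1)

# Usage on synthetic data with two well-separated groups.
rng = np.random.default_rng(0)
X_demo = np.vstack([rng.normal(0.0, 0.5, (100, 2)),
                    rng.normal(5.0, 0.5, (100, 2))])
params = fit_gmm(X_demo, *indiscernibility_init(X_demo, k=2))
labels_demo = assign(X_demo, *params)
```

Seeding EM from data-driven groups rather than random points is the design choice the abstract emphasises; any other grouping rule could be substituted for the hypothetical one used here without changing `fit_gmm` or `assign`.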
Source
Journal of Harbin Institute of Technology (《哈尔滨工业大学学报》)
Indexed in: EI, CAS, CSCD, PKU Core
2006, No. 2, pp. 256-259, 322 (5 pages)
Funding
Supported by the National High Technology Research and Development Program of China (2003AA1Z2610)
Keywords
Gaussian mixture model
rough set
EM algorithm
clustering