Abstract
Penalized model-based clustering removes uninformative variables during clustering, but it raises a new problem: how to identify which of the retained variables are actually effective for clustering. Existing work on this problem includes the pairwise penalized model, which handles the homoscedastic case, where every cluster has the same variance. This paper studies variable selection under heteroscedasticity, proposes two new models for heteroscedastic data, and gives the solution and algorithm for each. Simulation results indicate that both new models perform better on heteroscedastic data.
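As a rough sketch of the setting the abstract describes, a penalized Gaussian mixture objective can be written as a log-likelihood minus a penalty on the cluster parameters. The notation below is assumed for illustration and is not taken from the paper: n observations x_i with p variables, K clusters with mixing weights \pi_k, and diagonal covariances (also an assumption). In the heteroscedastic case each cluster k has its own variances \sigma_{kj}^2. The homoscedastic case is the restriction \sigma_{kj}^2 = \sigma_j^2 for all k. The abstract does not give the penalties used by the pairwise model or by the two proposed models, so p_\lambda is left generic.

```latex
\ell_P(\Theta)
  = \sum_{i=1}^{n} \log\!\Bigg( \sum_{k=1}^{K} \pi_k
      \prod_{j=1}^{p} \phi\big(x_{ij};\, \mu_{kj},\, \sigma_{kj}^{2}\big) \Bigg)
    - p_\lambda(\Theta),
\qquad
\phi(x;\mu,\sigma^{2}) = \frac{1}{\sqrt{2\pi\sigma^{2}}}
      \exp\!\Big(-\frac{(x-\mu)^{2}}{2\sigma^{2}}\Big).
```

In this kind of formulation, the EM algorithm listed in the keywords alternates between computing posterior cluster memberships (E-step) and maximizing the penalized expected complete-data log-likelihood over \Theta = \{\pi_k, \mu_{kj}, \sigma_{kj}^2\} (M-step).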
Source
Mathematics in Practice and Theory (《数学的实践与认识》)
CSCD
Peking University Core Journals (北大核心)
2013, No. 5, pp. 163-170 (8 pages)
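Funding
Supported by the Renmin University of China Research Fund (Fundamental Research Funds for the Central Universities)
Grant No. 10XNI014
"Evaluation Index System for Research Outputs in the Humanities and Social Sciences"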
Keywords
high-dimensional heteroscedastic data
EM algorithm
Gaussian mixture model
penalized likelihood