High-dimensional Heteroskedasticity Penalized Gaussian Mixture Model-Based Clustering


Abstract: Penalized model-based clustering removes variables during the clustering process, but identifying which of the retained variables are actually informative for clustering then becomes a new problem. Existing work on this problem includes the pairwise penalized model, which handles the homoscedastic case where all clusters share the same variance. This paper examines variable selection under heteroscedasticity, proposes two new models for heteroscedastic data, and gives the solution and algorithm for each. Analysis of simulated data indicates that both new models perform better on heteroscedastic data.

Authors: Li Muyu (李沐雨), Wang Xing (王星)

Affiliations: Center for Applied Statistics, Renmin University of China; School of Statistics, Renmin University of China

Source: Mathematics in Practice and Theory (《数学的实践与认识》), 2013, No. 5, pp. 163-170 (8 pages). Indexed in CSCD and the Peking University Core Journals list.

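Funding: Renmin University of China Scientific Research Fund (supported by the Fundamental Research Funds for the Central Universities), project no. 10XNI014, "Evaluation Index System for Humanities and Social Science Research Outputs"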

Keywords: high-dimensional heteroscedastic data; EM algorithm; Gaussian mixture model; penalized likelihood

Classification: O212.1 [Science: Probability Theory and Mathematical Statistics]


References (9)

1. Tibshirani R. Regression shrinkage and selection via the lasso[J]. Journal of the Royal Statistical Society, Series B (Methodological), 1996, 58(1): 267-288.
2. Pan W and Shen X. Penalized model-based clustering with application to variable selection[J]. The Journal of Machine Learning Research, 2007, 8: 1145-1164.
3. Wang S and Zhu J. Variable selection for model-based high-dimensional clustering and its application to microarray data[J]. Biometrics, 2008, 64(2): 440-448.
4. Xie B, Pan W and Shen X. Penalized model-based clustering with cluster-specific diagonal covariance matrices and grouped variables[J]. Electronic Journal of Statistics, 2008, 2: 168.
5. Xie B, Pan W and Shen X. Variable selection in penalized model-based clustering via regularization on grouped parameters[J]. Biometrics, 2008, 64(3): 921-930.
6. Zhou H, Pan W and Shen X. Penalized model-based clustering with unconstrained covariance matrices[J]. Electronic Journal of Statistics, 2009, 3: 1473.
7. Witten D and Tibshirani R. A framework for feature selection in clustering[J]. Journal of the American Statistical Association, 2010, 105(490): 713-726.
8. Fan J and Li R. Variable selection via nonconcave penalized likelihood and its oracle properties[J]. Journal of the American Statistical Association, 2001, 96: 1348-1360.
9. Tibshirani R, Saunders M, Rosset S, Zhu J and Knight K. Sparsity and smoothness via the fused lasso[J]. Journal of the Royal Statistical Society, Series B, 2005, 67: 91-108.

