A New Ensemble Constructor Based on Locality Preserving Projection for High-Dimensional Clustering

Abstract: This paper studies how to generate ensemble members for clustering high-dimensional data and proposes a new ensemble constructor. To reduce the effect of high dimensionality on member construction, the method first applies Locality Preserving Projections (LPP) to reduce the dimensionality of the data, and then constructs ensemble members in the reduced subspace by combining random projection with K-means. The choice of the dimensionality of the LPP subspace is also discussed. Experiments show that the ensembles generated by the new method clearly outperform those produced by the existing Principal Component Analysis with subsampling (PCASS) method and by simple Random Projection (RP).
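The pipeline summarized in the abstract (LPP first, then random projection plus K-means inside the LPP subspace) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the function names `lpp`, `kmeans` and `build_ensemble`, the neighbour count, the heat-kernel width heuristic, the Gaussian random matrices, the regularization term, the ensemble size and the synthetic demo data are all assumptions, since the abstract does not specify them.

```python
import numpy as np


def lpp(X, n_components, n_neighbors=5, reg=1e-6):
    """Return a (d, n_components) LPP projection matrix for row-sample data X."""
    n, d = X.shape
    Xc = X - X.mean(axis=0)
    # Pairwise squared Euclidean distances; exclude self-matches.
    sq = np.sum(Xc ** 2, axis=1)
    D2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * Xc @ Xc.T, 0.0)
    np.fill_diagonal(D2, np.inf)
    # k-nearest-neighbour graph with heat-kernel weights, symmetrised.
    idx = np.argsort(D2, axis=1)[:, :n_neighbors]
    rows = np.repeat(np.arange(n), n_neighbors)
    cols = idx.ravel()
    t = D2[rows, cols].mean() + 1e-12  # assumed kernel-width heuristic
    W = np.zeros((n, n))
    W[rows, cols] = np.exp(-D2[rows, cols] / t)
    W = np.maximum(W, W.T)
    Dg = np.diag(W.sum(axis=1))
    # Generalised eigenproblem  X^T L X a = lambda X^T D X a,  L = D - W.
    A = Xc.T @ (Dg - W) @ Xc
    B = Xc.T @ Dg @ Xc
    B += reg * np.trace(B) / d * np.eye(d)  # assumed ridge term, keeps B positive definite
    C_inv = np.linalg.inv(np.linalg.cholesky(B))
    M = C_inv @ A @ C_inv.T
    _, vecs = np.linalg.eigh((M + M.T) / 2.0)
    # eigh sorts eigenvalues in ascending order; LPP keeps the smallest ones.
    return C_inv.T @ vecs[:, :n_components]


def kmeans(X, k, rng, n_iter=50):
    """Plain Lloyd-style K-means; returns one integer label per row of X."""
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):  # leave an empty cluster's centre unchanged
                centers[j] = X[labels == j].mean(axis=0)
    return labels


def build_ensemble(X, k, lpp_dim, rp_dim, n_members=10, seed=0):
    """Cluster repeatedly in random projections of the LPP subspace."""
    rng = np.random.default_rng(seed)
    Z = (X - X.mean(axis=0)) @ lpp(X, lpp_dim)  # reduce with LPP once
    members = []
    for _ in range(n_members):
        # A fresh Gaussian random projection for every ensemble member.
        R = rng.standard_normal((lpp_dim, rp_dim)) / np.sqrt(rp_dim)
        members.append(kmeans(Z @ R, k, rng))
    return np.array(members)


# Synthetic demo data (assumed): two Gaussian blobs in 20 dimensions.
demo_rng = np.random.default_rng(0)
X_demo = np.vstack([demo_rng.normal(0.0, 1.0, (30, 20)),
                    demo_rng.normal(4.0, 1.0, (30, 20))])
ensemble = build_ensemble(X_demo, k=2, lpp_dim=5, rp_dim=3)
```

Each row of `ensemble` is one clustering of the data; a consensus function would still be needed to combine the rows into a final partition, and that step is outside this sketch.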

Authors: Zhou Jingbo (周静波), Yin Jun (殷俊), Jin Zhong (金忠)

Affiliation: School of Computer Science and Technology, Nanjing University of Science and Technology

Source: Computer Science (《计算机科学》), 2011, No. 9, pp. 177-181 (5 pages). Indexed in CSCD and the Peking University Core Journals list.

Funding: Supported by the National Natural Science Foundation of China (60632050, 60873151).

Keywords: cluster ensembles; dimension reduction; locality preserving projections; random projection

Classification: TP391.4 [Automation and Computer Technology: Computer Application Technology]
