Abstract
Manifold learning algorithms have been widely applied in fields such as information retrieval, pattern recognition, artificial intelligence, and data mining. Existing manifold learning algorithms are sensitive to the choice of local neighborhoods, and the data they produce after dimensionality reduction are poorly separable. This paper proposes a manifold learning algorithm with adaptive neighborhood selection and good separability. At each data point, the method selects the neighborhood dynamically, based on the estimated intrinsic dimensionality of the data and the local tangent orientation. When mapping the data to the low-dimensional space, it also groups similar sample points by cluster analysis, which keeps the reduced data well separable and improves the dimensionality reduction result. Experiments on a helix in three-dimensional space show that the new method yields a better embedding on artificially generated data sets.
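The abstract does not give the selection rule itself, so the following is only a minimal sketch of one way neighborhood size could adapt to local tangent structure, not the authors' algorithm. It assumes the intrinsic dimension d is already known (d = 1 for a helix) and grows each neighborhood while a local PCA fit of dimension d still explains most of the variance. The names make_helix, residual_ratio and adaptive_neighborhoods, and the parameters k_min, k_max and tol, are hypothetical. The clustering step applied during the mapping is not covered.

```python
import numpy as np

def make_helix(n=200, turns=3.0, seed=0):
    """Sample n points along a 3-D helix (an intrinsically 1-D curve)."""
    rng = np.random.default_rng(seed)
    t = np.sort(rng.uniform(0.0, 2.0 * np.pi * turns, n))
    return np.column_stack([np.cos(t), np.sin(t), 0.2 * t])

def residual_ratio(points, d):
    """Fraction of local variance NOT captured by the top-d principal directions."""
    centered = points - points.mean(axis=0)
    s = np.linalg.svd(centered, compute_uv=False)
    total = np.sum(s ** 2)
    if total == 0.0:
        return 0.0
    return float(np.sum(s[d:] ** 2) / total)

def adaptive_neighborhoods(X, d=1, k_min=4, k_max=30, tol=0.05):
    """Assumed rule: grow each neighborhood while it stays nearly d-dimensional."""
    dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    order = np.argsort(dists, axis=1)  # order[i, 0] is point i itself
    neighborhoods = []
    for i in range(X.shape[0]):
        k = k_min
        # Accept one more neighbor only if the enlarged patch (point i plus
        # its k + 1 nearest neighbors) is still well fit by a d-dim subspace.
        while k < k_max and residual_ratio(X[order[i, :k + 2]], d) <= tol:
            k += 1
        neighborhoods.append(order[i, 1:k + 1])
    return neighborhoods

X = make_helix()
nbrs = adaptive_neighborhoods(X)
sizes = [len(nb) for nb in nbrs]
print("neighborhood sizes range from", min(sizes), "to", max(sizes))
```

In this sketch a smaller tol gives tighter, flatter neighborhoods, while a larger tol lets them extend further along the curve. How the paper itself balances the two is not stated in the abstract.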
Source
《北华航天工业学院学报》
CAS
2016, Issue 3, pp. 1-4 (4 pages)
Journal of North China Institute of Aerospace Engineering
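Funding
Hebei Provincial Department of Science and Technology project (15210908)
Langfang Municipal Science and Technology Bureau project (2015011052)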
Keywords
dimensionality reduction, manifold learning algorithms, adaptive neighborhood selection, cluster analysis