Abstract
When a Self-Organizing Map (SOM) is used to learn and visualize the intrinsic low-dimensional manifold structure of high-dimensional data, it tends to produce "topological defects". To address this problem, a new manifold learning algorithm, the Dynamic Self-Organizing Map (DSOM), is proposed. DSOM expands the training data set gradually according to the neighborhood structure of the data and trains the map incrementally, which avoids local minima and overcomes the topological defect problem. Meanwhile, the map size is increased dynamically along with the training set, which reduces the time cost of the algorithm. Experimental results show that DSOM learns and visualizes the intrinsic low-dimensional manifold structure of high-dimensional data more faithfully than SOM. In addition, compared with traditional manifold learning algorithms, DSOM yields more concise visualization results and is more robust to the neighborhood size and to noise. The novelty of DSOM is that the map size and the training data set are expanded synchronously according to the intrinsic neighborhood structure of the data, so that the low-dimensional manifold structure can be learned and visualized more concisely and faithfully.
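The incremental idea described in the abstract (grow the training set along the neighborhood structure, grow the map with it, retrain at each stage) can be illustrated with a rough sketch. This is not the paper's implementation: the abstract gives no update rules, so everything below is an assumption made for illustration, including the 1-D chain of units, the k-nearest-neighbor expansion, the midpoint insertion rule in `grow_chain`, and the names and defaults `dsom_sketch`, `units_per_point`, `lr`, and `sigma`.

```python
import math
import random


def knn_graph(points, k):
    """Return, for each point, the indices of its k nearest neighbors."""
    neighbors = []
    for i, p in enumerate(points):
        order = sorted(range(len(points)), key=lambda j: math.dist(p, points[j]))
        neighbors.append([j for j in order if j != i][:k])
    return neighbors


def train_chain(weights, data, epochs, rng, lr=0.3, sigma=1.0):
    """Standard online SOM updates on a 1-D chain of units (in place)."""
    for _ in range(epochs):
        for x in rng.sample(data, len(data)):  # shuffled pass over the data
            bmu = min(range(len(weights)), key=lambda u: math.dist(weights[u], x))
            for u, w in enumerate(weights):
                # Gaussian neighborhood on the chain index, centered on the BMU.
                h = math.exp(-((u - bmu) ** 2) / (2 * sigma ** 2))
                for d in range(len(w)):
                    w[d] += lr * h * (x[d] - w[d])


def grow_chain(weights, target_size):
    """Assumed growth rule: insert a midpoint unit into the widest gap."""
    while len(weights) < target_size:
        gaps = [math.dist(weights[u], weights[u + 1]) for u in range(len(weights) - 1)]
        u = gaps.index(max(gaps))
        mid = [(a + b) / 2 for a, b in zip(weights[u], weights[u + 1])]
        weights.insert(u + 1, mid)


def dsom_sketch(points, k=4, units_per_point=0.5, epochs=5, seed_index=0, rng=None):
    """Train a chain SOM while expanding the active data set along the k-NN graph."""
    rng = rng or random.Random(0)
    neighbors = knn_graph(points, k)
    # Start from one seed point and its immediate neighborhood.
    active = {seed_index, *neighbors[seed_index]}
    weights = [list(points[seed_index]), list(points[neighbors[seed_index][0]])]
    while True:
        # Map size follows the size of the current training set.
        grow_chain(weights, max(2, int(units_per_point * len(active))))
        train_chain(weights, [points[i] for i in sorted(active)], epochs, rng)
        # Expand the training set by one neighborhood step.
        expanded = active | {j for i in active for j in neighbors[i]}
        if expanded == active:  # nothing new is reachable: stop
            break
        active = expanded
    return weights, active
```

Calling `dsom_sketch(points)` on a list of coordinate tuples returns the trained unit weights and the set of data indices that were reached from the seed. If the k-nearest-neighbor graph is disconnected, points outside the seed's component are never added, which is one reason the choice of `k` matters in a scheme like this.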
Source
Journal of Computer Applications (《计算机应用》)
CSCD
PKU Core Journals (北大核心)
2013, Issue 7, pp. 1917-1921, 1934 (6 pages in total)
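Funding
National Natural Science Foundation of China (61202285)
Basic and Frontier Technology Research Project of Henan Province (112300410201)
Key Science and Technology Research Project of the Education Department of Henan Province, Basic Research Program (13B520899)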
Keywords
manifold learning
Self-Organizing Map (SOM)
topological defect
locally Euclidean nature
neighborhood structure