摘要
2000年以后新兴了一系列非线性降维的方法,流形学习中的Isomap就是其中的代表。该算法能够反映数据集的全局结构且简单高效,但是存在低维流形等距的欧氏子集必须是凸集和计算复杂度高等缺点。L-Isomap成功降低了算法的计算复杂度,但是对于地标点(landmark points)的选取大多采用随机的方法,致使该算法不稳定。依据拓扑学和泛函分析中有限维空间有界闭集与紧集(compact set)等价、紧集的任一开覆盖存在有限子覆盖等经典定理,分析数据集所在区域的拓扑结构,确定了一系列能够反映数据结构的地标点。这样的方法计算复杂度低,比L-Isomap稳定,且将数据集是凸集的要求弱化到紧集(有界闭集),避免了传统Isomap算法放大不完整流形中的"空洞"误差等问题。
Since 2000,a series of nonlinear dimensionality reduction methods have been emerging,and Isomap in manifold learning is one of the representatives.The algorithm can reflect the global structure of the data set and is simple and efficient.But there are some shortcomings that low-dimensional manifold must be convex set and the computational complexity is large.L-Isomap successfully reduces the computational complexity,but majority of the landmarks is selected by random method,which makes the algorithm unstable.In this paper,according to the classical theoremes that bounded closed set is equivalent to compact set in finite-dimensional space and there is finite sub-coverage covering the compact set,we analyzed the topology of the area of the data set and selected a series of landmarks.This method has low computational complexity and is more stable than L-Isomap.In addition,this method weakens the condition that the data set is a convex set to a compact set(bounded closed set),which avoids enlarging'the hollow'error in the incomplete manifold.
出处
《计算机科学》
CSCD
北大核心
2017年第S1期88-91,共4页
Computer Science
关键词
流形学习
等距映射
地标点
紧性
Manifold learning
Isomap
Landmark points
Compact