Abstract
This paper uses graph algebra to compute the relevance among data points and thereby optimize each point's local neighborhood, and applies the approach to improve locally linear embedding (LLE) for dimensionality reduction. The optimized LLE takes the clustering structure of the data set into account but requires neither class labels nor a clustering algorithm as preprocessing, so it remains unsupervised, general, and simple to implement. It overcomes the classic LLE's difficulty with sparsely sampled or noise-contaminated data while inheriting LLE's low time complexity, which makes it applicable to large-scale data sets. Experimental results on standard data sets validate the proposed approach.
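For orientation, the step of classic LLE that a neighborhood optimization would feed into can be sketched as follows. This is a minimal illustration of the standard reconstruction-weight computation only, not the authors' graph-algebra procedure, which the abstract does not spell out. The function names (`knn`, `solve`, `lle_weights`) and the regularization value `reg` are assumptions made for this example.

```python
def knn(points, i, k):
    """Indices of the k nearest neighbors of points[i] (Euclidean), excluding i."""
    def sqdist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))
    ranked = sorted((sqdist(points[i], p), j) for j, p in enumerate(points) if j != i)
    return [j for _, j in ranked[:k]]


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[r]] for r, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def lle_weights(points, i, neighbors, reg=1e-3):
    """Weights that best reconstruct points[i] from its neighbors, summing to 1."""
    k = len(neighbors)
    # Neighbor offsets relative to the point being reconstructed.
    diffs = [[a - b for a, b in zip(points[j], points[i])] for j in neighbors]
    # Local Gram matrix of the offsets.
    gram = [[sum(u * v for u, v in zip(diffs[p], diffs[q])) for q in range(k)]
            for p in range(k)]
    # Regularize the diagonal (assumed value) so the system stays solvable.
    trace = sum(gram[p][p] for p in range(k))
    eps = reg * trace if trace > 0 else reg
    for p in range(k):
        gram[p][p] += eps
    w = solve(gram, [1.0] * k)
    total = sum(w)
    return [v / total for v in w]


# Usage: three collinear points plus one distant outlier.
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (5.0, 5.0)]
neighbors = knn(points, 1, 2)
weights = lle_weights(points, 1, neighbors)
```

In this sketch the neighbor list is an explicit argument, so a method that refines neighborhoods, as the abstract describes, would only change which indices are passed to `lle_weights`; the weight computation itself stays the same.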
Source
《系统仿真学报》
EI
CAS
CSCD
Peking University Core Journals (北大核心)
2007, No. 13, pp. 3119-3122 (4 pages)
Journal of System Simulation
Funding
Hubei Province Key Science and Technology Project (2005AA101C17)
Keywords
data manifold
locally linear embedding
graph algebra
neighborhood structure