Abstract
Many high-dimensional data sets are distributed on a low-dimensional manifold, and traditional outlier mining algorithms may fail when asked to detect outliers in such data. This paper proposes weighted multidimensional scaling, an approach to outlier mining based on multidimensional scaling (MDS). The algorithm sets a local weight for each data point from the error between a local high-dimensional data set and its low-dimensional reconstruction, then sums these weights to obtain a reliability score for the point. Whether the point is an outlier is decided from the value of that score. Experiments verify the effectiveness of the algorithm.
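The abstract gives only the outline of the method, so the following is a minimal sketch of the general idea rather than the authors' algorithm. It assumes k-nearest-neighbor neighborhoods, classical MDS for the local low-dimensional reconstruction, and a weight of exp(-relative distortion); the function names and the parameters `k` and `dim` are hypothetical choices made for illustration.

```python
import numpy as np


def pairwise(X):
    """Euclidean distance matrix between the rows of X."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(-1))


def classical_mds(D, dim):
    """Classical MDS: embed points with pairwise distances D into `dim` dimensions."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n           # centering matrix
    B = -0.5 * J @ (D ** 2) @ J                   # double-centered Gram matrix
    vals, vecs = np.linalg.eigh(B)                # eigenvalues in ascending order
    top = np.argsort(vals)[::-1][:dim]            # indices of the largest eigenvalues
    return vecs[:, top] * np.sqrt(np.clip(vals[top], 0.0, None))


def reliability_scores(X, k=8, dim=1):
    """Sum, over all neighborhoods containing a point, of its local weight.

    The weight rule exp(-relative distortion) is an assumption of this sketch,
    not a formula taken from the paper.
    """
    n = X.shape[0]
    D = pairwise(X)
    scores = np.zeros(n)
    for i in range(n):
        idx = np.argsort(D[i])[:k + 1]            # point i and its k nearest neighbors
        D_loc = D[np.ix_(idx, idx)]               # local high-dimensional distances
        Y = classical_mds(D_loc, dim)             # local low-dimensional reconstruction
        err = ((D_loc - pairwise(Y)) ** 2).sum(axis=1)    # per-point distortion
        rel = err / ((D_loc ** 2).sum(axis=1) + 1e-12)    # scale-free error
        scores[idx] += np.exp(-rel)               # local weights added to the scores
    return scores


# Toy data: 30 points on a line in 2-D (a 1-D manifold) plus one point off the line.
t = np.linspace(0.0, 3.0, 30)
X = np.vstack([np.column_stack([t, 0.5 * t]), [[1.5, 3.0]]])
scores = reliability_scores(X, k=6, dim=1)
print(int(np.argmin(scores)))                     # index of the least reliable point
```

In this toy setting the off-line point belongs only to its own neighborhood and is poorly reconstructed there, so it receives the lowest score. Points on the line are reconstructed almost exactly in several neighborhoods and accumulate higher scores. How the paper itself defines the weights and the decision threshold is not stated in the abstract.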
Source
《计算机科学》
CSCD
Peking University Core Journal (北大核心)
2008, No. 1, pp. 190-192 (3 pages)
Computer Science
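Funding
National Natural Science Foundation of China (60495019)
Specialized Research Fund for the Doctoral Program of Higher Education, Ministry of Education (20060247039)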