Abstract
Estimating the intrinsic dimension of high-dimensional data is key to successful dimension reduction. The maximum likelihood estimation (MLE) based method is a recent dimension estimator that is simple to implement and gives good results when an appropriate number of neighbors is chosen. However, it shows a noticeable bias when the number of neighbors is too small or too large. The root cause is that it ignores the differing contribution of each point to the intrinsic dimension. Taking the distribution of the dataset into account, this paper proposes an improved MLE, adaptive maximum likelihood estimation (AMLE), which adjusts each point's contribution to the estimator through a weight function. Experiments on synthetic and real datasets show that AMLE is considerably more accurate than MLE and other methods and is less sensitive to the number of neighbors.
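As a rough illustration of the baseline that the abstract refers to, the following is a minimal sketch of a neighbor-based MLE of intrinsic dimension in the Levina-Bickel form, where each point's local estimate is the inverse of the mean log-ratio between its k-th neighbor distance and its closer neighbor distances. The function name `mle_intrinsic_dimension`, the brute-force neighbor search, and the toy dataset are assumptions made for illustration. The abstract does not specify the AMLE weight function, so the sketch uses the uniform average of plain MLE and only marks where per-point weights would enter.

```python
import math
import random


def mle_intrinsic_dimension(points, k):
    """Neighbor-based MLE of intrinsic dimension, averaged uniformly over points.

    Hypothetical helper for illustration; assumes all points are distinct.
    """
    if not 2 <= k < len(points):
        raise ValueError("k must satisfy 2 <= k < number of points")
    local_estimates = []
    for i, x in enumerate(points):
        # Brute-force distances from x to every other point, sorted ascending.
        dists = sorted(math.dist(x, y) for j, y in enumerate(points) if j != i)
        t_k = dists[k - 1]  # distance to the k-th nearest neighbor
        # Mean log-ratio of the k-th neighbor distance to the k-1 closer ones.
        mean_log = sum(math.log(t_k / t) for t in dists[: k - 1]) / (k - 1)
        local_estimates.append(1.0 / mean_log)
    # Plain MLE: every point contributes equally. A weighted variant such as
    # AMLE would replace this uniform average with a weighted one; the weight
    # function itself is not given in the abstract.
    return sum(local_estimates) / len(local_estimates)


# Toy data (an assumption for this sketch): a 2-D plane embedded in 3-D.
rng = random.Random(0)
plane = []
for _ in range(400):
    u, v = rng.random(), rng.random()
    plane.append((u, v, u + v))

estimate = mle_intrinsic_dimension(plane, k=10)
print(f"estimated intrinsic dimension: {estimate:.2f}")
```

Rerunning the sketch with a very small or very large `k` is one way to observe the sensitivity to the number of neighbors that the abstract describes.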
Source
Journal of Computer Applications (《计算机应用》)
CSCD
Peking University Core Journal (北大核心)
2008, Issue 8, pp. 2088-2090 (3 pages)
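Funding
Guidance Project of the Philosophy and Social Science Fund, Jiangsu Provincial Department of Education (06SJD630042)
Nanjing Audit University institutional research project (NSK2008/B10)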
Keywords
intrinsic dimension estimation
maximum likelihood estimation (MLE)
dimension reduction