
K-nearest neighbor imputation based on sparse coding

Cited by: 2
Abstract: This paper studies the fixed-K problem of the K-nearest neighbor imputation (KNNI) algorithm and finds that, when missing values are imputed, keeping the parameter K fixed strongly affects imputation quality. To address this, it proposes a nearest-neighbor imputation algorithm based on sparse coding (KNNI-SC). The algorithm reconstructs each sample with missing values from the training samples and takes the correlation between samples fully into account during reconstruction. It learns with an l1 norm so that each missing sample is imputed from a different number of training samples, which resolves the problem of choosing K in KNNI. Experimental comparisons using RMSE and the correlation coefficient as performance measures show that the algorithm outperforms KNNI. The algorithm avoids the shortcomings of KNNI and is suited to applications whose data preprocessing stage requires missing-value imputation.
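The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the reconstruction is an l1-penalised least-squares fit, minimising 0.5*||x - sum_i w_i*a_i||^2 + lam*||w||_1 over the training samples a_i, solved here by plain coordinate descent, and it assumes the imputed value is the same weighted combination applied to the training samples' values of the missing attribute. The names `sparse_code`, `impute_missing`, `lam`, and the toy data are hypothetical. The `rmse` and `pearson` helpers correspond to the two evaluation measures named in the abstract.

```python
import math


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the l1 proximal step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def sparse_code(target, atoms, lam=0.01, n_iters=200):
    """Coordinate descent for 0.5*||target - sum_i w_i*atoms[i]||^2 + lam*||w||_1."""
    weights = [0.0] * len(atoms)
    residual = list(target)  # target minus the current reconstruction
    norms = [sum(v * v for v in atom) for atom in atoms]
    for _ in range(n_iters):
        for i, atom in enumerate(atoms):
            if norms[i] == 0.0:
                continue
            # Correlation of atom i with the residual, with its own
            # current contribution added back in.
            rho = sum(a * r for a, r in zip(atom, residual)) + norms[i] * weights[i]
            new_w = soft_threshold(rho, lam) / norms[i]
            delta = new_w - weights[i]
            if delta != 0.0:
                residual = [r - delta * a for r, a in zip(residual, atom)]
                weights[i] = new_w
    return weights


def impute_missing(observed, train_observed, train_target, lam=0.01):
    """Assumed imputation rule: reuse the sparse reconstruction weights
    on the training values of the attribute that is missing."""
    weights = sparse_code(observed, train_observed, lam)
    value = sum(w * y for w, y in zip(weights, train_target))
    n_used = sum(1 for w in weights if w != 0.0)
    return value, n_used


def rmse(true_values, predicted):
    """Root mean square error between true and imputed values."""
    n = len(true_values)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(true_values, predicted)) / n)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Toy data (hypothetical): observed attributes of three complete training
# samples, and their values for the attribute that is missing elsewhere.
train_observed = [[1.0, 0.0], [0.0, 1.0], [0.5, 1.0]]
train_target = [2.0, 3.0, 4.0]

for query in ([1.0, 0.0], [1.0, 1.0]):
    value, n_used = impute_missing(query, train_observed, train_target)
    print(query, "imputed:", round(value, 4), "training samples used:", n_used)
```

In this toy run the number of training samples with non-zero weight is decided per query by the l1 penalty rather than by a preset K, which is the behaviour the abstract attributes to the method. A larger `lam` drives more weights to zero.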
Source: Application Research of Computers (《计算机应用研究》), CSCD, Peking University Core Journal, 2015, No. 7, pp. 1942-1945 (4 pages)
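Funding: National Natural Science Foundation of China (61170131, 61263035, 61363009); National "863" Program (2012AA011005); National "973" Program (2013CB329404); Guangxi Natural Science Foundation (2012GXNSFGA060004); Guangxi Bagui Innovation Team and Guangxi Hundred Talents Program; Key Project of Science and Technology Research in Guangxi Higher Education Institutions (2013ZD041); Guangxi Graduate Education Innovation Program (YCSZ2015095, YCSZ2015096)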
Keywords: missing value imputation; sparse coding; reconstruction; root mean square error (RMSE); correlation coefficient; data preprocessing


