
QENNI: A New Imputation Method for Missing Data
Abstract: The k-nearest-neighbor imputation algorithm (kNNI) can be biased in how it selects the k nearest neighbors of a record with a missing value. This paper proposes a new imputation algorithm, quadrant-encapsidated-nearest-neighbor-based imputation (QENNI), which fills in a missing value using only the nearest neighbors in each quadrant direction around the incomplete record (the points that encapsulate it), and so avoids the neighbor-selection bias of kNNI. In addition, for low-dimensional data sets the algorithm can be parameter-free, which removes the dependence on a tuning parameter. Experimental results show that QENNI imputes more accurately than kNNI.
Source: Journal of Guangxi Normal University: Natural Science Edition (《广西师范大学学报(自然科学版)》), 2010, No. 1, pp. 72-76 (5 pages). Indexed in: CAS, PKU Core (北大核心).
Funding: National 973 Program of China (2008CB317108); National Natural Science Foundation of China (90718020); Australian Research Council (ARC) grant (DP0985456); Guangxi Graduate Education Innovation Program (2009106020812M63).
Keywords: missing data; missing data imputation; kNNI method; QENNI method
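
The idea in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not say how the selected neighbors are combined, so the unweighted mean, the treatment of points lying exactly on an axis, and the function names `qenni_impute` and `knni_impute` are all assumptions made for illustration. The sketch treats the record with the missing value as the origin, groups the complete records by quadrant (orthant, in more than two dimensions), and keeps only the nearest record from each non-empty quadrant. A plain kNNI baseline is included for comparison.

```python
import numpy as np


def qenni_impute(X_complete, y_complete, x_query):
    """Impute a missing target value from one nearest neighbor per orthant.

    Assumption (not stated in the abstract): the selected neighbors'
    target values are combined with an unweighted mean.
    """
    X = np.asarray(X_complete, dtype=float)
    y = np.asarray(y_complete, dtype=float)
    diff = X - np.asarray(x_query, dtype=float)
    dist = np.linalg.norm(diff, axis=1)

    # The sign pattern of (x_i - x_query) identifies the quadrant/orthant.
    # Assumption: a zero offset is counted on the non-negative side.
    keys = [tuple(row) for row in (diff >= 0).tolist()]

    nearest = {}  # orthant key -> index of the closest complete record
    for i, key in enumerate(keys):
        if key not in nearest or dist[i] < dist[nearest[key]]:
            nearest[key] = i

    idx = sorted(nearest.values())
    return float(y[idx].mean()), idx


def knni_impute(X_complete, y_complete, x_query, k):
    """Baseline kNNI: mean target value of the k closest complete records."""
    X = np.asarray(X_complete, dtype=float)
    y = np.asarray(y_complete, dtype=float)
    dist = np.linalg.norm(X - np.asarray(x_query, dtype=float), axis=1)
    idx = sorted(np.argsort(dist, kind="stable")[:k].tolist())
    return float(y[idx].mean()), idx


if __name__ == "__main__":
    # Made-up toy data: three records cluster in the upper-right quadrant,
    # and one record sits in each of the other three quadrants.
    X = [[1.0, 1.0], [1.2, 1.1], [1.1, 1.3],
         [-2.0, 2.0], [-2.0, -2.0], [2.0, -2.0]]
    y = [10.0, 11.0, 12.0, 2.0, 4.0, 6.0]
    print(qenni_impute(X, y, [0.0, 0.0]))
    print(knni_impute(X, y, [0.0, 0.0], k=3))
```

On this toy data, kNNI with k = 3 takes all three neighbors from the same quadrant (records 0, 1, 2, giving 11.0), while the quadrant-based selection takes one record from each side (records 0, 3, 4, 5, giving 5.5). Note that the sketch needs no k; since d attributes give up to 2^d orthants, this is presumably why the abstract limits the parameter-free claim to low-dimensional data.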

References (9)

  • [1] ZHANG Shi-chao. Parimputation: from imputation and null-imputation to partially imputation[J]. IEEE Intelligent Informatics Bulletin, 2008, 9(1): 32-38.
  • [2] ZHANG Shi-chao. Shell-neighbor method and its application in missing data imputation[J]. Applied Intelligence, 2010 (in press).
  • [3] QIN Yong-song, ZHANG Shi-chao, ZHU Xiao-feng, et al. Semi-parametric optimization for missing data imputation[J]. Applied Intelligence, 2007, 27(1): 79-88.
  • [4] BATISTA G, MONARD M C. An analysis of four missing data treatment methods for supervised learning[J]. Applied Artificial Intelligence, 2003, 17(5): 519-533.
  • [5] GEDIGA G, DUNTSCH I. Maximum consistency of incomplete data via non-invasive imputation[J]. Artificial Intelligence Review, 2003, 19(1): 93-107.
  • [6] WANG Qi-hua, RAO J N K. Empirical likelihood-based inference under imputation for missing response data[J]. The Annals of Statistics, 2002, 30(3): 896-924.
  • [7] BATISTA G E, MONARD M C. A study of k-nearest neighbor as a model-based method to treat missing data[C]//Proceedings of the Argentine Symposium on Artificial Intelligence. Berlin, Germany: Springer, 2001, 30: 1-9.
  • [8] JIN Zi-xiang, DAI Xin-yu, CHEN Jia-jun. A greedy-algorithm-based strategy for selecting the KNN parameter[J]. Journal of Guangxi Normal University: Natural Science Edition, 2008, 26(1): 182-185. (in Chinese)
  • [9] ZHU Xiao-feng. Research on several issues in missing-value imputation[D]. Guilin: College of Computer Science and Information Engineering, Guangxi Normal University, 2007. (in Chinese)

