
An Imputation Method for Missing Data in Compositional Data Based on the Epanechnikov Quadratic Kernel (Cited by: 1)
Abstract: Kernel function methods have been successfully used to estimate a wide variety of functions. Missing data cause existing statistical methods for compositional data to fail, and the k-nearest-neighbour imputation method (KNNI) ignores the differing contributions of the k nearest neighbours when it uses them to estimate a missing value. Drawing on the kernel-function idea, this paper proposes an imputation method for missing values in compositional data based on the Epanechnikov quadratic kernel (EKI), together with a modified version of it (MEKI). Experimental results show that the modified Epanechnikov-kernel imputation method yields more accurate estimates than k-nearest-neighbour imputation for compositional data.
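The record does not spell out the algorithm, but the abstract's core idea — replacing KNNI's equal neighbour weights with Epanechnikov quadratic-kernel weights, with closeness measured by the Aitchison distance — can be sketched as follows. This is a minimal illustration, not the authors' EKI/MEKI: the function names, the sub-composition handling of missing parts, and the bandwidth choice (scaling by the farthest of the k neighbours) are all assumptions.

```python
import numpy as np

def aitchison_distance(x, y):
    """Aitchison distance between two compositions (all parts > 0):
    Euclidean distance between their centred log-ratio (clr) transforms."""
    clr = lambda v: np.log(v) - np.log(v).mean()
    return np.linalg.norm(clr(np.asarray(x, float)) - clr(np.asarray(y, float)))

def ek_impute(target, donors, k=5):
    """Hypothetical sketch of Epanechnikov-kernel-weighted kNN imputation.

    target : composition with np.nan in its missing part(s).
    donors : complete compositions, one per row.
    Distances are computed on the observed parts (re-closed to sum to 1);
    the k nearest donors are then averaged with Epanechnikov weights
    w(u) = 0.75 * (1 - u^2), u in [0, 1], instead of the equal weights
    plain kNN imputation would use.
    """
    target = np.asarray(target, float)
    donors = np.asarray(donors, float)
    obs = ~np.isnan(target)

    close = lambda v: v / v.sum()  # re-close a sub-composition to sum to 1
    d = np.array([aitchison_distance(close(target[obs]), close(row[obs]))
                  for row in donors])
    nn = np.argsort(d)[:k]                # indices of the k nearest donors
    u = d[nn] / (d[nn].max() + 1e-12)     # crude bandwidth: farthest neighbour
    w = 0.75 * (1.0 - u**2)               # Epanechnikov quadratic kernel
    w = np.clip(w, 1e-12, None) / np.clip(w, 1e-12, None).sum()

    filled = target.copy()
    filled[~obs] = w @ donors[nn][:, ~obs]  # kernel-weighted donor average
    return close(filled)                    # re-close to a full composition
```

Note the design choice being illustrated: nearer neighbours get quadratically larger weights, so the imputed value is dominated by the closest donors rather than treating all k equally, which is exactly the shortcoming of KNNI that the abstract identifies.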
Source: 《应用概率统计》 (Chinese Journal of Applied Probability and Statistics), CSCD, Peking University Core Journal, 2014, Issue 6, pp. 598-606 (9 pages)
Funding: Key Program of the National Natural Science Foundation of China (71031006); National Natural Science Foundation of China (81173366); National Youth Science Foundation Project (41101440); Special Project of the Shanxi Provincial Department of Education (20120301)
Keywords: compositional data; missing-value imputation; k-nearest-neighbour imputation; Epanechnikov quadratic kernel; Aitchison distance

References (11)

  • 1. Ferrers, N.M., An Elementary Treatise on Trilinear Coordinates, London: Macmillan, 1866.
  • 2. Aitchison, J., The Statistical Analysis of Compositional Data, London: Chapman and Hall, 1986.
  • 3. Egozcue, J.J., Pawlowsky-Glahn, V., Mateu-Figueras, G. and Barceló-Vidal, C., Isometric logratio transformations for compositional data analysis, Mathematical Geology, 35(3)(2003), 279-300.
  • 4. 刘鹏, 雷蕾, 张雪凤, 缺失数据处理方法的比较研究 [A comparative study of methods for handling missing data], 计算机科学 (Computer Science), 31(10)(2004), 155-156.
  • 5. Hron, K., Templ, M. and Filzmoser, P., Imputation of missing values for compositional data using classical and robust methods, Computational Statistics and Data Analysis, 54(12)(2010), 3095-3107.
  • 6. 孙志猛, 张忠占, 杜江, 缺失数据下半参数单调回归模型的估计 [Estimation of semiparametric monotone regression models with missing data], 数理统计与管理 (Journal of Applied Statistics and Management), 30(6)(2011), 979-988.
  • 7. Qin, Y.S., Zhang, S.C., Zhu, X.F., Zhang, J.L. and Zhang, C.Q., Semi-parametric optimization for missing data imputation, Applied Intelligence, 27(1)(2007), 79-88.
  • 8. 范明, 柴玉梅, 昝红英 et al. (trans.), 统计学习基础-数据挖掘, 推理与预测 [The Elements of Statistical Learning: Data Mining, Inference, and Prediction], Beijing: Publishing House of Electronics Industry, 2004.
  • 9. 何亮, 宋擒豹, 沈钧毅, 海振, 一种新的组合k-近邻预测方法 [A new combined k-nearest-neighbour prediction method], 西安交通大学学报 (Journal of Xi'an Jiaotong University), 43(4)(2009), 5-9.
  • 10. Aitchison, J., A concise guide to compositional data analysis, in Compositional Data Analysis Workshop, Girona, 2003.

