Predicting Missing Data with Neighborhood Mean Based on PCA (cited by: 1)
Abstract Mean imputation is a commonly used way to fill in missing data, but it often ignores the relationships between neighboring variables and is extremely sensitive to noisy data. This paper applies principal component analysis (PCA) to mean imputation. PCA is used to extract the feature importance of each adjacent attribute, and these importances serve as weights. The weighted mean of the matrix adjacent to the missing data is computed in the same way as in mean imputation, and the result is added to the mean calculation as an offset that represents the influence of the adjacent attributes. Simulation experiments on UCI datasets show that the PCA-based improved algorithm fills missing values markedly more accurately than plain mean imputation.
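The abstract does not give the exact formulas, so the following is only a minimal sketch of one possible reading of the idea, not the authors' implementation. It assumes that attribute importance is the absolute PCA loading weighted by explained-variance ratio, that the offset is the importance-weighted mean of the row's standardized deviations rescaled to the target attribute's units, and that the data contain at least two complete rows and no constant columns. The function names `pca_attribute_weights` and `pca_offset_mean_impute` are hypothetical.

```python
import numpy as np


def pca_attribute_weights(X_complete):
    """Assumed importance measure: |loading| weighted by explained-variance ratio."""
    Z = (X_complete - X_complete.mean(axis=0)) / X_complete.std(axis=0)
    # PCA via SVD of the standardized data; rows of Vt are the principal axes.
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    ratio = s**2 / np.sum(s**2)
    importance = ratio @ np.abs(Vt)
    return importance / importance.sum()  # weights sum to 1


def pca_offset_mean_impute(X):
    """Fill each NaN with the column mean plus a PCA-weighted offset (a sketch)."""
    X = np.asarray(X, dtype=float)
    complete = X[~np.isnan(X).any(axis=1)]  # rows with no missing values
    mu = complete.mean(axis=0)
    sigma = complete.std(axis=0)
    w = pca_attribute_weights(complete)

    filled = X.copy()
    for i, j in zip(*np.where(np.isnan(X))):
        obs = ~np.isnan(X[i])  # attributes observed in this row
        if obs.any():
            z = (X[i, obs] - mu[obs]) / sigma[obs]
            # Weighted mean of standardized deviations, rescaled to column j.
            offset = sigma[j] * np.sum(w[obs] * z) / np.sum(w[obs])
        else:
            offset = 0.0  # nothing observed: fall back to plain mean imputation
        filled[i, j] = mu[j] + offset
    return filled


# Toy data in which the second attribute is twice the first.
X = np.array([[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, np.nan]])
filled = pca_offset_mean_impute(X)
print(filled[3, 1])  # approximately 8.0; plain mean imputation would give 4.0
```

Note that this sketch treats all attributes as positively related to the target; handling negatively correlated attributes would need an additional sign convention that the abstract does not describe.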
Authors XIE Lin-quan (谢霖铨); BI Yong-peng (毕永朋); LIAO Long-long (廖龙龙) (Faculty of Science, Jiangxi University of Science and Technology, Ganzhou 341000, China)
Source Software Guide (《软件导刊》), 2018, No. 6, pp. 67-69, 76 (4 pages)
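Funding National Natural Science Foundation of China (61762047); Youth Science Foundation of the Jiangxi Provincial Department of Science and Technology (20161BAB211015)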
Keywords nearest neighbor mean imputation; principal component analysis (PCA); attribute significance; offset value
