
The Research of Missing Data Imputation Method: Based on Nearest Neighbor Imputation and Association Rules

Cited by: 21
Abstract This paper proposes a missing data imputation method based on nearest neighbor imputation and association rules. Variables with no missing data serve as auxiliary variables, and a distance function defined on them identifies the complete sample units closest to the sample unit with missing data. The reciprocal of the product of the support and lift of the mined association rules is then used as a weight on the distances between sample units, giving a weighted distance, and the missing value is imputed with the corresponding attribute value of the sample unit whose weighted distance is smallest. The weighted distance reflects the dependency relationships between samples with complete data and samples with missing data, and the method resolves the problem of different equally near sample units yielding different imputed values. The paper closes with the implementation steps of the method and an application example.
Source Journal of Statistics and Information (《统计与信息论坛》), CSSCI, Peking University Core Journal, 2015, No. 1, pp. 35-40 (6 pages)
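Funding National Statistical Science Research Key Project "Research on Sampling Survey Problems for Small and Micro Industrial Enterprises" (2013LZ34); Beijing Social Science Fund Key Project "Research on Spatial Sampling Design Based on the Geographical Distribution of Beijing" (14JGA022); Beijing Excellent Doctoral Dissertation Supervisor Humanities and Social Sciences Project (20121000202)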
Keywords association rules; missing data; nearest neighbor imputation; weighted distance
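
The procedure summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the distance function or the rule-mining procedure, so the sketch assumes Euclidean distance on numeric auxiliary variables, rules of the form "recipient's categorical auxiliary values -> donor's target value", and support and lift estimated by simple counting over the complete records. The names `rule_stats`, `impute`, `numeric_keys`, `rule_keys` and the parameter `k` (number of candidate donors) are hypothetical.

```python
from math import sqrt


def rule_stats(records, antecedent, target_key, target_value):
    """Return (support, lift) of the rule `antecedent -> target_key == target_value`,
    estimated by counting over the complete records (an assumed estimator)."""
    n = len(records)
    has_a = [all(r[k] == v for k, v in antecedent.items()) for r in records]
    has_b = [r[target_key] == target_value for r in records]
    n_a, n_b = sum(has_a), sum(has_b)
    n_ab = sum(a and b for a, b in zip(has_a, has_b))
    if n_a == 0 or n_b == 0:
        return 0.0, 0.0
    support = n_ab / n                 # P(A and B)
    lift = (n_ab * n) / (n_a * n_b)    # P(A and B) / (P(A) * P(B))
    return support, lift


def impute(recipient, donors, numeric_keys, rule_keys, target_key, k=2):
    """Impute recipient[target_key] from the donor with the smallest weighted distance."""
    if not donors:
        raise ValueError("at least one complete record is required")

    def distance(donor):
        # Assumed distance function: Euclidean distance on the auxiliary variables.
        return sqrt(sum((recipient[x] - donor[x]) ** 2 for x in numeric_keys))

    # Candidate donors: the k complete records nearest to the recipient.
    candidates = sorted(donors, key=distance)[:k]
    antecedent = {x: recipient[x] for x in rule_keys}

    # Fall back to the plain nearest neighbor if no rule supports any candidate.
    best_value, best_score = candidates[0][target_key], float("inf")
    for donor in candidates:
        support, lift = rule_stats(donors, antecedent, target_key, donor[target_key])
        if support * lift == 0:
            continue
        # Weight = 1 / (support * lift); weighted distance = distance * weight.
        score = distance(donor) / (support * lift)
        if score < best_score:
            best_value, best_score = donor[target_key], score
    return best_value


# Made-up data: the first two donors are equally near to the recipient (distance 1)
# but carry different values of the target attribute "grade".
donors = [
    {"x": 1.0, "region": "N", "grade": "A"},
    {"x": 3.0, "region": "N", "grade": "B"},
    {"x": 5.0, "region": "N", "grade": "B"},
    {"x": 6.0, "region": "S", "grade": "A"},
    {"x": 7.0, "region": "S", "grade": "A"},
]
recipient = {"x": 2.0, "region": "N", "grade": None}

imputed = impute(recipient, donors, ["x"], ["region"], "grade")
print(imputed)  # B
```

In this toy data the two nearest donors tie on unweighted distance. The rule `region=N -> grade=B` has the larger support-lift product (0.4 × 5/3, against 0.2 × 5/9 for `grade=A`), so its donor receives the smaller weighted distance and the tie is resolved in favor of `B`. One limitation of this particular sketch: a donor at distance zero always has weighted distance zero, whatever its rule weight.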

