Analysis on discrete data filling algorithm based on correlation coefficient weighting (cited by: 2)
Abstract: To address missing values in data whose attributes are correlated, a method is proposed that combines correlation coefficients with similarity matching to fill missing values in discrete data. First, frequent item sets are mined from the non-missing data and the correlations between data attributes are calculated, giving an overall intra-item correlation for each mined item set. Then, a filling item is selected according to the similarity between the non-missing antecedent of the record that contains the missing value and the item sets mined from the complete data. If candidate filling items have the same similarity, weighted confidence is used to select the filling rule. This increases the number and quality of the rules mined by Apriori and also ensures that rule matching is reliable. In comparative experiments against related methods, the proposed method improved both the accuracy and the time efficiency of missing data filling.
Authors: WANG Zhigang; TIAN Liqin; MAO Yaqiong (Qinghai Normal University, Xining 810008, China; North China Institute of Science and Technology, Beijing 101601, China)
Source: Modern Electronics Technique (《现代电子技术》), Peking University Core Journal, 2020, No. 9, pp. 109-112 (4 pages)
Funding: National Key R&D Program of China (2017YFC0804108); Qinghai Provincial Key Laboratory and Key R&D Projects (2017 ZJ Y21, 2017 ZJ 752); Hebei Provincial IoT Monitoring Engineering Technology Research Center project (314201620).
Keywords: discrete value filling; weighted support; correlation coefficient weighting; missing value filling; frequent item set mining; filling rule selection
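
The procedure summarized in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's formulas, so everything specific here is an assumption made for illustration: the attribute weight (absolute Pearson correlation with the target attribute on numerically coded values), the similarity score (share of total weight covered by a rule's antecedent), the weighted confidence (confidence times mean antecedent weight), the brute-force enumeration standing in for Apriori, and the names `pearson`, `fill_missing`, and `min_support`.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Plain Pearson correlation; returns 0.0 when either column is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return 0.0 if sxx == 0 or syy == 0 else sxy / sqrt(sxx * syy)


def fill_missing(complete, record, target, min_support=0.25):
    """Return a fill value for record[target], or None if no rule qualifies."""
    known = {a: v for a, v in record.items() if a != target and v is not None}
    target_col = [r[target] for r in complete]
    # Assumed weight: |Pearson r| between each known attribute and the target.
    w = {a: abs(pearson([r[a] for r in complete], target_col)) for a in known}
    total_w = sum(w.values()) or 1.0
    best_key, best_value = None, None
    # Brute-force stand-in for Apriori: try every antecedent that agrees
    # with the known values of the incomplete record.
    for size in range(1, len(known) + 1):
        for attrs in combinations(sorted(known), size):
            rows = [r for r in complete if all(r[a] == known[a] for a in attrs)]
            if not rows:
                continue
            # Assumed similarity: share of total weight the antecedent covers.
            similarity = sum(w[a] for a in attrs) / total_w
            mean_w = sum(w[a] for a in attrs) / len(attrs)
            counts = Counter(r[target] for r in rows)
            for value, count in sorted(counts.items()):
                if count / len(complete) < min_support:
                    continue  # antecedent plus target value is not frequent
                confidence = count / len(rows)
                # Similarity decides first; weighted confidence breaks ties.
                key = (similarity, confidence * mean_w)
                if best_key is None or key > best_key:
                    best_key, best_value = key, value
    return best_value


# Hypothetical complete records with discrete attributes a, b, c.
complete = [
    {"a": 1, "b": 0, "c": 1},
    {"a": 1, "b": 1, "c": 1},
    {"a": 1, "b": 0, "c": 1},
    {"a": 0, "b": 1, "c": 0},
    {"a": 0, "b": 0, "c": 0},
    {"a": 0, "b": 1, "c": 0},
]
print(fill_missing(complete, {"a": 1, "b": 1, "c": None}, "c"))  # prints 1
```

In this toy data, attribute `a` tracks `c` exactly while `b` is only weakly correlated with it, so the rule built on `a` has the highest similarity and decides the fill value. Comparing `(similarity, weighted confidence)` as a tuple mirrors the two-stage selection in the abstract, where weighted confidence is consulted only when similarities are equal.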
