Abstract
To address missing values in correlated data, a method is proposed that combines correlation coefficients with similarity matching to fill missing values in discrete data. First, frequent itemsets are mined from the non-missing data and the correlations between data attributes are calculated, giving the overall correlation within each mined itemset. Then, candidate filling items are selected according to the similarity between the non-missing antecedent of the record that contains the missing value and the itemsets mined from the complete data. When candidate filling items have equal similarity, weighted confidence is used to select the filling rule. This improves the quantity and quality of the rule set mined by Apriori and also ensures reliable rule matching. Experimental comparison with related methods shows that the proposed method improves both the accuracy and the time efficiency of missing-data filling.
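The selection step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not define the correlation-coefficient weighting or the weighted confidence, so this sketch uses plain confidence as the tie-breaker, counts itemsets by brute force instead of running Apriori, and measures similarity as the fraction of the record's known attributes that a mined itemset covers. The function names and the toy data are hypothetical.

```python
from collections import Counter
from itertools import combinations


def mine_frequent_itemsets(rows, min_support):
    """Brute-force support counting over (attribute, value) itemsets.

    A stand-in for Apriori, practical only for small toy tables.
    """
    counts = Counter()
    for row in rows:
        items = sorted(row.items())
        for size in range(1, len(items) + 1):
            for combo in combinations(items, size):
                counts[frozenset(combo)] += 1
    n = len(rows)
    return {s: c / n for s, c in counts.items() if c / n >= min_support}


def fill_missing(record, target, frequent):
    """Choose a value for `target` from the best-matching frequent itemset.

    Candidates are ranked by similarity first and by plain confidence
    second (assumption: the paper's weighted confidence is not reproduced).
    """
    known = frozenset(
        (k, v) for k, v in record.items() if k != target and v is not None
    )
    best_key, best_value = None, None
    for itemset, support in frequent.items():
        values = [v for k, v in itemset if k == target]
        if not values:
            continue  # itemset says nothing about the missing attribute
        antecedent = frozenset((k, v) for k, v in itemset if k != target)
        if not antecedent or not antecedent <= known:
            continue  # itemset contradicts or ignores the known attributes
        similarity = len(antecedent) / len(known)
        # The antecedent is a subset of a frequent itemset, so it is frequent too.
        confidence = support / frequent[antecedent]
        key = (similarity, confidence)
        if best_key is None or key > best_key:
            best_key, best_value = key, values[0]
    return best_value


# Hypothetical complete records used to mine itemsets.
rows = [
    {"weather": "rain", "wind": "high", "play": "no"},
    {"weather": "rain", "wind": "high", "play": "no"},
    {"weather": "rain", "wind": "low", "play": "yes"},
    {"weather": "sun", "wind": "low", "play": "yes"},
    {"weather": "sun", "wind": "high", "play": "yes"},
]
frequent = mine_frequent_itemsets(rows, min_support=0.4)
filled = fill_missing({"weather": "rain", "wind": "high", "play": None}, "play", frequent)
print(filled)
```

Ranking by the tuple (similarity, confidence) mirrors the order given in the abstract: similarity decides first, and confidence is consulted only when similarities are equal.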
Authors
王志刚
田立勤
毛亚琼
WANG Zhigang; TIAN Liqin; MAO Yaqiong (Qinghai Normal University, Xining 810008, China; North China Institute of Science and Technology, Beijing 101601, China)
Source
《现代电子技术》
Peking University Core Journal (北大核心)
2020, No. 9, pp. 109-112 (4 pages)
Modern Electronics Technique
Funding
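National Key Research and Development Program of China (2017YFC0804108)
Qinghai Province Key Laboratory and Key Research and Development Projects (2017 ZJ Y21, 2017 ZJ 752)
Supported by the Hebei Province Internet of Things Monitoring Engineering Technology Research Center project (314201620).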
Keywords
discrete data filling
weighted support
correlation coefficient weighting
missing value filling
frequent itemset mining
filling rule selection