Abstract
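Existing imputation algorithms for incomplete data fill every missing value in the same way, without considering how important the data is, so they are inefficient and respond poorly in real time. This paper therefore proposes an imputation algorithm for incomplete data based on attribute importance. An attribute reduct is derived from the difference (discernibility) matrix, and the reduct is used to separate important attributes from unimportant ones. Missing values of important attributes are filled with an improved Mahalanobis distance method, while missing values of unimportant attributes are filled with a similarity-probability method. This preserves data accuracy while improving real-time performance, which makes the algorithm practical. Finally, the experiments analyze the algorithm's performance on data from a digital home system and on UCI standard datasets, and verify its advantages.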
Existing data filling algorithms for incomplete data apply the same method to all missing values and do not consider the importance of each value, which makes them inefficient and poorly suited to real-time use. This paper therefore proposes a new data filling algorithm based on distinguishing the importance of attributes. It uses attribute reduction to distinguish important attributes from unimportant ones, then imputes the missing values of important attributes with an improved Mahalanobis-based algorithm and the missing values of unimportant attributes with a similarity-probabilistic method. This ensures the accuracy of the data while keeping the algorithm real-time and practical. Finally, experiments on the Digitalhome system and the UCI standard datasets analyze the algorithm's performance and verify its superiority.
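The abstract does not give the details of the reduction step, so the following is only a minimal sketch of the general idea of splitting attributes by importance with a discernibility matrix. It makes several assumptions that are not from the paper: missing values are marked with None and treated as compatible with any value, the core (attributes that occur as singleton entries of the matrix) stands in for the full reduct, and the names discernibility_entries, core_attributes, and split_by_importance are hypothetical.

```python
from itertools import combinations

MISSING = None  # assumed marker for a missing value (not from the paper)


def discernibility_entries(rows, decisions):
    """Collect the non-empty discernibility sets of a decision table.

    For every pair of objects with different decisions, the entry is the
    set of condition-attribute indices on which the two objects differ.
    A missing value is assumed to match anything, so it never discerns.
    """
    entries = []
    for i, j in combinations(range(len(rows)), 2):
        if decisions[i] == decisions[j]:
            continue
        differing = frozenset(
            a
            for a, (x, y) in enumerate(zip(rows[i], rows[j]))
            if x is not MISSING and y is not MISSING and x != y
        )
        if differing:
            entries.append(differing)
    return entries


def core_attributes(entries):
    """Attributes that appear alone in some entry (the core)."""
    return {next(iter(e)) for e in entries if len(e) == 1}


def split_by_importance(rows, decisions):
    """Return (important, unimportant) attribute index sets.

    Simplification: the core is used in place of a full reduct.
    """
    important = core_attributes(discernibility_entries(rows, decisions))
    unimportant = set(range(len(rows[0]))) - important
    return important, unimportant


# Toy decision table: three condition attributes, one missing value.
rows = [
    (1, 0, 1),
    (1, 1, 1),
    (0, 0, MISSING),
    (0, 1, 0),
]
decisions = ["a", "b", "a", "b"]

important, unimportant = split_by_importance(rows, decisions)
```

In a pipeline like the one the abstract describes, the two index sets would then be routed to different imputers: the more accurate, distance-based one for the important attributes and the cheaper, probability-based one for the rest.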
Source
《微电子学与计算机》
CSCD
PKU Core (北大核心)
2013, No. 7, pp. 167-172, 176 (7 pages)
Microelectronics & Computer
Funding
Science and Technology Plan Project of the Dalian Science and Technology Bureau (2011A17GX076)
Keywords
incomplete system (不完备系统)
data filling (数据填充)
Mahalanobis distance (马氏距离)
attribute reduction (属性约简)