Abstract
This paper proposes a missing-value imputation method based on support vector machines. The method treats two cases separately: missing values in continuous attributes and missing values in categorical attributes. For continuous attributes, support vector machine regression is used to predict the missing values; for categorical attributes, support vector machine classification is used. Comparative experiments on several high-dimensional UCI data sets and the MINIT handwritten Arabic numerals data set show that the proposed algorithm achieves a higher recovery rate than the conventional mean imputation method and the decision tree regression-based imputation method.
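The two cases described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes scikit-learn's `SVR` and `SVC` with an RBF kernel and default hyperparameters, assumes every attribute other than the one being filled is fully observed, and uses a hypothetical helper `svm_impute` together with synthetic data invented for the example.

```python
import numpy as np
from sklearn.svm import SVC, SVR


def svm_impute(X, col, categorical=False):
    """Fill NaNs in column `col` with an SVM trained on the rows where it is observed.

    Hypothetical helper: assumes all other columns contain no missing values.
    """
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X[:, col])
    filled = X.copy()
    if not missing.any():
        return filled
    features = np.delete(X, col, axis=1)  # remaining attributes act as predictors
    target = X[~missing, col]
    if categorical:
        model = SVC(kernel="rbf")    # categorical attribute: classification
        target = target.astype(int)  # labels assumed to be integer-coded
    else:
        model = SVR(kernel="rbf")    # continuous attribute: regression
    model.fit(features[~missing], target)
    filled[missing, col] = model.predict(features[missing])
    return filled


# Synthetic data (assumption): column 2 is continuous, column 3 is a binary label.
rng = np.random.default_rng(0)
a = rng.uniform(-1, 1, 200)
b = rng.uniform(-1, 1, 200)
data = np.column_stack([a, b, a + b, (a > 0).astype(float)])

# Continuous case: hide 20 values of column 2 and fill them by regression.
with_gap_c = data.copy()
with_gap_c[:20, 2] = np.nan
filled_c = svm_impute(with_gap_c, col=2)

# Categorical case: hide 20 labels of column 3 and fill them by classification.
with_gap_d = data.copy()
with_gap_d[:20, 3] = np.nan
filled_d = svm_impute(with_gap_d, col=3, categorical=True)

# Fraction of hidden labels recovered exactly (one possible reading of "recovery rate").
recovery_rate = np.mean(filled_d[:20, 3] == data[:20, 3])
```

The definition of `recovery_rate` here is an assumption made for illustration; the abstract does not state how the recovery rate is computed.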
Source
Computer Applications and Software (《计算机应用与软件》)
CSCD
北大核心 (Peking University Core Journals)
2013, No. 5, pp. 226-228 (3 pages)
Keywords
Missing values
Support vector machine
Regression
Classification