Abstract
When data contain noise or labeling errors, traditional attribute selection (attribute reduction) methods such as rough sets fail to produce accurate results. This paper therefore proposes an attribute selection method for noisy and mislabeled data. First, the best projection of the data is obtained with the maximum margin projection (MMP) method. Then l2,1-norm regularization is applied to the projection matrix to make it row-sparse, which identifies the key attributes. Finally, the convergence of the method and its validity on data with labeling errors are proved. Experimental results on UCI data sets show that the proposed algorithm overcomes the influence of noise and labeling errors and performs attribute selection effectively on such data.
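The row-sparsity step can be illustrated with a minimal sketch. It is not the paper's algorithm, which the abstract does not spell out: it only assumes a projection matrix whose rows correspond to attributes, computes its l2,1-norm (the sum of the Euclidean norms of the rows), and keeps the attributes whose rows have the largest norms. The function names and the matrix values are hypothetical.

```python
import math

def row_norms(W):
    """Euclidean (l2) norm of each row of W, given as a list of rows."""
    return [math.sqrt(sum(x * x for x in row)) for row in W]

def l21_norm(W):
    """l2,1-norm: the sum of the l2 norms of the rows of W."""
    return sum(row_norms(W))

def select_attributes(W, k):
    """Indices of the k rows with the largest l2 norm, in ascending order."""
    norms = row_norms(W)
    ranked = sorted(range(len(W)), key=lambda i: norms[i], reverse=True)
    return sorted(ranked[:k])

# Hypothetical projection matrix: 4 attributes projected to 2 dimensions.
# The values are made up for illustration and are not from the paper.
W = [
    [3.0, 4.0],  # attribute 0: row norm 5
    [0.0, 0.0],  # attribute 1: all-zero row, contributes nothing
    [1.0, 0.0],  # attribute 2: row norm 1
    [0.0, 0.0],  # attribute 3: all-zero row, contributes nothing
]

print(l21_norm(W))              # sum of the row norms
print(select_attributes(W, 2))  # indices of the two strongest rows
```

Penalizing the l2,1-norm pushes entire rows toward zero, so an attribute whose row vanishes plays no part in the projection and can be discarded.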
Source
Control and Decision (《控制与决策》)
EI
CSCD
Peking University Core Journal (北大核心)
2013, No. 10, pp. 1485-1490 (6 pages)
Funding
Natural Science Foundation of Anhui Province (Grant No. 1208085MF94)
Keywords
attribute selection (attribute reduction)
maximum margin projection
l2,1-norm
noisy data
labeling error