Abstract
Existing regression algorithms do not exploit the relationships between features and outputs, among the outputs, and among samples when handling multiple-output regression on high-dimensional data, and therefore tend to produce unstable models. To address this, the paper proposes a new multiple-output regression method called Low-rank Feature Selection for Multiple-output Regression (LFS_MR). The method imposes a low-rank constraint to build a low-rank regression model that captures the correlation structure among the output variables. On this low-rank regression model it applies an L2,p-norm to perform sample selection, which reduces the interference of noise and outliers. It also penalizes the regression coefficient matrix with an L2,p-norm regularization term to perform feature selection, which accounts for the relationship between features and outputs and mitigates the curse of dimensionality. Experimental results on real datasets show that the proposed method performs well for multiple-output regression analysis of high-dimensional data.
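The abstract names the ingredients of the model (a low-rank coefficient matrix, an L2,p-norm for sample selection, and an L2,p-norm regularizer for feature selection) but does not state the objective function. The following is a minimal sketch of how such an objective could be evaluated, not the paper's formulation. It assumes the coefficient matrix is factored as W = A B with inner dimension r (so its rank is at most r), and that the L2,p-norm is applied to the rows of the residual Y - X W and to the rows of W. The names `l2p_pow`, `lfs_objective`, `lam`, `A`, and `B` are hypothetical.

```python
import numpy as np


def l2p_pow(M, p):
    """Sum over rows of (row Euclidean norm) ** p.

    This is the L2,p-norm raised to the power p, a form commonly used
    in objectives of this kind (assumed here, not taken from the paper).
    """
    row_norms = np.linalg.norm(M, axis=1)
    return float(np.sum(row_norms ** p))


def lfs_objective(X, Y, A, B, lam, p):
    """Evaluate an assumed low-rank, L2,p-regularized regression objective.

    X: (n_samples, n_features) inputs
    Y: (n_samples, n_outputs) targets
    A: (n_features, r), B: (r, n_outputs); W = A @ B has rank <= r
    lam: regularization weight (hypothetical name)
    """
    W = A @ B
    residual = Y - X @ W
    # Rows of the residual correspond to samples; rows of W to features.
    return l2p_pow(residual, p) + lam * l2p_pow(W, p)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(50, 20))
    A = rng.normal(size=(20, 2))
    B = rng.normal(size=(2, 5))
    Y = X @ A @ B + 0.01 * rng.normal(size=(50, 5))
    print(lfs_objective(X, Y, A, B, lam=0.1, p=1.0))
```

With p below 2, a sample whose residual row is large contributes less than it would under a squared loss, which is one way the norm can limit the influence of outliers. The row-wise penalty on W encourages entire rows to shrink toward zero, and a feature whose row is zero has no effect on any output.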
Authors
YANG Lifeng (杨利锋)
LIN Dahua (林大华)
DENG Zhenyun (邓振云)
LI Yonggang (李永钢)
Affiliations: Guangxi Key Lab of Multi-source Information Mining & Security, Guangxi Normal University, Guilin, Guangxi 541004, China; Guangxi Collaborative Innovation Center of Multi-source Information Integration and Intelligent Processing, Guilin, Guangxi 541004, China; Guangxi Center for Educational Technology, Nanning 530021, China
Source
Computer Engineering and Applications (《计算机工程与应用》)
CSCD
Peking University Core Journals (北大核心)
2017, No. 20, pp. 116-121 (6 pages)
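Funding
National Natural Science Foundation of China (Nos. 61450001, 61263035, 61573270)
National High Technology Research and Development Program of China (863 Program) (No. 2012AA011005)
National Basic Research Program of China (973 Program) (No. 2013CB329404)
China Postdoctoral Science Foundation (No. 2015M570837)
Guangxi Natural Science Foundation (Nos. 2012GXNSFGA060004, 2014jjAA70175, 2015GXNSFAA139306, 2015GXNSFCB139011)
Guangxi Bagui Innovation Team
Guangxi 100 Talents Program and Key Project of Science and Technology Research of Guangxi Higher Education Institutions (No. 2013ZD04)
Innovation Project of Guangxi Graduate Education (Nos. YCSZ2016046, YCSZ2016045)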
Keywords
multiple-output regression
low-rank regression
regression coefficient matrix
feature selection