Abstract
In big-data classification, existing online learning algorithms typically use L_1-norm regularization to make the prediction model sparser, but a single regularization constraint cannot obtain a sparse model efficiently. This paper therefore proposes an online learning algorithm with dual sparse mechanisms (DSOL). First, DSOL constrains the objective function with an L_(1/2) regularization term, which enhances the sparsity of the prediction model and improves the algorithm's generalization ability. Second, it applies an improved truncated gradient method to select data features, which further sparsifies the prediction model. By integrating L_(1/2) regularization with the improved truncated gradient strategy, DSOL makes effective use of historical data and improves classification performance. Experiments comparing DSOL with four other representative sparse online learning algorithms on nine public data sets show that DSOL achieves higher classification accuracy.
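The abstract does not give DSOL's update rules, so the following is only a rough sketch of how an L_(1/2) penalty and a truncated-gradient step might be combined in one online update. It is not the authors' algorithm. The hinge loss, the smoothed penalty gradient, the standard (unimproved) truncation operator, and all names and parameter values (`online_step`, `eta`, `lam`, `gravity`, `theta`) are assumptions made for illustration.

```python
import math

def truncate(v, alpha, theta):
    """Standard truncated-gradient operator: weights with |v| <= theta are
    pulled toward zero by alpha and clipped at zero; larger weights pass through."""
    if 0.0 <= v <= theta:
        return max(0.0, v - alpha)
    if -theta <= v < 0.0:
        return min(0.0, v + alpha)
    return v

def half_penalty_grad(v, eps=1e-8):
    """Smoothed gradient of |v|^(1/2). Assumption: returns 0 at v == 0
    so that weights already at zero are not pushed away from it."""
    if v == 0.0:
        return 0.0
    return 0.5 * math.copysign(1.0, v) / math.sqrt(abs(v) + eps)

def online_step(w, x, y, eta=0.1, lam=0.01, gravity=0.05, theta=1.0):
    """One online round for a label y in {-1, +1} (hypothetical helper):
    hinge-loss subgradient plus L_(1/2) penalty gradient, then truncation."""
    margin = y * sum(wi * xi for wi, xi in zip(w, x))
    new_w = []
    for wi, xi in zip(w, x):
        loss_grad = -y * xi if margin < 1.0 else 0.0
        grad = loss_grad + lam * half_penalty_grad(wi)
        new_w.append(truncate(wi - eta * grad, eta * gravity, theta))
    return new_w

# Usage: the feature that never fires keeps a weight of exactly zero.
w = [0.0, 0.0]
for x, y in [([1.0, 0.0], 1), ([1.0, 0.0], 1), ([0.5, 0.0], 1)]:
    w = online_step(w, x, y)
```

The penalty gradient in this sketch grows large for weights close to zero, so a practical implementation would likely replace it with a thresholding operator or a more careful step-size schedule.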
Authors
魏波
吴瑞峰
张文生
吕敬钦
王莹莹
夏学文
WEI Bo; WU Rui-feng; ZHANG Wen-sheng; LU Jing-qin; WANG Ying-ying; XIA Xue-wen (School of Software, East China Jiaotong University, Nanchang, Jiangxi 330013, China; Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China; School of Humanities and Social Sciences, East China Jiaotong University, Nanchang, Jiangxi 330013, China)
Source
《电子学报》
EI
CAS
CSCD
PKU Core (Peking University Core Journals)
2019, No. 10, pp. 2202-2210 (9 pages)
Acta Electronica Sinica
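Funding
National Natural Science Foundation of China (No. 61806204, No. 61463017, No. 61663009)
Key Program of the National Natural Science Foundation of China (No. U61432008)
Jiangxi Provincial Higher Education Teaching Reform Project (No. JXJG-18-5-19)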