A chunk increment partial least square algorithm (一种块增量偏最小二乘算法)

Cited by: 3
Abstract: Incremental learning is an effective and efficient technique for mining large-scale data. Incremental partial least square (IPLS) is an improvement of partial least square (PLS) based on incremental learning and offers competitive dimension reduction performance. However, IPLS must update the model once for every newly arrived training sample, so samples are learned one by one and training becomes slow. To overcome this problem, we propose an extension of IPLS called chunk incremental partial least square (CIPLS), which partitions the training data into chunks and updates the model once per chunk, greatly reducing the update frequency and improving learning efficiency. Comparative experiments on the K8 release of the p53 cancer rescue mutants data set and the Reuters-21578 text classification corpus show that CIPLS is much more efficient than IPLS without sacrificing dimension reduction performance.
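The record describes CIPLS only at the level of "update once per chunk instead of once per sample", so the sketch below illustrates that chunk-wise update idea rather than the authors' exact CIPLS equations: the model accumulates running cross-product statistics chunk by chunk and extracts PLS projection directions from them on demand, using a kernel-style PLS recursion (Dayal and MacGregor) that needs only X^T X and X^T Y. The class name ChunkIncrementalPLS and its methods are illustrative assumptions, not the paper's interface; only numpy is required.

import numpy as np

class ChunkIncrementalPLS:
    """Sketch of chunk-wise incremental PLS via running cross-product statistics."""

    def __init__(self, n_components):
        self.n_components = n_components
        self.n = 0          # number of samples seen so far
        self.sum_x = None   # running column sums of X
        self.sum_y = None   # running column sums of Y
        self.xtx = None     # running (uncentered) X^T X
        self.xty = None     # running (uncentered) X^T Y

    def partial_fit(self, X_chunk, Y_chunk):
        # One model update per chunk, instead of one update per sample as in IPLS.
        X = np.atleast_2d(np.asarray(X_chunk, dtype=float))
        Y = np.asarray(Y_chunk, dtype=float).reshape(X.shape[0], -1)
        if self.n == 0:
            p, m = X.shape[1], Y.shape[1]
            self.sum_x, self.sum_y = np.zeros(p), np.zeros(m)
            self.xtx, self.xty = np.zeros((p, p)), np.zeros((p, m))
        self.n += X.shape[0]
        self.sum_x += X.sum(axis=0)
        self.sum_y += Y.sum(axis=0)
        self.xtx += X.T @ X
        self.xty += X.T @ Y
        return self

    def _extract_components(self):
        # Center the accumulated cross-products, then run a kernel-style PLS
        # recursion that uses only X^T X and X^T Y (no stored samples).
        mx, my = self.sum_x / self.n, self.sum_y / self.n
        XtX = self.xtx - self.n * np.outer(mx, mx)
        S = self.xty - self.n * np.outer(mx, my)      # centered X^T Y
        R, P = [], []
        for _ in range(self.n_components):
            if S.shape[1] == 1:
                w = S[:, 0].copy()                    # single-target case
            else:
                w = np.linalg.svd(S, full_matrices=False)[0][:, 0]
            w /= np.linalg.norm(w)
            r = w.copy()
            for p_j, r_j in zip(P, R):                # so scores come from raw X: T = X R
                r -= (p_j @ w) * r_j
            tt = r @ XtX @ r                          # scalar t^T t for this component
            p_a = (XtX @ r) / tt
            q_a = (S.T @ r) / tt
            S = S - tt * np.outer(p_a, q_a)           # deflate the cross-product matrix
            R.append(r)
            P.append(p_a)
        self.x_mean_ = mx
        self.x_weights_ = np.column_stack(R)

    def transform(self, X):
        # Project centered samples onto the learned components (dimension reduction).
        self._extract_components()
        return (np.atleast_2d(X) - self.x_mean_) @ self.x_weights_

Usage sketch (synthetic data, chunks of 50 samples, then projection of new samples):

rng = np.random.default_rng(0)
beta = rng.normal(size=5)
model = ChunkIncrementalPLS(n_components=5)
for _ in range(20):
    Xc = rng.normal(size=(50, 100))
    yc = Xc[:, :5] @ beta + 0.1 * rng.normal(size=50)
    model.partial_fit(Xc, yc)
Z = model.transform(rng.normal(size=(10, 100)))       # reduced features, shape (10, 5)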
Authors: ZENG Xue-qiang (曾雪强), YE Zhen-lin (叶震麟), ZUO Jia-li (左家莉), WAN Zhong-ying (万中英), WU Shui-xiu (吴水秀) (Information Engineering School, Nanchang University, Nanchang 330031, Jiangxi, China; School of Computer & Information Engineering, Jiangxi Normal University, Nanchang 330022, Jiangxi, China)
Source: Journal of Shandong University (Natural Science) (《山东大学学报(理学版)》), 2019, No. 3, pp. 93-101 (9 pages). Indexed in CAS, CSCD, and the Peking University Chinese Core Journals list.
Funding: National Natural Science Foundation of China (61463033, 61866017); Jiangxi Provincial Outstanding Young Talents Program (20171BCB23013); Science and Technology Research Project of the Jiangxi Provincial Department of Education (GJJ150354).
Keywords: incremental learning; partial least square; data chunk; dimension reduction
相关作者

内容加载中请稍等...

相关机构

内容加载中请稍等...

相关主题

内容加载中请稍等...

浏览历史

内容加载中请稍等...
;
使用帮助 返回顶部