Application Research of Parallel SVM Algorithm on Flink Platform (cited by: 4)
Abstract: In the era of big data, data volumes grow exponentially and the traditional support vector machine (SVM) no longer scales to such workloads, so the algorithm must be adapted to run on big data computing frameworks. To address the large memory footprint and slow optimization of SVM training, a parallel SVM algorithm based on the Flink platform is proposed. First, the method adopts the merge strategy and training structure of the cascade support vector machine (Cascade SVM) and implements them on the Flink distributed computing framework. Second, it introduces distributed broadcast variables to improve the performance of the parallel operators and further optimize the algorithm, which addresses the low training efficiency of single-machine SVM. Experimental results show that parallelizing the SVM algorithm on Flink effectively reduces training time and improves model training efficiency.
Authors: BAI Yu-xin (白玉辛); LIU Xiao-yan (刘晓燕) (College of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China)
Source: Journal of Chinese Computer Systems (《小型微型计算机系统》), indexed in CSCD and the Peking University Core Journals list, 2021, No. 5, pp. 1003-1007 (5 pages)
Funding: Supported by the National Natural Science Foundation of China (61462055).
Keywords: parallel computing; support vector machine; large-scale dataset processing; Flink
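
The abstract describes the Cascade SVM merge strategy only at a high level. As a rough illustration of that idea, and not the authors' Flink implementation, the following single-machine Python sketch trains an SVM on each data partition, keeps only the support vectors, and merges the surviving sets pairwise until one model remains. Several things here are assumptions made for the example: the solver is a simple linear dual coordinate descent used as a stand-in, the function names (`train_svm`, `train_and_filter`, `cascade_svm`, `predict`, `make_toy_data`) are hypothetical, the toy data is synthetic, and the cascade runs a single pass with no feedback loop. Partitions are processed sequentially here, whereas on Flink each would be a parallel task; broadcast variables are not modeled.

```python
import random


def train_svm(points, labels, C=1.0, epochs=50, seed=0):
    """Linear soft-margin SVM via dual coordinate descent (stand-in solver).

    Returns the weight vector (bias stored as the last component) and the
    dual variables; samples with alpha > 0 are the support vectors.
    """
    rng = random.Random(seed)
    X = [list(p) + [1.0] for p in points]  # constant feature acts as bias
    w = [0.0] * len(X[0])
    alpha = [0.0] * len(X)
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            xi, yi = X[i], labels[i]
            grad = yi * sum(wj * v for wj, v in zip(w, xi)) - 1.0
            q_ii = sum(v * v for v in xi)
            # Projected Newton step on one dual variable, clipped to [0, C].
            new_alpha = min(max(alpha[i] - grad / q_ii, 0.0), C)
            step = (new_alpha - alpha[i]) * yi
            alpha[i] = new_alpha
            w = [wj + step * v for wj, v in zip(w, xi)]
    return w, alpha


def train_and_filter(points, labels, C=1.0):
    """Train on one subset and keep only its support vectors."""
    w, alpha = train_svm(points, labels, C)
    keep = [i for i, a in enumerate(alpha) if a > 1e-8]
    return w, [points[i] for i in keep], [labels[i] for i in keep]


def cascade_svm(points, labels, n_partitions=4, C=1.0):
    """Single-pass cascade: train per partition, merge SV sets pairwise."""
    # First layer: each partition would be an independent parallel task.
    layer = [
        train_and_filter(points[k::n_partitions], labels[k::n_partitions], C)
        for k in range(n_partitions)
    ]
    while len(layer) > 1:
        merged = []
        for k in range(0, len(layer), 2):
            group = layer[k:k + 2]
            pts = [p for _, sv, _ in group for p in sv]
            lbs = [l for _, _, sl in group for l in sl]
            merged.append(train_and_filter(pts, lbs, C))
        layer = merged
    return layer[0]  # (weights, support vectors, their labels)


def predict(w, point):
    """Classify one point with the learned weights (bias is w[-1])."""
    score = sum(wj * v for wj, v in zip(w, list(point) + [1.0]))
    return 1 if score >= 0 else -1


def make_toy_data(n=200, seed=1):
    """Two well-separated 2-D clusters (synthetic, for illustration only)."""
    rng = random.Random(seed)
    points, labels = [], []
    for _ in range(n):
        y = rng.choice((1, -1))
        points.append((2 * y + rng.uniform(-1, 1), 2 * y + rng.uniform(-1, 1)))
        labels.append(y)
    return points, labels
```

Calling `cascade_svm(*make_toy_data())` returns the final weights together with the support vectors that survived the last merge. The point of the structure is that each layer trains only on a partition or on merged support-vector sets, never on the full data at once, which is what makes the first layer easy to distribute.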
