Abstract
To address the low computational efficiency of the classical support vector regression (SVR) model on large-scale data, this paper combines the communication-efficient method with SVR and proposes a distributed support vector regression model based on the communication-efficient method (CE-SVR). The model first uses distributed storage to randomly assign the large-scale data to multiple machines. It then uses the communication-efficient method to construct an approximate loss function for SVR that replaces the global loss function and yields an approximate prediction, which allows large-scale data to be analyzed effectively. Numerical simulations and an application study show that, in the linear model, the prediction performance of CE-SVR is essentially the same as that of the global SVR model and significantly better than that of the distributed SVR model based on the one-shot method (OS-SVR); in the nonlinear model, the prediction performance of CE-SVR decreases as the number of machines increases, but it remains significantly better than that of OS-SVR.
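The two steps described in the abstract (random partition across machines, then a surrogate loss built on one machine) can be illustrated with a minimal sketch. The abstract does not give the loss, the kernel, or the optimizer, so everything below is an assumption for illustration only: a linear model, a squared epsilon-insensitive loss with a small ridge penalty (chosen so that plain gradient descent applies), and the common gradient-shift form of a communication-efficient surrogate, L~(w) = L_1(w) - <grad L_1(w_bar) - grad L_N(w_bar), w>. The function names `grad`, `fit_local`, and `ce_svr` are hypothetical and are not taken from the paper.

```python
import numpy as np

def grad(w, X, y, eps=0.1, lam=1e-3):
    """Gradient of the mean squared eps-insensitive loss plus a ridge term (assumed loss)."""
    r = X @ w - y
    excess = np.sign(r) * np.maximum(np.abs(r) - eps, 0.0)  # zero inside the eps-tube
    return X.T @ excess / len(y) + lam * w

def fit_local(X, y, w0, shift, steps=500, lr=0.1):
    """Gradient descent on the local loss minus the linear term <shift, w>."""
    w = w0.copy()
    for _ in range(steps):
        w -= lr * (grad(w, X, y) - shift)
    return w

def ce_svr(X, y, n_machines=5, rounds=3, seed=0):
    """Sketch of a communication-efficient distributed linear SVR (illustrative only)."""
    rng = np.random.default_rng(seed)
    # Step 1: randomly assign the rows to the machines.
    parts = np.array_split(rng.permutation(len(y)), n_machines)
    X1, y1 = X[parts[0]], y[parts[0]]
    # Initial estimate uses only the data held by the first machine.
    w = fit_local(X1, y1, np.zeros(X.shape[1]), np.zeros(X.shape[1]))
    for _ in range(rounds):
        # Each machine sends one gradient vector; the size-weighted mean is the global gradient.
        global_grad = sum(len(p) * grad(w, X[p], y[p]) for p in parts) / len(y)
        # Step 2: minimize the surrogate loss on the first machine only.
        shift = grad(w, X1, y1) - global_grad
        w = fit_local(X1, y1, w, shift)
    return w
```

In this sketch each round communicates only one gradient vector per machine, which is the sense in which the approach is communication-efficient; a one-shot alternative would instead average separately fitted local estimates.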
Authors
CAI Chao (蔡超)
RAN Xiaoting (冉晓婷)
XUE Wei (薛伟)
TIAN Yuxin (田育鑫)
School of Statistics, Shandong Technology and Business University, Yantai 264005
Source
Journal of Systems Science and Mathematical Sciences (《系统科学与数学》)
CSCD
PKU Core (北大核心)
2023, No. 4, pp. 1081-1092 (12 pages)
Funding
Supported by the Shandong Provincial Social Science Planning Project (19BYSJ40).
Keywords
Big data
Distributed computing
Communication-efficient
Support vector regression