Abstract
Logistic regression and the linear support vector machine (SVM) are effective methods for large-scale classification, but their distributed implementation has not yet been well studied. In recent years, because existing distributed computing frameworks are inefficient for iterative algorithms, Spark, an in-memory cluster computing platform, has been proposed and is becoming a common framework for large-scale data processing and analysis. In this paper, a new quasi-Newton equation is used to solve the logistic regression and linear SVM problems, and the method is implemented in the Spark framework. Experiments show that the method significantly improves the accuracy and efficiency of large-scale classification.
Source
Journal of Tianjin University of Technology (《天津理工大学学报》)
2017, No. 5, pp. 19-23 (5 pages)
Keywords
logistic regression
linear support vector machine
Spark framework
new quasi-Newton equation