Abstract
The least squares (LS) classifier is a basic but effective classifier, especially for solving large-scale data classification problems. The LS method needs to invert a matrix whose size is determined by the data dimensionality, which makes it inherently inefficient for high-dimensional data. In this paper, we propose a parallel nonlinear version of LS (PNLS). Based on random partitioning of the data dimensions, PNLS computes local model parameters in parallel and, through iterative optimization, forms the final global solution. PNLS enjoys three properties: 1) it is a locally linear but globally nonlinear method; 2) it avoids inverting large matrices, which makes it suitable for high-dimensional data; and 3) it computes model parameters in parallel, which improves learning efficiency. In addition, a theoretical analysis proves that the iterative PNLS method converges. We further propose a random PNLS that randomly repartitions the data dimensions in each iteration to optimize the performance of PNLS. Experimental results on text and image data demonstrate that the proposed methods achieve better prediction accuracy and runtime efficiency than LS.
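The abstract describes PNLS only at a high level: randomly partition the feature dimensions, solve small local least-squares problems in parallel, and iterate toward a global solution. The sketch below illustrates one such block-wise scheme in Python; the function names, the joblib-based parallelism, and the damped Jacobi-style update are assumptions made for illustration, not the paper's actual algorithm.

# Illustrative sketch only: the exact PNLS update rule is not given in the abstract.
import numpy as np
from joblib import Parallel, delayed

def solve_block(X_block, residual, reg=1e-6):
    # Normal equations for one block: only a small (d_b x d_b) matrix is solved,
    # never the full (d x d) system.
    d_b = X_block.shape[1]
    A = X_block.T @ X_block + reg * np.eye(d_b)
    return np.linalg.solve(A, X_block.T @ residual)

def parallel_block_ls(X, y, n_blocks=4, n_iter=10, seed=0):
    n_samples, d = X.shape
    rng = np.random.default_rng(seed)
    w = np.zeros(d)
    for _ in range(n_iter):
        # Randomly partition the feature dimensions into blocks for this pass.
        blocks = np.array_split(rng.permutation(d), n_blocks)
        residual = y - X @ w
        # Local least-squares parameters for each block computed in parallel.
        deltas = Parallel(n_jobs=-1)(
            delayed(solve_block)(X[:, b], residual) for b in blocks
        )
        for b, delta in zip(blocks, deltas):
            # Damped (averaged) update keeps simultaneous block updates stable.
            w[b] += delta / n_blocks
    return w

# Hypothetical usage on synthetic data:
# X = np.random.randn(200, 1000); y = X @ np.random.randn(1000)
# w = parallel_block_ls(X, y, n_blocks=10, n_iter=20)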
Source
《小型微型计算机系统》
CSCD
Peking University Core Journal (北大核心)
2014, No. 3, pp. 579-583 (5 pages)
Journal of Chinese Computer Systems
Funding
Supported by the National Natural Science Foundation of China (61250007, U1204610)
Supported by the National High Technology Research and Development Program of China (863 Program) key project (2009AA012201)
Supported by the China Postdoctoral Science Foundation (2011M501189)
Keywords
least squares
parallel
high dimensionality
classification