Abstract
The random forest algorithm is an effective, high-accuracy machine learning method for classifying and predicting stock returns, but its parameters are hard to tune and feature selection is difficult. To address this, the paper builds on the traditional random forest by combining Particle Swarm Optimization (PSO) for feature selection with Grid Search (GRID) for parameter tuning, and proposes a new algorithm: the particle swarm and parameter grid search random forest. PSO selects features from the input data and removes redundant ones, which reduces the input dimension; grid search then optimizes some of the random forest parameters. Together these steps reduce the computational complexity of the random forest and improve its classification accuracy. Simulations use historical stock data from the CSI 300 (沪深300) and CSI 500 (中证500). Compared with the plain random forest and the grid-search-optimized random forest, the new algorithm achieves significantly higher classification accuracy, reaching a prediction accuracy of 86.3% on the CSI 300 and 87.6% on the CSI 500.
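The two stages described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes a binary PSO with a sigmoid transfer function, and the function names (`pso_select_features`, `grid_search`), the constants `w`, `c1`, `c2`, and the pluggable `score` callables are all hypothetical. In the paper's setting, `score` would presumably be the validation accuracy of a random forest trained on the selected features.

```python
import itertools
import math
import random


def pso_select_features(n_features, score, n_particles=10, n_iters=20, seed=0):
    """Binary PSO: each particle is a 0/1 mask over the input features."""
    rng = random.Random(seed)
    w, c1, c2 = 0.7, 1.5, 1.5  # inertia and acceleration weights (assumed values)

    def random_mask():
        mask = [rng.randint(0, 1) for _ in range(n_features)]
        if not any(mask):  # never evaluate an empty feature set
            mask[rng.randrange(n_features)] = 1
        return mask

    positions = [random_mask() for _ in range(n_particles)]
    velocities = [[0.0] * n_features for _ in range(n_particles)]
    best_pos = [p[:] for p in positions]
    best_val = [score(p) for p in positions]
    g = max(range(n_particles), key=lambda i: best_val[i])
    gbest, gbest_val = best_pos[g][:], best_val[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for j in range(n_features):
                v = (w * velocities[i][j]
                     + c1 * rng.random() * (best_pos[i][j] - positions[i][j])
                     + c2 * rng.random() * (gbest[j] - positions[i][j]))
                v = max(-6.0, min(6.0, v))  # clamp so the sigmoid does not saturate
                velocities[i][j] = v
                # Sigmoid transfer: velocity becomes the probability of keeping feature j
                keep_prob = 1.0 / (1.0 + math.exp(-v))
                positions[i][j] = 1 if rng.random() < keep_prob else 0
            if not any(positions[i]):
                positions[i][rng.randrange(n_features)] = 1
            val = score(positions[i])
            if val > best_val[i]:
                best_pos[i], best_val[i] = positions[i][:], val
                if val > gbest_val:
                    gbest, gbest_val = positions[i][:], val
    return gbest, gbest_val


def grid_search(param_grid, score):
    """Evaluate every parameter combination and return the best one."""
    names = sorted(param_grid)
    best_params, best_val = None, float("-inf")
    for values in itertools.product(*(param_grid[n] for n in names)):
        params = dict(zip(names, values))
        val = score(params)
        if val > best_val:
            best_params, best_val = params, val
    return best_params, best_val
```

A caller would first run `pso_select_features` to obtain a feature mask, then run `grid_search` over a grid such as `{"n_estimators": [...], "max_features": [...]}` with a `score` that trains and validates a random forest on the masked features. Those hyperparameter names are assumptions borrowed from common library conventions; the abstract does not say which parameters were tuned.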
Authors
方昕
陈玲玲
曹海燕
FANG Xin; CHEN Lingling; CAO Haiyan (School of Communication Engineering, Hangzhou Dianzi University, Hangzhou, Zhejiang 310018, China)
Source
《杭州电子科技大学学报(自然科学版)》
2020, No. 1, pp. 35-40 (6 pages)
Journal of Hangzhou Dianzi University:Natural Sciences
Funding
Supported by the Young Scientists Fund of the National Natural Science Foundation of China (61501158).
Keywords
particle swarm
random forest
stock return
feature selection