Abstract
This paper uses data mining techniques to study the direction of change in stock returns. Wavelet multi-scale decomposition transforms stock prices into sub-series in different frequency bands, and the high-frequency sub-series are denoised with a threshold filter. Extreme gradient boosting trees (XGBoost) and other mainstream machine learning algorithms are then fitted to predict the rise and fall of the constituent stocks of the CSI 300 and CSI 500 indices. XGBoost achieves average accuracies of 54.69% and 55.13%, respectively, and an investment strategy built on the prediction signals yields stable returns, indicating that the method has fairly strong predictive power. On this basis, the paper addresses the "black box" problem of machine learning algorithms and examines the model's stock-selection logic. It proposes a measure of factor weights and finds that the price-to-book ratio (PB), price-to-earnings ratio (PE), and on-balance volume (OBV) are relatively important discriminating indicators. It then uses partial dependence to measure each factor's marginal effect on the direction of stock price movement, concluding, among other things, that the model tends to select stocks with low PE and PB, which makes the algorithm's logic more transparent.
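The wavelet step described in the abstract (decompose prices into frequency bands, then threshold-filter the high-frequency part) can be illustrated with a minimal sketch. The abstract does not state which wavelet family, decomposition depth, or thresholding rule the authors used, so the Haar wavelet, soft thresholding, and the names `haar_decompose`, `haar_reconstruct`, `soft_threshold`, and `denoise` below are assumptions chosen only for illustration, and the price series is made-up toy data.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_decompose(signal, levels):
    """Multi-level Haar wavelet decomposition (assumed wavelet, for illustration).

    Returns (approx, details), where details[0] is the finest
    (highest-frequency) band and details[-1] the coarsest.
    """
    approx = list(signal)
    details = []
    for _ in range(levels):
        if len(approx) < 2 or len(approx) % 2 != 0:
            raise ValueError("signal length must be divisible by 2**levels")
        pairs = list(zip(approx[0::2], approx[1::2]))
        # Differences of neighbours carry the high-frequency content.
        details.append([(a - b) / SQRT2 for a, b in pairs])
        # Sums of neighbours carry the low-frequency trend.
        approx = [(a + b) / SQRT2 for a, b in pairs]
    return approx, details


def haar_reconstruct(approx, details):
    """Invert haar_decompose, going from the coarsest band to the finest."""
    current = list(approx)
    for detail in reversed(details):
        rebuilt = []
        for a, d in zip(current, detail):
            rebuilt.append((a + d) / SQRT2)
            rebuilt.append((a - d) / SQRT2)
        current = rebuilt
    return current


def soft_threshold(coeffs, threshold):
    """Shrink each coefficient toward zero by `threshold` (soft thresholding)."""
    return [math.copysign(max(abs(c) - threshold, 0.0), c) for c in coeffs]


def denoise(signal, levels=2, threshold=0.5):
    """Decompose, threshold every detail band, and reconstruct.

    Thresholding all detail bands with one fixed threshold is an assumption;
    the paper may treat bands differently.
    """
    approx, details = haar_decompose(signal, levels)
    filtered = [soft_threshold(d, threshold) for d in details]
    return haar_reconstruct(approx, filtered)


# Toy closing prices (hypothetical data, length divisible by 2**levels).
prices = [10.0, 10.2, 10.1, 10.3, 11.0, 11.2, 11.1, 11.3]
smoothed = denoise(prices, levels=2, threshold=0.1)
```

With a threshold of zero the reconstruction returns the original series, and with a very large threshold only the low-frequency trend survives, so the threshold controls how much high-frequency variation is treated as noise. In a pipeline like the one the abstract describes, the resulting sub-series or the denoised series would then feed the classifier as features.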
Authors
苟小菊
王芊
GOU Xiao-ju; WANG Qian (School of Management, University of Science and Technology of China, Hefei 230026, China)
Source
《运筹与管理》
CSSCI
CSCD
PKU Core Journal (北大核心)
2021, No. 1, pp. 163-169 (7 pages)
Operations Research and Management Science
Funding
National Natural Science Foundation of China, Young Scientists Fund (71701191).
Keywords
data mining
direction of stock returns
extreme gradient boosting (XGBoost)
wavelet decomposition
partial dependence