Abstract
Financial data of listed companies are often high-dimensional and strongly correlated and exhibit group effects, so traditional regression models are no longer applicable, and building sparse models with penalty functions has become an active research topic. Starting from the linear regression model, this paper uses group variable selection and bi-level (double) variable selection methods to study covariate selection for high-dimensional data with group effects. A comparative simulation study finds that the cMCP method has a smaller prediction error and makes reasonable choices between redundant and important variables. An empirical analysis of earnings-per-share data from the manufacturing sector further shows that the variables selected by cMCP agree better with practice, and that the method effectively reduces multicollinearity among variables, making the model more interpretable.
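The abstract names the cMCP method without stating its penalty. As a rough illustration only, the sketch below evaluates a composite MCP penalty in the form usually given in the bi-level selection literature: an inner MCP applied to each coefficient, summed within a group, then passed through an outer MCP. The function names `mcp` and `cmcp_penalty`, the default `gamma=3.0`, and the outer tuning value `K * gamma * lam / 2` (with `K` the group size) are assumptions of this sketch and are not taken from the paper.

```python
import numpy as np


def mcp(theta, lam, gamma):
    """MCP penalty for theta >= 0 (hypothetical helper).

    Equals lam*theta - theta^2/(2*gamma) while theta <= gamma*lam,
    and the constant gamma*lam^2/2 beyond that point.
    """
    theta = np.asarray(theta, dtype=float)
    return np.where(
        theta <= gamma * lam,
        lam * theta - theta**2 / (2.0 * gamma),
        0.5 * gamma * lam**2,
    )


def cmcp_penalty(beta, groups, lam, gamma=3.0):
    """Composite MCP penalty value for a grouped coefficient vector.

    Assumed form: outer MCP applied to the within-group sum of inner MCP
    terms, with outer tuning parameter K * gamma * lam / 2 for a group of
    size K, so the outer penalty saturates exactly when every inner term
    has saturated.
    """
    if lam <= 0:
        raise ValueError("lam must be positive in this sketch")
    beta = np.asarray(beta, dtype=float)
    groups = np.asarray(groups)
    total = 0.0
    for g in np.unique(groups):
        abs_beta_g = np.abs(beta[groups == g])
        inner_sum = mcp(abs_beta_g, lam, gamma).sum()
        outer_gamma = abs_beta_g.size * gamma * lam / 2.0
        total += float(mcp(inner_sum, lam, outer_gamma))
    return total


if __name__ == "__main__":
    # Two groups of two coefficients each; the second group is entirely zero.
    beta = [0.5, -2.0, 0.0, 0.0]
    groups = [0, 0, 1, 1]
    print(cmcp_penalty(beta, groups, lam=1.0))
```

The sketch only evaluates the penalty; it does not fit a model. Fitting would additionally need an optimizer such as group-wise coordinate descent, or an existing implementation such as the R package grpreg, which offers a cMCP penalty option.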
Author
WU Jian-min (吴建民), School of Humanities and Social Sciences, Beijing Institute of Technology, Beijing 100081, China
Source
Mathematics in Practice and Theory (《数学的实践与认识》)
2023, No. 8, pp. 260-266 (7 pages)
Funding
Social Science Fund of the Ministry of Education (10YJC630279).
Keywords
high-dimensional data
group effects
group variable selection
double variable selection