Abstract
This paper proposes MI-SIS, a new variable selection method based on the mutual information between the explanatory variables and the response variable. The method handles ultrahigh-dimensional problems in which the number of explanatory variables p far exceeds the sample size n, that is, p = O(exp(n^ε)) with ε > 0. It is also model-free: it does not rely on model assumptions. Numerical simulations and an empirical study show that MI-SIS can effectively detect weak signals in small samples.
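As a rough illustration of marginal screening by mutual information, the sketch below scores each predictor by an estimate of its mutual information with the response and keeps the top d. It is not the paper's implementation: the keywords indicate that the paper uses non-parametric density estimation, whereas this sketch swaps in a simple plug-in estimator on equal-frequency bins. The function names (`quantile_bins`, `mutual_information`, `mi_screen`) and the parameters `n_bins` and `d` are hypothetical choices made for the example.

```python
import math
import random
from collections import Counter


def quantile_bins(values, n_bins):
    """Assign each value to one of n_bins equal-frequency bins, by rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    bins = [0] * len(values)
    for rank, i in enumerate(order):
        bins[i] = rank * n_bins // len(values)
    return bins


def mutual_information(x, y, n_bins=4):
    """Plug-in estimate of I(X; Y) in nats from a binned joint histogram.

    Assumption: a histogram estimator stands in for the paper's
    non-parametric density estimator, which this record does not specify.
    """
    n = len(x)
    bx, by = quantile_bins(x, n_bins), quantile_bins(y, n_bins)
    joint = Counter(zip(bx, by))
    px, py = Counter(bx), Counter(by)
    # Sum over observed cells of p(a, b) * log(p(a, b) / (p(a) * p(b))).
    return sum(
        (c / n) * math.log(c * n / (px[a] * py[b]))
        for (a, b), c in joint.items()
    )


def mi_screen(columns, y, d, n_bins=4):
    """Rank predictors by estimated MI with y; return the top-d indices and all scores."""
    scores = [mutual_information(col, y, n_bins) for col in columns]
    ranked = sorted(range(len(columns)), key=lambda j: scores[j], reverse=True)
    return ranked[:d], scores


if __name__ == "__main__":
    random.seed(0)
    n, p = 200, 30
    columns = [[random.gauss(0, 1) for _ in range(n)] for _ in range(p)]
    # Nonlinear, symmetric signal: nearly uncorrelated with columns[0] but strongly dependent.
    y = [columns[0][i] ** 2 + 0.1 * random.gauss(0, 1) for i in range(n)]
    selected, scores = mi_screen(columns, y, d=5)
    print("selected predictor indices:", selected)
```

The demo response depends on the first predictor only through its square, so a correlation-based ranking would tend to miss it while a dependence measure such as mutual information does not; this is the sense in which a screening rule of this kind avoids model assumptions.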
Authors
Zhou Shengbin (周生彬); Huang Yejin (黄叶金)
(School of Mathematical Sciences, Harbin Normal University, Harbin 150025, China; PBC School of Finance, Tsinghua University, Beijing 100083, China)
Source
Statistics & Decision (《统计与决策》)
CSSCI
PKU Core Journal (北大核心)
2020, No. 1, pp. 20-23 (4 pages)
Keywords
variable selection
mutual information
non-parametric density estimation
ultrahigh-dimensional data analysis