Abstract
This paper analyzes the problem of missing values in sample data and studies it using scientifically sound imputation methods under the missing-at-random (MAR) mechanism. Taking customers' preference for products in the market as the survey subject, it constructs a model for empirical analysis and, combining that model with multiple imputation, examines imputation performance at different missing rates. The results show that as the missing rate rises, the usability of the survey data declines and the performance of the imputation methods declines with it. Among the four imputation methods compared, EM imputation and multiple imputation outperform the other two, and combining the model with multiple imputation also performs well. An appropriate imputation method should therefore be chosen according to the missing rate.
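The kind of comparison the abstract describes (imputation error measured at several missing rates under MAR) can be sketched in a few lines. This is an illustration only, not the authors' procedure. The synthetic data, the helper names `simulate`, `mar_mask` and `compare`, and the two methods shown (mean imputation and single regression imputation) are all assumptions made for this sketch. The EM and multiple-imputation methods studied in the paper are not reproduced here.

```python
import numpy as np


def simulate(n=2000, seed=0):
    """Synthetic data (an assumption): y depends linearly on a fully observed x."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n)
    y = 2.0 * x + rng.normal(scale=0.5, size=n)
    return x, y


def mar_mask(x, rate, rng):
    """Mark a fraction `rate` of y as missing, with probability depending only on x (MAR)."""
    p = 1.0 / (1.0 + np.exp(-x))      # larger x -> more likely to be missing
    p = p / p.sum()                   # normalise to a probability vector
    k = int(round(rate * x.size))
    idx = rng.choice(x.size, size=k, replace=False, p=p)
    mask = np.zeros(x.size, dtype=bool)
    mask[idx] = True
    return mask


def rmse(a, b):
    return float(np.sqrt(np.mean((a - b) ** 2)))


def compare(rates=(0.1, 0.3, 0.5), seed=0):
    """Return {rate: (rmse_mean_imputation, rmse_regression_imputation)}."""
    x, y = simulate(seed=seed)
    rng = np.random.default_rng(seed + 1)
    results = {}
    for rate in rates:
        miss = mar_mask(x, rate, rng)
        obs = ~miss
        # Mean imputation: fill every missing y with the observed mean.
        mean_fill = np.full(miss.sum(), y[obs].mean())
        # Regression imputation: fit y ~ x on observed rows, predict the rest.
        slope, intercept = np.polyfit(x[obs], y[obs], 1)
        reg_fill = slope * x[miss] + intercept
        results[rate] = (rmse(mean_fill, y[miss]), rmse(reg_fill, y[miss]))
    return results


if __name__ == "__main__":
    for rate, (err_mean, err_reg) in compare().items():
        print(f"missing rate {rate:.0%}: mean RMSE {err_mean:.3f}, regression RMSE {err_reg:.3f}")
```

Because missingness here depends on the observed covariate, a method that uses that covariate can recover the masked values far better than a method that ignores it. How the error changes with the missing rate depends on the data and the method, so this sketch does not assert any particular trend.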
Authors
宋亮
万建洲
Song Liang; Wan Jianzhou (School of Mathematics and Statistics, Nanyang Institute of Technology, Nanyang, Henan 473000, China)
Source
《统计与决策》
CSSCI
PKU Core Journal
2020, No. 18, pp. 10-14 (5 pages)
Statistics & Decision
Funding
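National Natural Science Foundation of China (11901320)
Key Scientific Research Project of Higher Education Institutions, Education Department of Henan Province (19A110028)
Basic and Frontier Research Project, Science and Technology Department of Henan Province (162300410076)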
Keywords
sample survey
missing values
imputation
logistic regression analysis