Propensity Score Matching Imputation Based on Synthetic Minority Over-sampling Technique

Cited by: 4
Abstract: Non-response occurs frequently in big data applications. In practice, the non-response rate of real data is usually low. In that setting, using a propensity score model to match non-response units with response units tends to degrade the performance of propensity score matching imputation markedly. To address this, the idea of the synthetic minority over-sampling algorithm is incorporated into propensity score matching imputation, giving a propensity score matching imputation method based on minority-class over-sampling. Statistical simulation and an empirical study are used to demonstrate the statistical properties and practical performance of the new method under different non-response rates, imputation multiplicities, and error distributions. The simulation results show that the new method is clearly more accurate than propensity score matching imputation, and that its statistical properties are only slightly affected by the non-response rate, imputation multiplicity, and error distribution. The empirical results show that the new method applies well to real data. The method offers a new way of handling non-response and is readily extensible.
Authors: YANG Gui-jun (杨贵军); DU Fei (杜飞); SUN Ling-li (孙玲莉) (School of Statistics, Tianjin University of Finance and Economics, Tianjin 300222, China; CCESR, Tianjin University of Finance and Economics, Tianjin 300222, China)
Source: Journal of Statistics and Information (《统计与信息论坛》), CSSCI, Peking University Core Journal, 2021, No. 1, pp. 3-12 (10 pages)
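Funding: Key Program of the National Social Science Fund of China, "Research on Big-Data-Based Population Survey Methods and Applications" (20ATJ008); Youth Program of the National Social Science Fund of China, "Application of Rotating-Sample Calibration Estimation Methods in Chinese Household Surveys" (20CTJ009); Key Project of the Tianjin 2019 Philosophy and Social Science Plan, "Theory and Application of Multi-Objective Sampling Design in the Context of Big Data" (TJTJ19-001); General Program of the National Natural Science Foundation of China, "Design and Analysis of Drop-the-Losers Two-Stage Adaptive Clinical Trials" (11471239).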
Keywords: propensity score matching imputation; synthetic minority over-sampling technique algorithm; non-response rate; non-response mechanism
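
The abstract names the two ingredients of the method but does not give the algorithm, so the following is only a minimal sketch of how they might be combined, not the authors' procedure. It assumes that non-respondents are the minority class, that synthetic non-respondents are created by SMOTE-style interpolation in covariate space until the two classes are balanced, that the propensity score comes from a plain logistic regression, and that each non-respondent receives the value of the single respondent with the closest score. All function names (`smote`, `fit_logistic`, `propensity`, `smote_psm_impute`) and parameters are hypothetical.

```python
import math
import random


def smote(minority, n_new, k=3, rng=None):
    """Create n_new synthetic points, each lying on the segment between a
    random minority point and one of its k nearest minority neighbours."""
    rng = rng or random.Random(0)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        others = [p for p in minority if p is not x]
        neighbours = sorted(others, key=lambda p: math.dist(x, p))[:k]
        nb = rng.choice(neighbours)
        gap = rng.random()  # interpolation weight in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(x, nb)])
    return synthetic


def propensity(w, x):
    """Estimated response probability P(r = 1 | x) under weights w."""
    z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))
    z = max(-30.0, min(30.0, z))  # clamp to keep exp() well behaved
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, r, lr=0.1, epochs=500):
    """Logistic regression by batch gradient descent.
    Returns [intercept, coef_1, ..., coef_p]."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x, ri in zip(X, r):
            err = propensity(w, x) - ri
            grad[0] += err
            for j, xj in enumerate(x):
                grad[j + 1] += err * xj
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad)]
    return w


def smote_psm_impute(X, y, k=3, seed=0):
    """Fill None entries of y. Assumed steps: balance the classes with
    synthetic non-respondents, fit the propensity model on the augmented
    data, then copy the value of the nearest-score respondent."""
    rng = random.Random(seed)
    resp = [i for i, v in enumerate(y) if v is not None]
    miss = [i for i, v in enumerate(y) if v is None]
    n_new = max(0, len(resp) - len(miss))
    # SMOTE needs at least two minority points to interpolate between.
    synth = smote([X[i] for i in miss], n_new, k, rng) if len(miss) > 1 else []
    X_train = list(X) + synth
    # r = 1 means the unit responded; synthetic units are non-respondents.
    r_train = [0 if v is None else 1 for v in y] + [0] * len(synth)
    w = fit_logistic(X_train, r_train)
    scores = [propensity(w, x) for x in X]
    filled = list(y)
    for i in miss:
        donor = min(resp, key=lambda j: abs(scores[j] - scores[i]))
        filled[i] = y[donor]
    return filled


if __name__ == "__main__":
    # Toy data: one covariate, and the three largest units do not respond.
    X = [[0.1 * i] for i in range(20)]
    y = [2.0 * x[0] for x in X]
    for i in (17, 18, 19):
        y[i] = None
    print(smote_psm_impute(X, y))
```

Only the observed covariates of the synthetic units enter the propensity model; they are never used as donors, so every imputed value is a value actually reported by a respondent. Repeating the donor draw to reflect the imputation multiplicity mentioned in the abstract is left out of this sketch.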