Abstract
With the growth of web surveys and big data, non-probability samples have attracted increasing attention. However, the inclusion probabilities and weights of a non-probability sample are unknown. To make full use of the available information, how to combine non-probability samples with probability samples and infer the population from the mixed sample has become an active research question. This paper proposes mixing probability and non-probability samples, constructing pseudo weights for the non-probability sample from propensity scores under two approaches (computing the weights simultaneously and computing them separately), and using the mixed sample to infer the population. Simulation and empirical studies show that the population mean estimates from both proposed mixed-sample methods have smaller absolute bias, variance, and mean squared error than the estimate based on the probability sample alone. Moreover, computing the weights simultaneously performs better than computing them separately.
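The idea of propensity-score pseudo weights can be illustrated with a small simulation. The sketch below is not the authors' procedure, which the abstract does not spell out. It uses one common construction as an assumption: stack the two samples, fit a logistic regression for membership in the non-probability sample with the probability units weighted by their design weights, and take the inverse of the fitted odds as the pseudo weight. All function names, the simulated population, and the selection model are hypothetical.

```python
import numpy as np


def fit_weighted_logistic(X, z, w, n_iter=50, tol=1e-10):
    """Weighted logistic regression fitted by Newton-Raphson."""
    X1 = np.column_stack([np.ones(len(X)), X])  # add intercept column
    beta = np.zeros(X1.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X1 @ beta)))
        grad = X1.T @ (w * (z - p))
        hess = (X1 * (w * p * (1.0 - p))[:, None]).T @ X1
        step = np.linalg.solve(hess, grad)
        beta += step
        if np.max(np.abs(step)) < tol:
            break
    return beta


def pseudo_weights(x_np, x_p, d_p):
    """Pseudo weights for non-probability units (assumed construction).

    The stacked sample has z = 1 for non-probability units (weight 1)
    and z = 0 for probability units (design weight d_p). The inverse of
    the fitted odds is used as the pseudo weight.
    """
    X = np.concatenate([x_np, x_p])
    z = np.concatenate([np.ones(len(x_np)), np.zeros(len(x_p))])
    w = np.concatenate([np.ones(len(x_np)), d_p])
    beta = fit_weighted_logistic(X, z, w)
    X1 = np.column_stack([np.ones(len(x_np)), x_np])
    p = 1.0 / (1.0 + np.exp(-(X1 @ beta)))
    return (1.0 - p) / p


def hajek_mean(y, w):
    """Weighted (Hajek-type) estimate of the population mean."""
    return np.sum(w * y) / np.sum(w)


# Hypothetical finite population: y depends on the covariate x.
rng = np.random.default_rng(0)
N = 20000
x = rng.normal(size=N)
y = 2.0 + x + rng.normal(scale=0.5, size=N)
true_mean = y.mean()

# Probability sample: simple random sample with known design weights.
n_p = 500
idx_p = rng.choice(N, size=n_p, replace=False)
d_p = np.full(n_p, N / n_p)

# Non-probability sample: units with larger x are more likely to take part.
pi_np = 1.0 / (1.0 + np.exp(-(-3.0 + x)))
idx_np = np.flatnonzero(rng.random(N) < pi_np)

w_np = pseudo_weights(x[idx_np], x[idx_p], d_p)

est_naive = y[idx_np].mean()          # unweighted non-probability mean
est_prob = hajek_mean(y[idx_p], d_p)  # probability sample only
est_mixed = hajek_mean(               # mixed sample with pseudo weights
    np.concatenate([y[idx_p], y[idx_np]]),
    np.concatenate([d_p, w_np]),
)
```

Comparing `est_naive`, `est_prob`, and `est_mixed` with `true_mean` over repeated draws gives a rough picture of the bias, variance, and mean squared error comparison the abstract refers to. A single run like this one does not reproduce the paper's results.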
Authors
Liu Zhan (刘展)
Pan Yingli (潘莹丽)
Tu Chaofeng (涂朝凤)
Zhang Meng (张梦)
School of Mathematics and Statistics, Hubei Key Laboratory of Applied Mathematics, Hubei University, Wuhan 430062, China
Source
Statistics & Decision (《统计与决策》)
CSSCI
Peking University Core Journals (北大核心)
2021, Issue 2, pp. 20-24 (5 pages)
Keywords
propensity score matching method
pseudo weights
mixed samples
probability samples
non-probability samples