Abstract
Objective To compare three generalized propensity score (GPS) estimation methods, namely generalized propensity score-ordinary least squares (GPS-OLS), generalized propensity score-boosting (GPS-Boosting), and covariate balancing generalized propensity score (CBGPS), by constructing GPS models and outcome models with different confounding structures and assessing how well each method balances confounders and how it affects estimation of the exposure effect, and to apply the methods to a study of the association between continuous independent variables and health outcomes. Methods Monte Carlo simulation was used to generate samples of two sizes (N=400 and N=1000) from exposure models and outcome models with four different confounding structures. The GPS was estimated by GPS-OLS, GPS-Boosting, and CBGPS, and the corresponding weights were constructed by inverse probability weighting. The ability of each method to balance confounders was assessed from the change in the correlation coefficient between each covariate and the exposure after weighting, and its influence on exposure effect estimation was assessed by comparing bias and mean squared error. The methods were then applied to the 2017 Shanxi Province nutrition survey to examine the association between meat intake and hypertension. Results Under all four confounding structures, CBGPS balanced confounders better than GPS-OLS and GPS-Boosting. For exposure effect estimation, CBGPS also markedly reduced both the mean squared error and the bias, outperforming GPS-OLS and GPS-Boosting. Conclusion When the generalized propensity score is used to balance confounders, CBGPS is the preferred method. The GPS methods were also used to verify the association between meat intake and hypertension in the applied example.
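The weighting and balance check described in the Methods can be illustrated with a minimal sketch. It covers only a GPS-OLS-style estimate (a normal linear exposure model fitted by least squares) and assumes stabilized inverse probability weights, which the abstract does not specify. The simulated data, coefficients, and function names (`normal_pdf`, `gps_ols_weights`, `weighted_corr`) are hypothetical and are not the authors' scenarios or code.

```python
import numpy as np

def normal_pdf(z, sd):
    """Density of a zero-mean normal with standard deviation sd, evaluated at z."""
    return np.exp(-0.5 * (z / sd) ** 2) / (sd * np.sqrt(2 * np.pi))

def gps_ols_weights(X, t):
    """Stabilized inverse probability weights f(t) / f(t | x).

    Assumption: the exposure t is conditionally normal given X, with the
    conditional mean fitted by ordinary least squares.
    """
    n = len(t)
    design = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(design, t, rcond=None)
    resid = t - design @ beta
    sd_cond = resid.std(ddof=design.shape[1])
    gps = normal_pdf(resid, sd_cond)                     # conditional density: the GPS
    marginal = normal_pdf(t - t.mean(), t.std(ddof=1))   # marginal density of the exposure
    return marginal / gps

def weighted_corr(x, t, w):
    """Weighted Pearson correlation between a covariate x and the exposure t."""
    w = w / w.sum()
    mx, mt = np.sum(w * x), np.sum(w * t)
    cov = np.sum(w * (x - mx) * (t - mt))
    return cov / np.sqrt(np.sum(w * (x - mx) ** 2) * np.sum(w * (t - mt) ** 2))

# Hypothetical simulated data: three covariates that confound a continuous exposure.
rng = np.random.default_rng(0)
n = 1000
X = rng.normal(size=(n, 3))
t = X @ np.array([0.3, -0.2, 0.2]) + rng.normal(size=n)

w = gps_ols_weights(X, t)
raw = np.array([abs(np.corrcoef(X[:, j], t)[0, 1]) for j in range(3)])
balanced = np.array([abs(weighted_corr(X[:, j], t, w)) for j in range(3)])
print("absolute correlation before weighting:", raw.round(3))
print("absolute correlation after weighting: ", balanced.round(3))
```

Covariate-exposure correlations that move toward zero after weighting indicate better balance. Repeating the same comparison over many simulated data sets, together with the bias and mean squared error of the estimated exposure effect, would mirror the evaluation outlined in the abstract.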
Authors
王晨晨
孙倩
王彤
Wang Chenchen; Sun Qian; Wang Tong (Department of Biostatistics, School of Public Health, Shanxi Medical University, Taiyuan 030001)
Source
《中国卫生统计》
CSCD
Peking University Core Journals
2022, Issue 1, pp. 7-13 (7 pages)
Chinese Journal of Health Statistics
Funding
National Natural Science Foundation of China (82073674, 81872715).