Abstract
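The search for causal mechanisms and for tests of macro-level theory has created a demand in quantitative sociological research for data at the district-cluster level, yet high-quality longitudinal data of this kind remain relatively scarce. Traditional studies typically build panel datasets by combining individual-level social survey data from multiple sources in order to ease the shortage of macro-level data, but this approach is constrained by the sparse temporal and spatial coverage of social surveys and by the differences between surveys. This paper introduces a dynamic Bayesian latent variable modeling framework that can be used to generate cross-temporal, cross-regional panel data at the district-cluster level, demonstrates the procedure through an applied example, and compares the dynamic Bayesian approach with several commonly used missing-value imputation methods. The example results indicate that the dynamic Bayesian latent variable model has important advantages in integrating information across time, space, and multiple dimensions and in exploring parameter uncertainty. It can estimate and impute values for years or regions in which survey data are missing, which substantially alleviates the shortage of panel data in sociological research.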
In contemporary quantitative sociological research, the testing of causal mechanisms and macro theories has driven researchers' need for high-quality time-series data at the district-cluster level. However, compared with fields such as economics, sociological research has significant shortcomings in its access to large-scale tracking data covering long time spans. Aggregating individual social survey data from multiple sources to generate panel data is an important way to reduce data scarcity, but it is constrained by the limited spatial and temporal distribution of social surveys and by the variability across surveys. In this paper, we introduce a dynamic Bayesian latent variable modeling framework designed to facilitate the generation of complete panel data at the cluster level. The implementation of this framework is demonstrated through a practical example, and its efficacy is highlighted in comparison with several common missing data imputation techniques. The results show that the dynamic Bayesian latent variable model has noticeable advantages in temporal-spatial imputation, in the integration of multi-dimensional social indices, and in the incorporation of parameter uncertainty. The method has potential for estimating and imputing missing data for years and regions within surveys, which points to its future application in panel data generation and dimension integration for macro-level sociological research. However, the practical application of this approach still faces certain limitations, such as data availability, "synonym repetition", and insufficient sensitivity to drastic changes. In view of this, the paper proposes corresponding optimization strategies to enhance the applicability and flexibility of the modeling framework, thereby expanding its scope of application in the social sciences. The research provides insights for the practical application of the dynamic Bayesian latent variable modeling approach and offers inspiration for future related studies.
Authors
张高祥
陈哲
陈云松
ZHANG Gaoxiang; CHEN Zhe; CHEN Yunsong (School of Social and Behavioral Sciences, Nanjing University)
Source
《社会》
Peking University Core Journal (北大核心)
2024, No. 3, pp. 173-219 (47 pages)
Chinese Journal of Sociology
Keywords
data generation (数据生成)
dimension integration (维度整合)
latent variables (潜变量)
Bayesian item response theory model (贝叶斯项目反应模型)
dynamic linear model (动态线性模型)