
Missing Data Handling Methods Based on the 2PLM
(Original Chinese title: 2PLM下缺失数据处理方法及其比较, "Missing Data Handling Methods under the 2PLM and Their Comparison")
Abstract: Item response theory (IRT) is one of the modern educational and psychological measurement theories used for objective measurement, and it is widely applied to the analysis of large-scale tests, where missing data are very common. Under the two-parameter logistic model (2PLM) in IRT, the only available EM algorithm handles missing responses and missing abilities under the missing completely at random mechanism. This study derives an EM algorithm under the 2PLM in which missing responses are ignored, and proposes an EM algorithm for handling missing responses and missing abilities under the missing at random mechanism, together with multiple imputation methods that account for uncertainty in ability estimates and item responses. The results show that the EM algorithm that ignores missing responses and the multiple imputation methods perform well across missing data mechanisms, missing proportions, and test designs.

Missing data are encountered regularly in educational research. For example, many large-scale assessments are low-stakes surveys, which typically suffer from a substantial amount of missing data. The low-stakes nature of these surveys, variations in average performance across countries, and other factors such as testing traditions, test design, time limits, and intentional omission have been discussed as contributing to the amount of omitted responses observed in these assessments. Researchers have shown that missing data may create problems in estimating item parameters and examinee ability parameters in the item response theory (IRT) context.

A number of missing data handling methods have been developed in the IRT framework. These include imputation based on the response function, as well as treating missing items as not presented (NP), incorrect (IN), or fractionally correct (FR), which can be carried out directly with the item parameter estimation software BILOG-MG. A number of algorithms have also been proposed in the context of data imputation. The current study describes several approaches to dealing with missing data in the two-parameter logistic model (2PLM). Although BILOG-MG can handle missing data, it is commercial software. We therefore needed to develop our own EM algorithm in which missing responses are ignored, that is, treated as missing completely at random (MCAR). MCAR, which can be thought of as missingness with no systematic cause, is only one specific type of missing data mechanism.
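As a rough illustration of what "ignoring" missing responses means, the 2PLM and a marginal likelihood restricted to observed responses can be written as below. The notation is assumed here and is not taken from the paper: $x_{ij}$ is the 0/1 response of examinee $i$ to item $j$, $d_{ij}$ is 1 when that response is observed and 0 when it is missing, $a_j$ and $b_j$ are the item discrimination and difficulty, and $g(\theta)$ is an assumed ability density. Some parameterizations also include a scaling constant $D = 1.7$ in the exponent, which is omitted in this sketch.

```latex
% Assumed notation (not from the paper): d_{ij} = 1 if x_{ij} is observed, 0 if missing.
P_j(\theta) = \frac{1}{1 + \exp\{-a_j(\theta - b_j)\}}

% Marginal likelihood in which missing responses contribute no factor:
L(\mathbf{a}, \mathbf{b}) = \prod_{i=1}^{N} \int \prod_{j=1}^{J}
  \left[ P_j(\theta)^{x_{ij}} \{1 - P_j(\theta)\}^{1 - x_{ij}} \right]^{d_{ij}} g(\theta)\, d\theta
```

Because a factor with $d_{ij} = 0$ equals 1, a missing response simply drops out of the product, which corresponds to treating the item as not presented.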
Zhang, Xin, Zeng, and Sun proposed an EM algorithm (denoted ZS) for missing data under MCAR, but it carries a heavy computational burden when the percentage of missing data is high. When data are missing at random (MAR), the probability that a value is missing depends on the individual's item responses but not on the missing value itself, and the estimation of item parameters and abilities may be affected. However, to the best of our knowledge, no work has addressed missing data under the MAR assumption in the 2PLM. We propose an EM algorithm under MAR, denoted EE. Following a general introduction to multiple imputation methods based on the item response model, we propose two new multiple imputation methods (EF and ER) that account for uncertainty in both the ability parameter estimates and the missing item responses, and compare them with two original methods (PF and PR) proposed by Huisman and Molenaar, which are based only on item response probabilities.

Simulation studies with a sample size of 1000 were conducted to demonstrate the accuracy of these methods. Five percentages of missing data were simulated: 5%, 15%, 30%, 40%, and 50%. Missing data were generated under three missing data mechanisms: MCAR, MAR, and missing not at random (MNAR). Missing data were handled by NP, ZS, IN, PR, PF, ER, EF, and EE. The simulation results suggest that the new multiple imputation methods and the NP method worked well under various conditions; the EM algorithm under MAR performed similarly to NP, because the 2PLM has the advantage of invariance of model parameters.
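The simulation design described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names, the 20-item test length, the parameter distributions, and the specific MAR and MNAR rules are all assumptions made for the example. It generates 2PLM responses, deletes entries under each of the three mechanisms, and compares the NP (not presented) and IN (incorrect) scoring treatments.

```python
import numpy as np

def simulate_2pl(theta, a, b, rng):
    """Draw 0/1 responses from the 2PLM, P(x=1) = 1 / (1 + exp(-a * (theta - b)))."""
    p = 1.0 / (1.0 + np.exp(-a[None, :] * (theta[:, None] - b[None, :])))
    return (rng.random(p.shape) < p).astype(float)

def add_missing(x, rate, mechanism, rng):
    """Return a copy of x with roughly a fraction `rate` of entries set to NaN."""
    w = np.ones_like(x)  # MCAR: every entry is equally likely to be missing
    if mechanism == "MAR":
        # Hypothetical rule: item 0 is always observed, and examinees who
        # answered it incorrectly omit later items three times as often.
        w = np.where(x[:, [0]] == 0, 1.5, 0.5) * np.ones_like(x)
        w[:, 0] = 0.0
    elif mechanism == "MNAR":
        # Hypothetical rule: would-be incorrect answers are omitted more often,
        # so missingness depends on the missing value itself.
        w = np.where(x == 0, 1.5, 0.5)
    elif mechanism != "MCAR":
        raise ValueError(f"unknown mechanism: {mechanism}")
    prob = np.clip(rate * w / w.mean(), 0.0, 1.0)
    y = x.copy()
    y[rng.random(x.shape) < prob] = np.nan
    return y

def score_np(y):
    """NP: treat missing items as not presented and score observed items only."""
    return np.nanmean(y, axis=1)

def score_in(y):
    """IN: treat missing items as incorrect."""
    return np.nan_to_num(y, nan=0.0).mean(axis=1)

rng = np.random.default_rng(0)
theta = rng.normal(size=1000)           # sample size from the abstract
a = rng.uniform(0.5, 2.0, size=20)      # assumed discrimination range
b = rng.normal(size=20)                 # assumed difficulty distribution
x = simulate_2pl(theta, a, b, rng)

for mech in ("MCAR", "MAR", "MNAR"):
    y = add_missing(x, 0.30, mech, rng)
    print(mech,
          "missing rate:", round(float(np.isnan(y).mean()), 3),
          "mean NP score:", round(float(score_np(y).mean()), 3),
          "mean IN score:", round(float(score_in(y).mean()), 3))
```

Scaling the per-entry weights by their mean keeps the overall missing rate close to the target under all three mechanisms, so differences between the NP and IN scores reflect the mechanism rather than the amount of missing data.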
Source: Journal of Psychological Science (《心理科学》), indexed in CSSCI, CSCD, and the Peking University Core Journals list; 2016, Issue 6, pp. 1500-1507 (8 pages)
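Funding: Supported by the National Natural Science Foundation of China (31500909, 31360237, 31160203); the National Education Science Planning Program, Ministry of Education Key Project (DHA150285); the Ministry of Education Humanities and Social Sciences Research Youth Fund (13YJC880060); the China Scholarship Council Overseas Training Program for Young Core Teachers (201509470001); the Natural Science Foundation of Jiangxi Province (20161BAB212044); the Jiangxi Province Social Science Research "Twelfth Five-Year" (2012) Planning Project (12JY07); the Jiangxi Province Education Science 2013 General Project (13YB032); the Science and Technology Research Projects of the Jiangxi Provincial Department of Education (GJJ13207, GJJ13208, GJJ13209); the Jiangxi Normal University Youth Growth Fund; and the Jiangxi Normal University Doctoral Start-up Fund.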
Keywords: missing data; EM algorithm; missing at random; multiple imputation; item response theory
  • Related literature

References: 7

Secondary references: 101

  • 1. Huang Huacai (黄华彩), Ding Shuliang (丁树良), & Luo Fen (罗芬). (2005). Research on the SQRT/EM parameter estimation method and its application under the IRT framework. Journal of Jiangxi Normal University (Natural Science Edition), 29(3), 231-234.
  • 2. Albert, J. H. (1992). Bayesian estimation of normal ogive item response curves using Gibbs sampling. Journal of Educational Statistics, 17, 251-269.
  • 3. Huisman, M. (2000). Imputation of missing item responses: Some simple techniques. Quality and Quantity, 34, 331-351.
  • 4. Jones, D. H., & Nediak, M. (2000). Item parameter calibration of LSAT items using MCMC approximation of Bayes posterior distributions. RUTCOR Research Report, 7-2000.
  • 5. Karkee, T., & Finkelman, M. (2007, April). Missing data treatment methods in parameter recovery for a mixed-format test. Paper presented at the annual meeting of the American Educational Research Association, Chicago.
  • 6. Little, R. J. A., & Rubin, D. B. (2004). Statistical analysis with missing data (Sun Shanze, Trans.). Beijing: China Statistics Press.
  • 7. Ludlow, L. H., & O'Leary, M. (1999). Scoring omitted and not-reached items: Practical data analysis implications. Educational and Psychological Measurement, 59, 615-630.
  • 8. Maris, G., & Bechger, T. M. (2005). An introduction to the DA-T Gibbs sampler for the two-parameter logistic (2PL) model and beyond. Psicológica, 26, 327-352.
  • 9. Muraki, E., & Bock, R. D. (1993). PARSCALE: IRT based test scoring and item analysis for graded open-ended exercises and performance tasks. Chicago: Scientific Software International.
  • 10. Patz, R. J., & Junker, B. W. (1999a). A straightforward approach to Markov chain Monte Carlo methods for item response models. Journal of Educational and Behavioral Statistics, 24, 146-178.
