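Multidimensional Testlet Response Model: An Applied Extension of the Multidimensional Random Coefficients Multinomial Logistic Model (cited by: 4)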

Multidimensional Rasch Testlet Model: An Extension and Generalization of MRCMLM
Abstract: This paper extends the multidimensional random coefficients multinomial logistic model (MRCMLM) to the multidimensional testlet setting, yielding a multidimensional testlet response model (MTRM) suited to multidimensional target abilities and multidimensional testlet effects. The model is highly flexible and widely applicable. Two simulation studies and one applied study examine the MTRM's parameter estimation accuracy and applicability, as well as its differences from the two-tier model. The results show that: (1) the correlation between ability dimensions and the number of item score categories are important factors affecting parameter estimation; (2) the MTRM estimates item parameters more accurately and more stably than the two-tier model, and estimates the size of testlet effects more accurately; (3) the MTRM fits better when within-item multidimensional testlets are taken into account, offers a wider choice of model structures for test analysis, and has significant applied value.

Testlets have been widely used in educational assessment. It has been shown that ignoring testlet effects when analyzing response data often results in inaccurate estimates of reliability coefficients and latent trait standard errors, increased bias in item parameter estimates, inaccurate test equating, and failure to detect DIF. As a result, researchers are increasingly interested in using testlet models instead of standard item response models. Different types of testlet models have been proposed to partial out the influence of testlet factors from the estimation of latent proficiency. However, most previous models address testlet effects only when (1) a single latent trait is measured and (2) each item belongs to only one testlet (between-item multidimensionality). As an alternative, the two-tier model can be used to deal with multidimensional latent traits. However, the two-tier model is usually used within the framework of confirmatory factor analysis. This research extends the multidimensional random coefficients multinomial logistic model (MRCMLM) to the multidimensional testlet response model (MTRM), with the aim of accounting for within-item multidimensional testlets and multiple abilities within an IRT framework. With different model constraints, the MTRM can be used to model a variety of multidimensional test structures. Two simulation studies and one empirical study based on data from a large-scale math assessment are reported. In simulation study 1, we considered different correlations among trait dimensions. We compared the MRCMLM, which ignores testlet effects, with the MTRM in terms of estimation accuracy.
In simulation study 2, the MTRM was compared with a two-tier model for polytomous data in terms of item and person parameter estimation accuracy. In the third study, which analyzed real data from a large-scale math test, proficiency in math was modeled and estimated on three dimensions. In total, seven testlets were identified. Some items loaded on more than one testlet, indicating within-item multidimensional testlet effects. Model fit and estimates from three models (the MRCMLM, MTRM-1 with only uncrossed testlets considered, and MTRM-2 with all seven testlets considered) were compared. All analyses were conducted in ConQuest, using Monte Carlo estimation. Estimation accuracy in the simulation studies was evaluated using bias, RMSE, and correlation coefficients between the true and estimated values. Results of simulation 1 indicated that the MTRM produced more accurate item difficulty estimates for items within testlets than the MRCMLM, while both models produced accurate results for independent items. The recovery of item difficulties in the MTRM was also less influenced by the correlations among the latent traits. In addition, as the correlations between abilities decreased, the ability and item difficulty estimates were more biased if testlet effects were not modeled. In simulation 2, both the MTRM and the two-tier model accurately estimated item and person parameters. When testlet effects were present, estimates of both item and person parameters were more stable in the MTRM than in the two-tier model, indicating that the MTRM is not influenced by complex test structures or extreme response patterns. Results of the empirical data analysis showed that the MTRM with all seven testlets considered fit the data best.
Applying the MTRM reduces incorrect estimation of the reliability and standard error for each primary trait, even for moderate testlet effects and high correlations between ability dimensions. The present study proposes the multidimensional testlet model, which supplements previous testlet models by taking both within-item multidimensional testlets and multiple abilities into account. A new integrated model, the MTRM, was developed based on the MRCMLM. By specifying an appropriate ability-judge (score) matrix and testlet-judge (design) matrix, this model can be applied to a variety of educational tests in which complex testlets are embedded and multidimensional proficiencies are estimated. A promising attribute of the model is that parameter estimation is easily carried out in the ConQuest software. We suggest that in many assessment contexts ignoring testlet effects can add ambiguity to the interpretation of test scores, and that such data should therefore be analyzed with appropriate testlet models.
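The abstract states that estimation accuracy in the simulation studies was evaluated with bias, RMSE, and the correlation between true and estimated values. The following is a minimal sketch of those three recovery metrics under their usual definitions; it is not the authors' code, and the function name `recovery_metrics` and the example values are hypothetical.

```python
import math


def recovery_metrics(true_values, estimates):
    """Return (bias, rmse, corr) comparing estimates with true parameter values.

    Hypothetical helper for illustration only; uses the usual definitions:
    bias = mean(estimate - true), rmse = sqrt(mean((estimate - true)^2)),
    corr = Pearson correlation between true and estimated values.
    """
    n = len(true_values)
    if n != len(estimates) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")

    errors = [est - true for true, est in zip(true_values, estimates)]
    bias = sum(errors) / n
    rmse = math.sqrt(sum(err * err for err in errors) / n)

    mean_true = sum(true_values) / n
    mean_est = sum(estimates) / n
    cov = sum((t - mean_true) * (e - mean_est)
              for t, e in zip(true_values, estimates))
    var_true = sum((t - mean_true) ** 2 for t in true_values)
    var_est = sum((e - mean_est) ** 2 for e in estimates)
    if var_true == 0 or var_est == 0:
        raise ValueError("correlation is undefined for constant input")
    corr = cov / math.sqrt(var_true * var_est)
    return bias, rmse, corr


# Made-up item difficulties: every estimate is 0.1 too high.
example_true = [-1.0, 0.0, 1.0]
example_est = [-0.9, 0.1, 1.1]
bias, rmse, corr = recovery_metrics(example_true, example_est)
print(f"bias={bias:.3f} rmse={rmse:.3f} corr={corr:.3f}")
```

In a simulation study these quantities would typically be computed per parameter type (item difficulties, abilities, testlet variances) and averaged over replications; how the authors aggregated them is not stated in the abstract.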
Source: Acta Psychologica Sinica (《心理学报》), indexed in CSSCI, CSCD, and the Peking University Core Journals list; 2017, No. 12, pp. 1604-1614 (11 pages)
Funding: Supported by a 2013 Ministry of Education Youth Project under the National Education Sciences "Twelfth Five-Year Plan" (EBA130370)
Keywords: multidimensional ability; multidimensional testlet; two-tier model; MRCMLM; estimation accuracy

