Abstract
Item response theory (IRT) offers many measurement models, each suited to data with different characteristics, so practitioners should choose the model that fits their situation. China administers a large number of examinations and assessments with a wide variety of item types, and in practice a single IRT model often cannot capture the features of all the data. In such cases several IRT models can be applied together (a "mixed model") to achieve the best fit to the data. This paper studies the idea and principles of a mixed model based on the three-parameter logistic model (3PLM) and the graded response model (GRM), the implementation of its parameter estimation, and its performance, using Monte Carlo simulation. The results show: (1) Mix_Tu, the mixed-model parameter estimation program developed by the authors, recovers the true parameter values with high accuracy, comparable to that of the internationally known IRT software Parscale. (2) With aberrant items ("item aberrance"), Mix_Tu's estimates of parameters b and c are more affected by the degree of aberrance than Parscale's, its estimates of parameter a are less affected, and the two programs perform similarly on theta. (3) With aberrant examinees ("examinee aberrance"), Mix_Tu's estimates of all parameters are less affected by the degree of aberrance than Parscale's, so Mix_Tu is the more robust of the two.
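As a rough illustration of what a mixed 3PLM/GRM model computes, the sketch below evaluates one examinee's log-likelihood over a test that contains both dichotomous and graded items. It uses the standard textbook forms of the two models and assumes local independence. The function names, the optional scaling constant `D`, and the example item parameters are assumptions made for illustration. This is not the Mix_Tu estimation procedure, which the abstract does not describe.

```python
import math

def p_3pl(theta, a, b, c, D=1.0):
    """Probability of a correct answer under the 3PL model.
    a: discrimination, b: difficulty, c: guessing; D is an assumed scaling constant."""
    return c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))

def p_grm(theta, a, thresholds, D=1.0):
    """Category probabilities under the graded response model.
    thresholds must be increasing; returns len(thresholds) + 1 probabilities."""
    # Cumulative probabilities P(X >= k), padded with 1 (lowest) and 0 (beyond highest).
    cum = [1.0]
    cum += [1.0 / (1.0 + math.exp(-D * a * (theta - bk))) for bk in thresholds]
    cum.append(0.0)
    # Each category probability is the difference of adjacent cumulative values.
    return [cum[k] - cum[k + 1] for k in range(len(cum) - 1)]

def mixed_loglik(theta, dich_items, dich_resp, poly_items, poly_resp):
    """Log-likelihood of one examinee's responses on a mixed-format test,
    assuming local independence across items."""
    ll = 0.0
    # Dichotomous items scored 0/1, each described by (a, b, c).
    for (a, b, c), u in zip(dich_items, dich_resp):
        p = p_3pl(theta, a, b, c)
        ll += math.log(p if u == 1 else 1.0 - p)
    # Graded items scored 0..m, each described by (a, thresholds).
    for (a, thresholds), x in zip(poly_items, poly_resp):
        ll += math.log(p_grm(theta, a, thresholds)[x])
    return ll

# Hypothetical example: one 3PL item answered correctly, one 3-category graded item scored 1.
example_ll = mixed_loglik(0.0, [(1.0, 0.0, 0.2)], [1], [(1.0, [-1.0, 1.0])], [1])
```

Maximizing this quantity over theta, for example on a grid of candidate values, would give an ability estimate for fixed item parameters. Estimating the item parameters a, b and c requires a fuller procedure that is outside this sketch.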
Source
Journal of Psychological Science (《心理科学》)
CSSCI
CSCD
PKU Core Journal (北大核心)
2011, Issue 5, pp. 1189-1194 (6 pages)
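Funding
Humanities and Social Sciences Research Fund of the Ministry of Education (09YJCXLX012)
Science and Technology Project of the Jiangxi Provincial Department of Education (GJJ10098)
Humanities and Social Sciences Project for Universities of the Jiangxi Provincial Department of Education (XL1107, XL1011)
National Natural Science Foundation of China (30860084)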
Keywords
item response theory, three-parameter logistic model (3PLM), graded response model (GRM), mixed model