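Abstract
Within the IRT framework, a unidimensional two-parameter logistic multiple-attempt, multiple-item (MAMI) test model is developed for 0-1 (dichotomous) scoring. Compared with the multiple-attempt, single-item (MASI) model proposed by Spray, the MAMI model is more refined and applies more widely. Its parameter estimation also differs, with item parameters obtained by the EM algorithm. Monte Carlo simulations show that, compared with correspondingly increasing the number of test items, the MAMI model yields the same precision for ability estimates but higher precision for item parameter estimates. Compared with correspondingly increasing the number of examinees, it yields the same precision for item parameter estimates but higher precision for ability estimates. These findings indicate that, under certain conditions, if examinees are allowed to modify their answers and a cumulative scoring scheme is used, ability can be estimated as precisely as if the test length had been doubled even though the number of items is unchanged, and the precision of the item parameter estimates also improves. The findings are relevant both to the assessment of skills and cognitive abilities and to the way such response data are processed.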
Three one-parameter item response theory (IRT) models were proposed by Spray to describe the score probabilities of an examinee taking a multiple-attempt, single-item (MASI) test of a psychomotor skill. However, if students are encouraged to check and modify their answers during a test, the situation can be regarded as a multiple-attempt, multiple-item (MAMI) test. To describe the MAMI test, this paper proposes a two-parameter IRT MAMI model (a binomial trials model) and formulates an item parameter estimation procedure. The model rests on three assumptions. The first two are the same as in ordinary IRT: unidimensionality and local independence. The third is a further kind of local independence, which requires that the individual trials or attempts be independent for a given examinee.
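The abstract does not give the model equation. As a hedged illustration only, a binomial trials model built on a two-parameter logistic item response function and consistent with the three assumptions above could be written as follows. The symbols are assumptions introduced here, not taken from the paper: \theta_i is the ability of examinee i, a_j and b_j are the discrimination and difficulty of item j, n is the number of attempts per item, and X_{ij} is the cumulated score on item j.

```latex
P_j(\theta_i) = \frac{1}{1 + \exp\left[-a_j(\theta_i - b_j)\right]},
\qquad
P(X_{ij} = x \mid \theta_i)
  = \binom{n}{x}\, P_j(\theta_i)^{x}\, \bigl[1 - P_j(\theta_i)\bigr]^{\,n-x},
\quad x = 0, 1, \dots, n .
```

With n = 2 (an original response and one modified response), the cumulated score X_{ij} takes the values 0, 1, or 2, and n = 1 reduces to the ordinary two-parameter logistic model.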
The model and the estimation procedure developed in this article were evaluated using simulated data. The test consisted of 60 items and the sample size was 1000. The simulated data were generated 50 times. The ability parameters, the difficulty parameters, and the logarithms of the discrimination parameters were drawn from the standard normal distribution N(0,1). Three estimation procedures (MMLE/EM for the MAMI model, MMLE/EM for BILOG, and MMLE/EM for ordinary IRT) were used to analyze the simulated data. In MMLE/EM for the MAMI model, each element of the score matrix is the sum of the original score and the modified score (the cumulating score scheme) when examinees modify their answers. In MMLE/EM for ordinary IRT, the original score matrix and the modified score matrix obtained from the repeated responses are combined in two separate ways: stacked by rows, as if the number of examinees were doubled (the lengthened score matrix), and joined by columns, as if the test length were doubled (the widened score matrix).
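The three score matrices described above can be illustrated with a minimal sketch. It assumes two attempts per item, a two-parameter logistic response function, and attempts drawn independently for a given examinee (the third assumption). The function names and the data-generating details are hypothetical and are not taken from the paper.

```python
import math
import random


def logistic(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def simulate_attempts(n_persons=1000, n_items=60, n_attempts=2, seed=0):
    """Draw 2PL parameters and independent 0-1 responses for each attempt."""
    rng = random.Random(seed)
    theta = [rng.gauss(0.0, 1.0) for _ in range(n_persons)]      # ability ~ N(0,1)
    b = [rng.gauss(0.0, 1.0) for _ in range(n_items)]            # difficulty ~ N(0,1)
    a = [math.exp(rng.gauss(0.0, 1.0)) for _ in range(n_items)]  # log a ~ N(0,1)
    # attempts[k][i][j]: 0-1 score of examinee i on item j at attempt k
    attempts = [
        [[int(rng.random() < logistic(a[j] * (t - b[j]))) for j in range(n_items)]
         for t in theta]
        for _ in range(n_attempts)
    ]
    return theta, a, b, attempts


def cumulated_matrix(attempts):
    """Cumulating score scheme: sum the scores over attempts."""
    return [[sum(scores) for scores in zip(*rows)] for rows in zip(*attempts)]


def lengthened_matrix(attempts):
    """Stack attempts by rows, as if there were more examinees."""
    return [list(row) for attempt in attempts for row in attempt]


def widened_matrix(attempts):
    """Join attempts by columns, as if the test were longer."""
    return [[s for row in rows for s in row] for rows in zip(*attempts)]


theta, a, b, attempts = simulate_attempts()
cumulated = cumulated_matrix(attempts)
lengthened = lengthened_matrix(attempts)
widened = widened_matrix(attempts)
print(len(cumulated), len(cumulated[0]))    # 1000 60
print(len(lengthened), len(lengthened[0]))  # 2000 60
print(len(widened), len(widened[0]))        # 1000 120
```

All three matrices carry the same responses. Only the cumulated matrix keeps both attempts of an examinee on an item in a single cell, which is the layout a binomial trials model needs.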
The mean absolute difference between the estimated and the corresponding simulated parameter values (ABS), the bias, and the root mean square error (SD) of the estimates were computed for each item parameter across the 50 replications.
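The three accuracy indices can be sketched for a single parameter as follows. The function name `accuracy_indices` is hypothetical, and the definitions are the usual textbook ones, assumed here because the abstract does not spell out the formulas.

```python
import math


def accuracy_indices(true_value, estimates):
    """Return (ABS, bias, RMSE) of repeated estimates of one parameter."""
    errors = [est - true_value for est in estimates]
    r = len(errors)
    abs_diff = sum(abs(e) for e in errors) / r        # mean absolute difference
    bias = sum(errors) / r                            # mean signed error
    rmse = math.sqrt(sum(e * e for e in errors) / r)  # root mean square error
    return abs_diff, bias, rmse


# Two replications that miss the true value 1.0 by +0.5 and -0.5
print(accuracy_indices(1.0, [1.5, 0.5]))  # (0.5, 0.0, 0.5)
```

In the toy call the errors cancel, so the bias is zero while ABS and the root mean square error are not. This is why the indices are reported together.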
The simulation results showed that:
1. Ability parameters were estimated more accurately, in terms of ABS and SD, under the MAMI model than by MMLE/EM for ordinary IRT applied to the lengthened score matrix.
2. Item parameters were estimated more accurately under the MAMI model than by MMLE/EM for ordinary IRT applied to the widened score matrix.
These findings indicate that, when a MAMI situation arises, the cumulating score scheme is more reasonable than the traditional scoring scheme, in which only the last response is collected. This result may motivate researchers to reconsider how to score skill tests that allow examinees to review and change their answers.
Source
Acta Psychologica Sinica (《心理学报》)
CSSCI
CSCD
Peking University Core Journals
2007, Issue 4, pp. 730-736 (7 pages)
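Funding
National Natural Science Foundation of China (60263005)
Key Project of the National Education Science Planning Program (DBB010501)
Provincial Department of Education Science and Technology Project
Natural Science Foundation of Jiangxi Province (0411021)
Open Project Fund of the Jiangxi Distributed Computing Engineering Technology Research Center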
Keywords
multiple-attempt, multiple-item (MAMI) model
EM algorithm
accuracy of parameter estimation