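Abstract
A test situation was designed in which the item parameters and examinee scores are known, and ability was estimated under the two-, three-, and four-parameter weighted Logistic models. The ability steps between examinee score levels were found to be evenly spaced, and examinee scores reflected the score-weighting effect of polytomous scoring well. Under the two-parameter weighted Logistic model, examinee ability estimates are disturbed: guessing leads to ability overestimation, and slipping (careless errors) leads to ability underestimation. Under the c-type three-parameter weighted Logistic model, overestimation is absent or not pronounced; under the γ-type three-parameter weighted Logistic model, underestimation is absent or not pronounced; under the four-parameter weighted Logistic model, neither overestimation nor underestimation appears or is pronounced. The four-parameter weighted Logistic model is therefore a good method for robust estimation of examinee ability.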
The weighted-score logistic model (WSLM) was proposed by Jian, Dai, & Dai (2016). Based on the item emphases of polytomously scored items, the WSLM adds weighted-score parameters to the dichotomous logistic model. The dichotomous model has at least five forms; correspondingly, the weighted-score logistic model also has five forms: the one-parameter weighted-score logistic model, the two-parameter weighted-score logistic model, the three-parameter weighted-score logistic model with the c parameter, the three-parameter weighted-score logistic model with the γ parameter, and the four-parameter weighted-score logistic model.
Educational tests are subject to response disturbances such as random guessing, carelessness, and transcription errors. Previous studies have shown that, in paper-and-pencil tests or computerized adaptive testing, aberrant responses such as careless errors and lucky guesses can cause substantial bias in ability estimates. Mislevy & Bock (1982) proposed the biweight estimator and compared it with the maximum likelihood estimator. Their results showed that the biweight estimator typically reduced bias and thereby mitigated measurement disturbances. The three-parameter logistic IRT model, the four-parameter logistic IRT model, Huber robust estimation, and other methods have also been proposed to address response disturbances such as random guessing and carelessness.
This paper first compares the robustness of the ability estimates of four models in an example test. The four models are the two-parameter WSLM, the three-parameter WSLM with the c parameter, the three-parameter WSLM with the γ parameter, and the four-parameter WSLM. Second, three simulation studies, one in each of three test cases, are presented to compare four approaches: 2PM-MLE, biweight estimation, Huber estimation, and 4PM-robust estimation.
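For orientation, the dichotomous four-parameter logistic function on which these model forms build can be written as below. The notation is an assumption made here (θ for ability, a for discrimination, b for difficulty, c for the lower asymptote, γ for the upper asymptote), any scaling constant is omitted, and the score-weighting term of the WSLM is not reproduced because the abstract does not state it.

```latex
P(u = 1 \mid \theta) = c + \frac{\gamma - c}{1 + \exp\left[-a(\theta - b)\right]}
```

Setting γ = 1 gives the three-parameter form with the c parameter, setting c = 0 gives the three-parameter form with the γ parameter, and setting both gives the two-parameter form.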
The hypothetical test instrument contains 34 items, with difficulty thresholds b ~ N(0,1) and log(a) ~ N(0,1). A 35th item has a difficulty threshold ranging from -4 to 4. The ability of a middle-ability examinee is estimated from the responses to the 34 items of the basic test under the two-parameter logistic model, and this estimate serves as the reference value for the other three models.
First, under the two-parameter WSLM, examinee ability is overestimated when there is guessing on difficult items and underestimated when there is slipping (careless error) on easy items. Second, under the three-parameter WSLM with the c parameter, the overestimation is corrected, that is, the response disturbance disappears; however, underestimation persists when examinees miss easy items. Third, under the three-parameter WSLM with the γ parameter, the underestimation is corrected well when examinees miss easy items, that is, the response disturbance disappears; however, overestimation persists when examinees answer difficult items correctly. Fourth, under the four-parameter WSLM with both the c and γ parameters, the underestimation is corrected well when examinees miss easy items, and the overestimation is also corrected well when low-ability examinees answer difficult items correctly by luck, that is, the response disturbances disappear. Robust ability estimates can therefore be obtained under the four-parameter WSLM when response disturbances such as random guessing and careless errors are present in a test.
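The guessing effect described above can be illustrated with a minimal sketch. It is not the authors' procedure: it assumes the dichotomous four-parameter logistic function rather than the weighted-score extension, replaces the normally distributed item parameters with a fixed, evenly spaced difficulty grid and a = 1 so that the run is reproducible, and uses hypothetical asymptotes c = 0.2 and γ = 0.98. All function and variable names are illustrative.

```python
import math

def p_correct(theta, a, b, c=0.0, gamma=1.0):
    """Dichotomous 4PL probability of a correct response (no scaling constant)."""
    return c + (gamma - c) / (1.0 + math.exp(-a * (theta - b)))

def log_likelihood(theta, items, responses, c, gamma):
    """Log-likelihood of a 0/1 response pattern at ability theta."""
    total = 0.0
    for (a, b), u in zip(items, responses):
        p = p_correct(theta, a, b, c, gamma)
        total += math.log(p) if u == 1 else math.log(1.0 - p)
    return total

# Ability grid from -4 to 4 in steps of 0.01 (assumed search range).
GRID = [i / 100 for i in range(-400, 401)]

def estimate_theta(items, responses, c=0.0, gamma=1.0):
    """Grid-search maximum likelihood estimate of ability."""
    return max(GRID, key=lambda t: log_likelihood(t, items, responses, c, gamma))

# Assumption: 34 base items with a = 1 and difficulties evenly spaced
# from -1.65 to 1.65, standing in for the randomly drawn parameters.
base_items = [(1.0, -1.65 + 0.1 * i) for i in range(34)]
# A middle-ability examinee: correct on items easier than 0, wrong otherwise.
base_responses = [1 if b < 0 else 0 for _, b in base_items]

# Hypothetical 35th item: difficult (b = 3) but answered correctly by luck.
hard_item = (1.0, 3.0)

def guessing_shift(c, gamma):
    """Change in the ability estimate caused by the lucky correct answer."""
    before = estimate_theta(base_items, base_responses, c, gamma)
    after = estimate_theta(base_items + [hard_item], base_responses + [1], c, gamma)
    return after - before

shift_2pl = guessing_shift(0.0, 1.0)    # two-parameter model
shift_4pl = guessing_shift(0.2, 0.98)   # four-parameter model, assumed c and gamma

print(f"2PL shift: {shift_2pl:.2f}, 4PL shift: {shift_4pl:.2f}")
```

In this setup the lucky correct answer raises the two-parameter estimate more than the four-parameter estimate, because a nonzero lower asymptote lets the model attribute part of the success to guessing. The slipping case could be examined in the same way by adding an easy item answered incorrectly.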
Authors
梅云
简小珠
刘建平
Mei Yun; Jian Xiaozhu; Liu Jianping (School of Psychology, Jiangxi Normal University, Nanchang, 330022; School of Education, Jinggangshan University, Ji'an, 343009; Jiangxi Key Laboratory of Psychology and Cognitive Science, Nanchang, 330022)
Source
《心理科学》
CSSCI
CSCD
PKU Core Journals
2019, No. 1, pp. 163-169 (7 pages)
Journal of Psychological Science
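Funding
Supported by the National Social Science Fund of China (14BSH071)
and the Humanities and Social Sciences Project of Jiangxi Provincial Universities (XL1515)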
Keywords
weighted Logistic model
guessing phenomena
random error phenomena
ability overestimated
ability underestimated