Robust Ability Estimation of Examinees under the Four-Parameter Logistic Weighted Model (cited by: 2)

The Ability Overestimation and Ability Underestimation of the Examinee under the Weighted-Score Logistic Model


Abstract: For a test situation in which the item parameters and examinee scores are known, ability was estimated under the two-, three-, and four-parameter Logistic weighted models. The ability steps between examinee score levels were found to be evenly spaced, and examinee scores reflect the score-weighting effect of polytomous scoring well. Under the two-parameter Logistic weighted model, examinee ability estimates are disturbed: guessing leads to ability overestimation, and slipping leads to ability underestimation. Under the three-parameter Logistic weighted model with the c parameter, overestimation is absent or not obvious; under the three-parameter Logistic weighted model with the γ parameter, underestimation is absent or not obvious; under the four-parameter Logistic weighted model, neither overestimation nor underestimation appears or is obvious. The four-parameter Logistic weighted model is therefore a good method for robust estimation of examinee ability.

The weighted-score logistic model (WSLM) was proposed by Jian, Dai, & Dai (2016). Based on the item emphases of polytomously scored items, the WSLM adds weighted-score parameters to the dichotomous logistic model. Because the dichotomous logistic model has at least five forms, the weighted-score logistic model likewise has five forms: the one-parameter weighted-score logistic model, the two-parameter weighted-score logistic model, the three-parameter weighted-score logistic model with the c parameter, the three-parameter weighted-score logistic model with the γ parameter, and the four-parameter weighted-score logistic model.

Educational tests are subject to response disturbances such as random guessing, carelessness, and transcription errors. Previous studies have shown that, in paper-and-pencil tests and computerized adaptive testing, aberrant responses such as careless errors and lucky guesses can cause significant bias in ability estimates. Mislevy & Bock (1982) proposed the Biweight estimator and compared it with the maximum likelihood estimator. Their results showed that the Biweight estimator typically reduces bias and thereby dampens measurement disturbances. The three-parameter logistic IRT model, the four-parameter logistic IRT model, Huber robust estimation, and other methods have since been proposed to address response disturbances such as random guessing and carelessness.

This paper first compares the robust ability estimates of four models in an example test.
The four models compared are the two-parameter WSLM, the three-parameter WSLM with the c parameter, the three-parameter WSLM with the γ parameter, and the four-parameter WSLM. Second, three simulation studies, one for each of three test cases, are presented to compare four approaches: 2PM-MLE, Biweight estimation, Huber estimation, and 4PM-Robust estimation. The hypothetical test instrument contains 34 items with difficulty thresholds b ~ N(0,1) and log(a) ~ N(0,1). The difficulty threshold of the 35th item ranges from -4 to 4. The ability of a middle-ability examinee is estimated from the responses to the 34 items of the basic test under the two-parameter logistic model, and this estimate serves as the reference value for the other three models.

The results are as follows. Under the two-parameter WSLM, examinee ability is overestimated when guessing occurs on difficult items, and underestimated when sleeping (missing easy items) occurs. Under the three-parameter WSLM with the c parameter, the overestimation is corrected, that is, the response disturbance disappears; however, underestimation still occurs when examinees miss easy items. Under the three-parameter WSLM with the γ parameter, the underestimation caused by missing easy items is corrected well, that is, the response disturbance disappears; but overestimation still occurs when examinees answer difficult items correctly. Under the four-parameter WSLM with both the c and γ parameters, the underestimation caused by missing easy items is corrected well, and so is the overestimation that arises when low-ability examinees answer difficult items correctly by luck; that is, the response disturbances disappear. Therefore, robust ability estimates can be obtained under the four-parameter WSLM when response disturbances such as random guessing and careless errors occur in a test.
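The guessing effect described in the abstract can be illustrated with a small sketch. It is not the authors' procedure: it uses the ordinary dichotomous four-parameter logistic function P(θ) = c + (γ − c) / (1 + exp(−a(θ − b))) rather than the weighted-score form, replaces the random item draws with 34 equally spaced difficulties and a = 1, assumes illustrative values c = 0.2 and γ = 0.98, and finds the maximum likelihood estimate by a plain grid search. All function and variable names are hypothetical.

```python
import math

def p_correct(theta, a, b, c=0.0, gamma=1.0):
    """Dichotomous 4PL response probability; c=0 and gamma=1 give the 2PL."""
    return c + (gamma - c) / (1.0 + math.exp(-a * (theta - b)))

def log_likelihood(theta, items, responses, c, gamma):
    """Log-likelihood of a 0/1 response vector at ability theta."""
    total = 0.0
    for (a, b), u in zip(items, responses):
        p = p_correct(theta, a, b, c, gamma)
        total += math.log(p) if u == 1 else math.log(1.0 - p)
    return total

def mle_theta(items, responses, c=0.0, gamma=1.0):
    """Grid-search maximum likelihood estimate of theta on [-4, 4]."""
    grid = [-4.0 + 0.001 * k for k in range(8001)]
    return max(grid, key=lambda t: log_likelihood(t, items, responses, c, gamma))

# Assumed base test: 34 items, a = 1, difficulties equally spaced on [-2, 2].
base_items = [(1.0, -2.0 + 4.0 * i / 33) for i in range(34)]
# A middle-ability examinee: correct on items easier than 0, wrong otherwise.
base_resp = [1 if b < 0 else 0 for _, b in base_items]

# 35th item: very difficult (b = 3), answered correctly by a lucky guess.
items_35 = base_items + [(1.0, 3.0)]
resp_35 = base_resp + [1]

C, GAMMA = 0.2, 0.98  # assumed lower and upper asymptotes, for illustration only

theta_2pl_base = mle_theta(base_items, base_resp)
theta_2pl_guess = mle_theta(items_35, resp_35)
theta_4pl_base = mle_theta(base_items, base_resp, C, GAMMA)
theta_4pl_guess = mle_theta(items_35, resp_35, C, GAMMA)

shift_2pl = theta_2pl_guess - theta_2pl_base
shift_4pl = theta_4pl_guess - theta_4pl_base
print(f"2PL shift after lucky guess: {shift_2pl:.3f}")
print(f"4PL shift after lucky guess: {shift_4pl:.3f}")
```

Under these assumptions the lucky guess pulls the two-parameter estimate upward by more than it pulls the four-parameter estimate, because a lower asymptote c > 0 makes a correct answer on a very hard item less surprising. The symmetric case (a miss on a very easy item) depends on how far γ lies below 1 and is not shown here.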

Authors: Mei Yun; Jian Xiaozhu; Liu Jianping (School of Psychology, Jiangxi Normal University, Nanchang, 330022; School of Education, Jinggangshan University, Ji'an, 343009; Jiangxi Key Laboratory of Psychology and Cognitive Science, Nanchang, 330022)

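Affiliations: School of Psychology, Jiangxi Normal University; Jiangxi Key Laboratory of Psychology and Cognitive Science; School of Education, Jinggangshan University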

Source: Journal of Psychological Science (《心理科学》), indexed in CSSCI, CSCD, and the Peking University Core list; 2019, Issue 1, pp. 163-169 (7 pages)

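Funding: Supported by the National Social Science Foundation of China (14BSH071) and the Jiangxi Provincial Humanities and Social Sciences Project for Universities (XL1515)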

Keywords: weighted Logistic model; guessing phenomena; random error phenomena; ability overestimation; ability underestimation

Classification: B841 [Philosophy and Religion: Basic Psychology]


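References (4)

1. Jian Xiaozhu, Dai Buyun, Dai Haiqi. Theoretical construction and simulation analysis of the weighted Logistic model [J]. Acta Psychologica Sinica, 2016, 48(12): 1625-1630. (cited by: 2)
2. Jian Xiaozhu, Dai Haiqi. Correction of guessing and slipping phenomena by the 4-parameter GRM [J]. Journal of Jiangxi Normal University (Natural Science Edition), 2016, 40(2): 140-144. (cited by: 4)
3. Jian Xiaozhu, Dai Haiqi, Peng Chunmei. Improvement of ability estimation by the c and γ parameters of the Logistic model in IRT [J]. Acta Psychologica Sinica, 2007, 39(4): 737-746. (cited by: 6)
4. Zhang Huahua, Cheng Ying. Development and prospects of computerized adaptive testing (CAT) [J]. Examinations Research, 2005, 1(1): 12-24. (cited by: 16)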


