Comparison of Equating Methods Based on a Mixed Model of 3PLM and GRM

Abstract: Test equating matters for examination fairness, item bank construction, and computerized adaptive testing. Using data from the senior high school entrance ("升中") examination in Foshan, Guangdong Province, this paper compares five IRT equating methods under a mixed model of the three-parameter logistic model (3PLM) and the graded response model (GRM): the MM, MS, RMS, SL, and HA methods. The equating results of the Tucker linear method served as the criterion, and the average error and the standardized weighted mean square difference served as the two evaluation indices. The results were as follows: (1) among the five IRT equating methods, the Haebara characteristic curve method (HA) performed best; (2) the main effect of item type was significant: equating was most accurate for objective items, least accurate for subjective items, and of intermediate accuracy for mixed-format tests containing both; (3) equating method interacted with item type: the methods did not differ significantly for objective items, but differed significantly for subjective and mixed-format items.
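As a rough illustration of what the two moment-based methods in this comparison compute, the sketch below derives the linear linking coefficients A and B (so that theta_base = A * theta_new + B) from anchor-item parameters. It assumes that MM and MS stand for the conventional mean/mean and mean/sigma methods, which the abstract does not spell out. The function names and the anchor-item values are hypothetical and are not taken from the paper.

```python
from statistics import mean, pstdev


def mean_sigma(b_new, b_base):
    """Mean/sigma: A from the ratio of anchor difficulty SDs, B from their means."""
    A = pstdev(b_base) / pstdev(b_new)
    B = mean(b_base) - A * mean(b_new)
    return A, B


def mean_mean(a_new, a_base, b_new, b_base):
    """Mean/mean: A from the ratio of mean discriminations, B from difficulty means."""
    A = mean(a_new) / mean(a_base)
    B = mean(b_base) - A * mean(b_new)
    return A, B


def rescale(a_new, b_new, A, B):
    """Put new-form item parameters on the base scale: a* = a / A, b* = A * b + B."""
    return [a / A for a in a_new], [A * b + B for b in b_new]


# Hypothetical anchor-item parameters (not from the paper), constructed so that
# the base scale is exactly theta_base = 1.25 * theta_new + 0.5.
a_new = [0.8, 1.0, 1.5, 1.2]
b_new = [-1.0, 0.0, 0.4, 1.0]
a_base = [a / 1.25 for a in a_new]
b_base = [1.25 * b + 0.5 for b in b_new]

A_ms, B_ms = mean_sigma(b_new, b_base)
A_mm, B_mm = mean_mean(a_new, a_base, b_new, b_base)

# By construction, both methods should recover A close to 1.25 and B close to 0.5.
print(round(A_ms, 3), round(B_ms, 3))
print(round(A_mm, 3), round(B_mm, 3))
```

With real calibration output the two methods generally disagree, because estimation error affects the a- and b-parameters differently. The characteristic curve methods (SL and HA) instead choose A and B by minimizing differences between test or item characteristic curves, which requires numerical optimization and is not shown here.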

Author: LI Guangming (黎光明), School of Psychology, Center for Studies of Psychological Application, South China Normal University, Guangzhou 510631

Affiliation: School of Psychology and Center for Studies of Psychological Application, South China Normal University

Source: Psychological Research (《心理研究》), 2018, Issue 4, pp. 351-357 (7 pages)

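Funding: National Natural Science Foundation of China, General Program (31470050); Guangdong Province Philosophy and Social Sciences "13th Five-Year Plan" General Project (GD17CXL01); Guangdong Province 2015 Higher Education Teaching Reform Project (粤教高函[2015]173号); Guangzhou Philosophy and Social Sciences "13th Five-Year Plan" General Project (2017GZYB111); South China Normal University 2014 University-Level Higher Education Teaching Research and Reform Project (教学[2014]52号)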

Keywords: high school entrance examination; item response theory (IRT); test equating; 3PLM; GRM; mixture model; Haebara characteristic curve method

Classification: B841 [Philosophy and Religion: Basic Psychology]
