Abstract
Rater effects and the rating scale are central to the study of scoring error in subjective items. Taking Item 38 (an essay question) of the Politics test in the 2006 University Entrance Examinations (UEE), Shanghai paper, as an example, the author applied the Raters Effect model in ACER ConQuest to identify the main factors affecting scoring quality on this item. The results show essentially no halo effect, central tendency, or restriction of range, and the raters distinguished fairly well among candidates' different performance characteristics. A few raters' scoring consistency still needs improvement, and the differences in leniency among raters were statistically significant. The author therefore proposes a method for adjusting scores according to the raters' estimated leniency.
Source
《考试研究》
2007, No. 3, pp. 37-48 (12 pages)
Examinations Research
Keywords
subjective item
scoring
leniency
consistency (reliability)