Abstract
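Based on the many-facet Rasch model, this study examines rater effects in the final round of a municipal youth foreign language proficiency competition, using the Facets software to analyze the scores that 7 raters assigned to 13 contestants. The results show that: 1) rater severity differed fairly significantly, and a few raters showed poor self-consistency; 2) the 7 raters as a group showed no significant central tendency or randomness, although individual raters may have shown central tendency and random effects when scoring; 3) the 7 raters showed a clear halo effect; 4) with respect to differential severity, raters showed bias when scoring certain individual contestants and contestants of different genders, but no bias across rating items. The paper offers a preliminary discussion of the causes of these rating biases and proposes corresponding recommendations.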
This study investigated rater effects in a Youth Foreign Language Competence Contest by applying a many-facet Rasch model. The ratings of 13 candidates' performances by 7 accredited raters were analyzed with Facets, and the results suggested that (1) there was a statistically significant difference in severity among raters, though most individual raters demonstrated good internal self-consistency; (2) as a group, the 7 raters showed no significant central tendency or randomness, but some central tendency and randomness were detected in individual raters; (3) the raters were found to have a significant halo effect; (4) there was significant rater bias in the interaction between raters and candidates and between raters and the gender of candidates, but no rater bias in the interaction between raters and the three traits of the rating scale. The reasons for rater bias were analyzed and some suggestions were put forward to improve this kind of assessment.
Source
《外语测试与教学》
2016, Issue 1, pp. 32-38 (7 pages)
Foreign Language Testing and Teaching