A rater effects study based on Many-facet Rasch Model: A case study on finalist of a junior English contest (cited by: 2)
Abstract This study applies a many-facet Rasch model (MFRM) to investigate rater effects in the final of a municipal youth foreign language competence contest. The scores that 7 accredited raters awarded to 13 candidates were analyzed with the Facets software. The results suggest that: (1) rater severity differed to a statistically significant degree, and although most raters were internally consistent, a few showed poor self-consistency; (2) as a group, the 7 raters showed no significant central tendency or randomness, but some central tendency and randomness may have been present in individual raters; (3) the raters showed a marked halo effect; (4) with respect to differential severity, rater bias appeared in the interactions between raters and individual candidates and between raters and candidate gender, but not between raters and the three traits of the rating scale. The paper offers a preliminary discussion of the causes of these biases and puts forward suggestions for improving this kind of assessment.
Authors 程俊瑜 (Cheng Junyu), 袁洁 (Yuan Jie)
Source Foreign Language Testing and Teaching (《外语测试与教学》), 2016, No. 1, pp. 32-38 (7 pages)
Keywords many-facet Rasch model (MFRM); rater effects; severity; rater bias