A Study of Rater Training in Writing Performance Assessment (写作能力行为测试评分员培训研究述评)

Cited by: 1
Abstract: In large-scale language testing, rater training is generally regarded as an essential step in the human scoring of essays. Its purpose is to improve rater consistency and thereby ensure test fairness. However, the language testing community holds differing views on the purpose of rater training and debates its value. This article first examines these views and then reviews studies of rater training procedures, the effects and duration of rater training, and raters' cognitive development over the course of training. Although rater training may not be as effective as trainers expect, it remains useful for improving rating validity when carefully designed and administered.
Source: Foreign Language and Literature Research (《外国语文研究》), 2016, No. 1, pp. 99-105 (7 pages)
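Funding: Supported by the Ministry of Education Humanities and Social Sciences Research Youth Fund project "A Corpus-Based Contrastive Study of the Written English of Tibetan and Han Middle School Students in Gansu" (Grant No. 15YJC740004) and the Fundamental Research Funds for the Central Universities, Lanzhou University (Grant No. 14LZUGBWZY001).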
Keywords: language assessment; rater; rater training
References (29)

  • 1. Bachman, L. F., & Palmer, A. S. Language testing in practice [M]. Oxford: Oxford University Press, 1996.
  • 2. Barritt, L., Stock, P., & Clarke, F. Researching practice: evaluating assessment essays [J]. College Composition and Communication, 1986, (37): 315-327.
  • 3. Carrell, P. L. The effect of writers' personalities and raters' personalities on the holistic evaluation of writing [J]. Assessing Writing, 1995, 2(2): 153-190.
  • 4. Chalhoub-Deville, M., & Wigglesworth, G. Rater judgment and English language speaking proficiency [J]. World Englishes, 2005, 24(3): 383-391.
  • 5. Charney, D. The validity of using holistic scoring to evaluate writing: a critical overview [J]. Research in the Teaching of English, 1984, (18): 65-81.
  • 6. Congdon, P. J., & McQueen, J. The stability of rater severity in large-scale assessment programs [J]. Journal of Educational Measurement, 2000, 37(2): 163-178.
  • 7. Elder, C., Barkhuizen, G., Knoch, U., & Randow, J. Evaluating rater responses to an online training program for L2 writing assessment [J]. Language Testing, 2007, 24(1): 37-64.
  • 8. Gere, A. R. Written composition: toward a theory of evaluation [J]. College English, 1980, (42): 44-58.
  • 9. Hamilton, J., Reddel, S., & Spratt, M. Teachers' perceptions of on-line rater training and monitoring [J]. System, 2001, (29): 505-520.
  • 10. Huot, B. Reliability, validity, and holistic scoring: what we know and what we need to know [J]. College Composition and Communication, 1990, (41): 201-213.
