
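Analytic scoring for a computer-based speaking test
(Chinese title: 大规模计算机口试分析评分效果研究, "A study of the effectiveness of analytic scoring in a large-scale computer-based speaking test")

Cited by: 4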
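Abstract (translated from the Chinese): Computer-based foreign language testing and scoring are increasingly common, yet most previous research has addressed paper-based scoring, and few studies have examined how subjective items are scored on computer. This paper examines the effectiveness of analytic scoring in a large-scale computer-based speaking test from two perspectives: scoring consistency and differences among score dimensions. Comparison with holistic scoring shows that raters behaved more consistently under analytic scoring. Under holistic scoring, raters did not adequately base their scores on the completeness of the content expressed and were prone to central tendency. Under analytic scoring, raters were less consistent on the content and language subscales. In terms of accuracy, raters scored low-proficiency candidates more accurately than high-proficiency candidates.

Abstract (English, as published): Although computer-based rating of speaking tests is becoming increasingly popular, research on the reliability and validity of such rating is far from sufficient. This paper investigated the application of analytic scoring rubrics for a computer-based speaking test. Scoring consistency and variability of candidates' profiles of analytic scores were discussed by comparing the results of analytic scoring and holistic scoring. Results showed that the analytic rating scale outperformed the holistic one. The raters using both scales failed to give expected ratings in evaluating the content of the students' performance, though they demonstrated acceptable reliability in rating the students' pronunciation and fluency. Information functions also revealed a certain degree of rating inaccuracy in assessing the performance of the high-level students. Implications of this study were discussed and suggestions were given in terms of rater training and choice of rating scales.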
Source: Modern Foreign Languages (《现代外语》), indexed in CSSCI and the Peking University Core Journals list, 2015, No. 2, pp. 248-257, 293 (10 pages in total)
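Funding: Supported by a major project of the Ministry of Education Key Research Institutes of Humanities and Social Sciences, "Testlet Response Theory and English Item Bank Research" (11JJD740012)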
Keywords: speaking test; analytic scoring; holistic scoring; rater behavior

