An Analysis of the Inter-Rater Reliability of an E-Portfolio Assessment (基于电子档案袋测评的评分者间信度分析报告)

An examination of inter-rater reliability of an e-portfolio assessment
Abstract: This study evaluates the inter-rater reliability of an e-portfolio assessment experiment and analyzes possible causes of scoring differences. A sample of 85 student portfolios (28.3%) was drawn from the sampling frame by student ID number. Eight raters, arranged in 5 pairs (each pair consisting of one course teacher and one external rater), independently scored the portfolios against a rubric on two tasks (indicators) designed to reflect students' reflective and self-assessment skills. Agreement, correlation, and difference between the two raters in each pair were estimated using Cohen's kappa, the PABAK (prevalence-adjusted bias-adjusted kappa) index, Spearman's rank correlation coefficient, the paired-samples t test, and effect size. The results show that: 1) of the 10 sets of paired ratings, 7 reached moderate or higher agreement on kappa, and 9 did so on the adjusted kappa (PABAK); 2) all 10 sets showed moderate to very strong correlation; and 3) task type and task presentation, rater role, and raters' understanding of the rubric contributed to some extent to the scoring discrepancies. The article discusses the implications of these findings for the design of future portfolio assessments.
Author: Lin Lilan (林莉兰)
Source: Journal of Xi'an International Studies University (《西安外国语大学学报》), CSSCI, PKU Core Journal, 2021, No. 4, pp. 67-72 (6 pages)
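Funding: An interim outcome of the National Social Science Fund of China project "Design and Application of an Online Assessment System for College Students' Autonomous Foreign Language Learning Based on the Concept of Developmental Assessment" (Project No. BCA140053).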
Keywords: e-portfolio assessment; inter-rater reliability; agreement; correlation; difference
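
The two agreement statistics named in the abstract, Cohen's kappa and PABAK, can be illustrated with a minimal Python sketch. It is not the author's analysis code. The scores below are hypothetical, the 3-level scale is an assumption (the abstract does not state how many rubric levels were used), and the function names are invented for illustration. PABAK is computed with the usual k-category form, (k * Po - 1) / (k - 1), which reduces to 2 * Po - 1 for two categories.

```python
from collections import Counter


def observed_agreement(rater_a, rater_b):
    """Proportion of items on which the two raters gave the same score (Po)."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of items")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa: (Po - Pe) / (1 - Pe), with Pe from each rater's marginals."""
    n = len(rater_a)
    po = observed_agreement(rater_a, rater_b)
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    categories = set(counts_a) | set(counts_b)
    # Chance agreement: probability both raters pick the same category at random.
    pe = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)
    if pe == 1.0:
        # Degenerate case: both raters used one identical category throughout.
        return 1.0
    return (po - pe) / (1 - pe)


def pabak(rater_a, rater_b, num_categories):
    """Prevalence-adjusted bias-adjusted kappa for a scale with k categories."""
    if num_categories < 2:
        raise ValueError("PABAK needs at least two categories")
    po = observed_agreement(rater_a, rater_b)
    return (num_categories * po - 1) / (num_categories - 1)


# Hypothetical scores for 10 portfolios on an assumed 3-level rubric.
teacher_scores = [1, 2, 2, 3, 3, 3, 2, 1, 3, 2]
external_scores = [1, 2, 3, 3, 3, 2, 2, 1, 3, 3]

kappa_value = cohen_kappa(teacher_scores, external_scores)
pabak_value = pabak(teacher_scores, external_scores, num_categories=3)
print(f"kappa = {kappa_value:.3f}, PABAK = {pabak_value:.3f}")
```

Because PABAK replaces the marginal-based chance term with a uniform one, it can differ from kappa when the score distribution is skewed, which is one plausible reason a rater pair could count as at least moderate on PABAK but not on kappa.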