A Study on the Evaluation, Generalization and Extrapolation Validity of an Automated Writing Evaluation System (cited by: 2)
Abstract Researchers have carried out numerous validity studies on widely used automated writing evaluation (AWE) systems such as PEG, IEA, e-rater and IntelliMetric, and these studies have contributed to the development of AWE systems. However, validity research on Pigai (批改网), an AWE system developed independently in China, remains very limited. This study examines the validity of Pigai in three respects: evaluation, generalization and extrapolation. The results show that, across the five rating levels, the exact agreement between human and machine scores within the same score band, and the exact-plus-adjacent agreement, are broadly similar to those of comparable AWE systems abroad. Human and machine scores are significantly correlated, which indicates a certain degree of evaluation validity, although the correlation is slightly lower than that reported for other AWE systems abroad. Pigai scores for essays on different writing tasks are significantly correlated, showing a certain degree of generalization, but this correlation is slightly lower than both the correlation between human scores and the human-machine correlations of other AWE systems abroad. Pigai essay scores are significantly correlated with students' listening, reading and e-portfolio scores, showing a certain degree of extrapolation, and these correlations are higher than those of most other AWE systems abroad. The study also finds that Pigai scores differ significantly across writing tasks and that system scores are not significantly correlated with speaking scores; the author offers explanations for both findings. By validating the Pigai system fairly comprehensively, the study should contribute to the development, use and improvement of the system.
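Author Zhang Li (张荔)
Source Foreign Languages and Translation (《外语与翻译》), 2017, No. 3, pp. 64-71 (8 pages)
Funding Part of the results of the National Social Science Fund of China project "Reliability and Validity of a Corpus- and Cloud-Based Online Automated Essay Evaluation System and Its Use in Supporting Teaching" (Project No. 13BYY081)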
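
The abstract relies on three statistics for comparing human and machine scores: exact agreement within the same score band, exact-plus-adjacent agreement, and correlation. The record does not say how they were computed, so the following is only a minimal sketch under stated assumptions: score bands are coded as integers 1 to 5, the correlation is taken to be Pearson's r, and the function names and sample data are hypothetical rather than drawn from the study.

```python
from math import sqrt


def exact_agreement(human, machine):
    """Share of essays that both raters place in the same score band."""
    return sum(h == m for h, m in zip(human, machine)) / len(human)


def exact_plus_adjacent_agreement(human, machine):
    """Share of essays whose two bands are identical or differ by one band."""
    return sum(abs(h - m) <= 1 for h, m in zip(human, machine)) / len(human)


def pearson_r(x, y):
    """Pearson correlation coefficient (assumed here; the record does not name the coefficient)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Hypothetical band assignments (1-5) for eight essays; NOT data from the study.
human_bands = [3, 4, 2, 5, 3, 1, 4, 3]
machine_bands = [3, 3, 2, 4, 3, 2, 4, 5]

print("exact agreement:", exact_agreement(human_bands, machine_bands))
print("exact + adjacent:", exact_plus_adjacent_agreement(human_bands, machine_bands))
print("Pearson r:", round(pearson_r(human_bands, machine_bands), 3))
```

Exact-plus-adjacent agreement is always at least as high as exact agreement, which is why validity studies of this kind usually report the two side by side with a correlation coefficient.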