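Validity of deep-learning-based automatic scoring of constructed-response items: the case of the Chinese-English translation task in a university-based English proficiency test (cited by: 6)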

Validation of a deep learning based automatic scoring engine for constructed responses: A case study of the Chinese-English translation in SJTU English Proficiency Test
Abstract: Taking the Chinese-English translation task of the Shanghai Jiao Tong University English Proficiency Test (SJTU-EPT) as an example, this study explains the principles and working mechanism of an AI-based automatic scoring engine and evaluates the validity of its scores using large-scale test data. The correlation coefficient between automatic scores and human scores reached 0.76, and the mean scores produced by the two scoring methods did not differ significantly. Human scoring, however, slightly outperformed automatic scoring on translations in the high and low score bands. The study also discusses the feasibility of using AI-based automatic scoring in large-scale tests and several problems that currently affect its performance.
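Authors: ZHANG Lidong (张利东); ZHU Yiqing (朱一清)
Affiliation: Shanghai Jiao Tong University (上海交通大学)
Source: Foreign Language World (《外语界》), CSSCI, Peking University Core Journal, 2022, No. 2, pp. 41-48, 55 (9 pages in total)
Funding: Supported by the Shanghai Pujiang Talent Program (Grant No. 15PJC072).
Keywords: deep learning; automatic scoring; validation; Chinese-English translation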
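
The abstract reports two agreement checks between automatic and human scores: a correlation coefficient and a comparison of mean scores. The abstract does not name the exact statistics used, so the following is only a minimal sketch that assumes a Pearson correlation and a paired t-test. The function names and the toy score lists are hypothetical illustrations, not data or code from the study.

```python
import math
from statistics import mean, stdev


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def paired_t(x, y):
    """Paired-samples t statistic for the mean difference between x and y."""
    diffs = [a - b for a, b in zip(x, y)]
    return mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))


# Hypothetical toy scores for eight translations (NOT data from the study).
human = [12.0, 9.5, 14.0, 7.0, 11.0, 13.5, 8.0, 10.0]
machine = [11.5, 10.0, 13.0, 8.5, 10.5, 12.5, 9.0, 10.0]

r = pearson_r(human, machine)   # strength of linear agreement
t = paired_t(human, machine)    # compare |t| with the critical value for n - 1 df

print(f"Pearson r = {r:.2f}, paired t = {t:.2f}")
```

A high correlation together with a non-significant mean difference still leaves room for disagreement at the extremes of the score range, which is why the abstract examines the high and low score bands separately.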