Abstract
This study explicates the working mechanism of an AI-based automatic scoring engine for the Chinese-English translation task of the SJTU-EPT (Shanghai Jiao Tong University English Proficiency Test) and evaluates the validity of automatic scoring using large-scale test data. The results show that the correlation coefficient between automatic scores and human scores reached 0.76, and that there was no significant difference between the mean scores generated by the two scoring methods; human scoring, however, slightly outperformed automatic scoring in grading translations of high and low quality. The study also discusses the practicality of employing automatic scoring in large-scale tests and the current problems associated with the performance of AI-based automatic scoring.
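As a minimal illustrative sketch (not the authors' actual procedure), the agreement statistics reported in the abstract, i.e. the Pearson correlation between automatic and human scores and the test of the difference between their mean scores, could be computed from paired score data roughly as follows; the score arrays below are simulated purely for demonstration and do not reproduce the study's data.

```python
# Sketch: agreement statistics between automatic and human scores.
# All data here are simulated; only the statistical procedure is illustrated.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical human scores on a translation item (0-15 scale) and
# simulated automatic scores that partially track them.
human = rng.normal(loc=10.0, scale=2.0, size=500).clip(0, 15)
auto = (0.8 * human + rng.normal(loc=2.0, scale=1.5, size=500)).clip(0, 15)

# Pearson correlation between automatic and human scores.
r, r_p = stats.pearsonr(auto, human)
print(f"Pearson r = {r:.2f} (p = {r_p:.3g})")

# Paired t-test for the difference between the two mean scores.
t, t_p = stats.ttest_rel(auto, human)
print(f"mean(auto) = {auto.mean():.2f}, mean(human) = {human.mean():.2f}")
print(f"paired t = {t:.2f}, p = {t_p:.3g}")
```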
Authors
ZHANG Lidong (张利东)
ZHU Yiqing (朱一清)
Source
《外语界》(Foreign Language World)
CSSCI; Peking University Core Journals (北大核心)
2022, No. 2, pp. 41-48, 55 (9 pages in total)
Funding
Supported by the Shanghai Pujiang Program (Grant No. 15PJC072).
Keywords
deep learning
automatic scoring
validation
Chinese-English translation