Abstract
This study compares human scores and the scores assigned by a domestically developed automated essay scoring (AES) system for 300 CET-4 and CET-6 essays sampled from the Chinese Learner English Corpus (CLEC). Combining manual and software-based analysis, it examines how four categories of quantitative-linguistic features, covering lexis, syntax, discourse, and errors (i.e., lexical and syntactic complexity, coherence, and accuracy), affect human and machine scores respectively, so as to uncover the causes of human-machine score differences. The results show that the overall scoring validity of the AES system still needs improvement: the system relies heavily on quantitative-linguistic features, which have only limited influence on human scoring, and this difference in scoring criteria accounts for the divergence between the two sets of results.
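The abstract does not specify which measures the AES system actually computes. As a rough illustration of what lexical and syntactic quantitative-linguistic features can look like in practice, the minimal Python sketch below computes two common stand-ins, type-token ratio and mean sentence length; the function names and measures are hypothetical examples, not drawn from the system under study.

import re

def type_token_ratio(text: str) -> float:
    """Lexical-richness proxy: unique words divided by total words."""
    tokens = re.findall(r"[A-Za-z']+", text.lower())
    return len(set(tokens)) / len(tokens) if tokens else 0.0

def mean_sentence_length(text: str) -> float:
    """Crude syntactic-complexity proxy: average words per sentence."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not sentences:
        return 0.0
    word_counts = [len(re.findall(r"[A-Za-z']+", s)) for s in sentences]
    return sum(word_counts) / len(word_counts)

essay = "The essay text goes here. It has several short sentences."
print(type_token_ratio(essay), mean_sentence_length(essay))

Real AES systems typically combine many such surface measures; the study's point is that heavy reliance on features of this kind diverges from how human raters score.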
Authors
白丽芳 (BAI Lifang)
王建 (WANG Jian)
Source
《外语测试与教学》 (Foreign Language Testing and Teaching)
2018, No. 3, pp. 44-54 (11 pages)
Funding
A staged result of the National Social Science Fund of China project "A Validity Study of Automated English Essay Scoring Systems" (14BYY085)
Keywords
automated essay scoring system
CET-4 and CET-6 essays
differences between human and machine scoring
quantitative-linguistic features