
A study of automated English essay evaluating framework based on semantic similarity and XGBoost algorithm

Cited by: 11
Abstract: Automated essay scoring and comment generation can greatly reduce the workload of expert raters and save labor costs. At present, however, the accuracy and fairness of the resulting scores and comments are still not high. In recent years, the rapid development of machine learning and natural language processing has improved performance on tasks such as text classification and machine translation to some extent, but many new research results have not yet been applied to automated essay evaluation. This study combines word vectors (word2vec), paragraph vectors (paragraph2vec), part-of-speech vectors (pos2vec), and LDA (latent Dirichlet allocation) features into a semantic representation vector for each essay; uses a semantic similarity model based on the kNN (k nearest neighbors) algorithm to obtain comment labels for an essay; uses a regression model based on XGBoost (extreme gradient boosting) to compute the score of an English essay; and validates the approach with case studies on a sample of 900 English essays written by college students. The results show that the proposed evaluation framework achieves higher accuracy than traditional methods in both automated scoring and comment generation for English essays.
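The comment-labeling step described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function names (`build_essay_vector`, `cosine_similarity`, `knn_comment_label`), the use of cosine similarity as the similarity measure, the majority vote, the value of `k`, and all feature values and comment labels are hypothetical. The abstract only states that the four feature groups are combined into one vector and that a kNN-based similarity model yields comment labels. The XGBoost scoring step is not shown.

```python
import math
from collections import Counter


def build_essay_vector(word_vec, para_vec, pos_vec, lda_vec):
    """Concatenate the four feature groups into one semantic vector.

    Plain concatenation is an assumption; the abstract only says the
    features are combined.
    """
    return list(word_vec) + list(para_vec) + list(pos_vec) + list(lda_vec)


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def knn_comment_label(query, labeled_essays, k=3):
    """Return the majority comment label among the k most similar essays."""
    ranked = sorted(
        labeled_essays,
        key=lambda item: cosine_similarity(query, item[0]),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy data: made-up feature values and comment labels, for illustration only.
labeled = [
    (build_essay_vector([0.9, 0.1], [0.8], [0.2], [0.7, 0.3]), "well organized"),
    (build_essay_vector([0.8, 0.2], [0.9], [0.1], [0.6, 0.4]), "well organized"),
    (build_essay_vector([0.1, 0.9], [0.2], [0.8], [0.3, 0.7]), "weak coherence"),
    (build_essay_vector([0.2, 0.8], [0.1], [0.9], [0.2, 0.8]), "weak coherence"),
]
query = build_essay_vector([0.85, 0.15], [0.85], [0.15], [0.65, 0.35])
label = knn_comment_label(query, labeled, k=3)
print(label)
```

In a real pipeline the toy vectors would be replaced by trained word2vec, paragraph-vector, part-of-speech, and LDA features, and the same concatenated vector could serve as the input to a gradient-boosted regression model for the score.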
Authors: LV Xin (吕欣); CHENG Yuxia (程雨夏) (School of Foreign Languages and Literatures, Hangzhou Dianzi University, Hangzhou 310018, China; School of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou 310018, China)
Source: Journal of Zhejiang University (Science Edition) (《浙江大学学报(理学版)》), indexed in CAS, CSCD, and the Peking University Core Journals list (北大核心); 2020, No. 3, pp. 329-336 (8 pages)
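Funding: National Social Science Fund of China (16BYY092); Zhejiang Provincial Philosophy and Social Sciences Planning Project (19NDJC043YB); Hangzhou Philosophy and Social Sciences Planning Project (M18JC040); Hangzhou Dianzi University 2017 Higher Education Research Funding Project (YB201763); Open Project of the Zhejiang Provincial HDU Smart City Research Center (GK150906299001/034).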
Keywords: English essay; automated essay scoring; semantic representation; similarity; XGBoost