Abstract
Automatic Essay Scoring (AES) is an important research direction for applying Natural Language Processing (NLP) in education; it aims to improve scoring efficiency and to make evaluation more objective and reliable. To address missing topic relevance and the loss of information in long texts, and to exploit the fact that different layers of the pre-trained language model Bidirectional Encoder Representations from Transformers (BERT) extract features of different dimensions, this study proposes an AES model based on topic perception and semantic enhancement. The model uses a multi-head attention mechanism to extract shallow semantic features of an essay and to perceive its topic features, and it uses the mid-layer syntactic features and deep semantic features of BERT to strengthen its understanding of essay semantics. The features from these different dimensions are then fused and used for scoring. Experimental results show that the model has clear performance advantages on all eight subsets of the public ASAP dataset and, compared with baseline models such as Qwen-7B (通义千问), improves AES performance, reaching an average Quadratic Weighted Kappa (QWK) of 80.25%.
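The abstract reports results as Quadratic Weighted Kappa (QWK), the agreement measure between model scores and human scores. As a minimal sketch of the standard QWK definition (not the authors' evaluation code), the function below computes it in plain Python. The function name, its parameters, and the choice to return 1.0 when the expected disagreement is zero are assumptions made for this illustration.

```python
from typing import Sequence


def quadratic_weighted_kappa(
    human: Sequence[int],
    model: Sequence[int],
    min_rating: int,
    max_rating: int,
) -> float:
    """Standard QWK between two integer score lists on [min_rating, max_rating]."""
    if len(human) != len(model) or len(human) == 0:
        raise ValueError("score lists must be non-empty and of equal length")
    n = max_rating - min_rating + 1
    if n < 2:
        raise ValueError("the rating scale needs at least two levels")

    # Observed confusion matrix: rows are human scores, columns are model scores.
    observed = [[0] * n for _ in range(n)]
    for h, m in zip(human, model):
        observed[h - min_rating][m - min_rating] += 1

    total = len(human)
    hist_human = [sum(row) for row in observed]
    hist_model = [sum(observed[i][j] for i in range(n)) for j in range(n)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(n):
        for j in range(n):
            # Quadratic penalty: grows with the squared distance between scores.
            weight = (i - j) ** 2 / (n - 1) ** 2
            # Expected count if the two raters were independent.
            expected = hist_human[i] * hist_model[j] / total
            weighted_observed += weight * observed[i][j]
            weighted_expected += weight * expected

    if weighted_expected == 0.0:
        # Assumed convention: both raters used one identical score for every
        # essay, so there is no disagreement to measure.
        return 1.0
    return 1.0 - weighted_observed / weighted_expected


if __name__ == "__main__":
    human_scores = [1, 2, 3, 4, 4, 2]
    model_scores = [1, 2, 3, 4, 3, 2]
    print(quadratic_weighted_kappa(human_scores, model_scores, 1, 4))
```

A value of 1 means perfect agreement, 0 means chance-level agreement, and negative values mean systematic disagreement. The 80.25% in the abstract corresponds to a QWK of 0.8025 averaged over the eight ASAP subsets.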
Authors
陈宇航
杨勇
先木斯亚·买买提明
帕力旦·吐尔逊
樊小超
任鸽
刁宇峰
CHEN Yuhang; YANG Yong; Xianmusiya·Maimaitiming; Palidan·Tuerxun; FAN Xiaochao; REN Ge; DIAO Yufeng (School of Computer Science and Technology, Xinjiang Normal University, Urumqi 830054, Xinjiang, China; School of Mathematics and Informatics, Hetian Normal College, Hetian 848000, Xinjiang, China; School of Computer Science and Technology, Inner Mongolia University for Nationalities, Tongliao 028000, Inner Mongolia, China)
Source
《计算机工程》
CAS
CSCD
PKU Core Journal (北大核心)
2024, Issue 8, pp. 363-371 (9 pages)
Computer Engineering
Funding
Natural Science Foundation of Xinjiang Uygur Autonomous Region (2021D01B72)
National Natural Science Foundation of China (62066044, 62167008, 62006130).
Keywords
Automatic Essay Scoring (AES)
semantic enhancement
topic perception
feature fusion
pre-trained language model