Abstract
Automated essay scoring is a natural language processing technique based on machine learning. End-to-end models based on deep learning are now widely used in this field. However, end-to-end models have difficulty capturing the correlations between different features, so this paper proposes an automated essay scoring method based on semantic feature fusion (TSEF). The method has two stages: feature extraction and feature fusion. In the feature extraction stage, the BERT model provides pre-trained representations of the input text, and a multi-head attention mechanism is self-trained on the input text to compensate for the shortcomings of pre-training. In the feature fusion stage, a cross-fusion method combines the different features to obtain a better-performing model. In the experiments, TSEF is compared with many strong baselines, and the results demonstrate the effectiveness and robustness of the method.
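The two-stage idea in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation: the abstract does not define "cross fusion", so the `cross_fuse` function below assumes one possible reading (each feature stream attends to the other, and the pooled results are concatenated). The toy vectors stand in for BERT outputs and trainable token embeddings, and the attention uses identity projections, with the learned weight matrices omitted for brevity. All names are hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(queries, keys, values):
    """Scaled dot-product attention over lists of vectors."""
    scale = math.sqrt(len(keys[0]))
    out = []
    for q in queries:
        weights = softmax(
            [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in keys]
        )
        out.append(
            [sum(w * v[j] for w, v in zip(weights, values))
             for j in range(len(values[0]))]
        )
    return out

def multi_head_self_attention(x, num_heads):
    """Multi-head self-attention with identity projections (a simplification:
    a real layer would learn query, key, value, and output matrices)."""
    dim = len(x[0])
    assert dim % num_heads == 0, "feature size must divide evenly into heads"
    head_dim = dim // num_heads
    heads = []
    for h in range(num_heads):
        part = [row[h * head_dim:(h + 1) * head_dim] for row in x]
        heads.append(attend(part, part, part))
    # Concatenate the per-head outputs for each token.
    return [sum((heads[h][t] for h in range(num_heads)), [])
            for t in range(len(x))]

def mean_pool(x):
    """Average token vectors into one fixed-size vector."""
    return [sum(row[j] for row in x) / len(x) for j in range(len(x[0]))]

def cross_fuse(a, b):
    """ASSUMED form of cross fusion: each stream attends to the other,
    then both results are mean-pooled and concatenated."""
    a_to_b = attend(a, b, b)
    b_to_a = attend(b, a, a)
    return mean_pool(a_to_b) + mean_pool(b_to_a)

# Toy stand-in for BERT token representations (3 tokens, 4 dimensions).
pretrained = [[0.1, 0.3, -0.2, 0.5],
              [0.4, -0.1, 0.2, 0.0],
              [-0.3, 0.2, 0.1, 0.6]]
# Toy stand-in for trainable token embeddings of the same essay.
embeddings = [[0.2, 0.0, 0.1, -0.4],
              [0.0, 0.5, -0.3, 0.2],
              [0.3, -0.2, 0.4, 0.1]]

self_trained = multi_head_self_attention(embeddings, num_heads=2)
fused = cross_fuse(pretrained, self_trained)
print(len(fused))  # fused vector that a scoring head would consume
```

In a full model, the fused vector would feed a regression or classification head that predicts the essay score; that head and the training loop are omitted here.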
Authors
袁航
杨勇
任鸽
帕力旦·吐尔逊
YUAN Hang; YANG Yong; REN Ge; Palidan Turson (School of Computer Science and Technology, Xinjiang Normal University, Urumqi 830054, China)
Source
《计算机与现代化》
2024, No. 6, pp. 8-13, 24 (7 pages)
Computer and Modernization
Funding
Natural Science Foundation of Xinjiang Uygur Autonomous Region (2021D01B72)
National Natural Science Foundation of China (62167008, 62066044)
Keywords
automatic essay scoring
self-training
pre-training
cross fusion