
Chinese grammatical error correction model based on Transformer fused with part-of-speech feature (cited by: 2)
Abstract: To address the problem that the same Chinese word represents different relationships in a sentence depending on its part of speech, a Chinese Grammatical Error Correction (CGEC) model based on Transformer fused with part-of-speech features is proposed. The model incorporates linguistic knowledge as auxiliary information into the CGEC task. First, without changing the length of the sentence sequence, part-of-speech vectors are concatenated in different ways in the original word embedding layer, giving three word embedding schemes: full-difference word embedding, word-difference word embedding, and part-of-speech-difference word embedding. Then, the new word embeddings are combined with the Transformer model to correct grammatical errors in erroneous sentences. Experimental results show that all three schemes improve the F0.5 value to varying degrees, and that full-difference word embedding performs best: compared with the Transformer model, F0.5 increases by 2.73 percentage points and BLEU (Bilingual Evaluation Understudy) increases by 6.27 percentage points; compared with the CGEC model based on the Transformer enhanced architecture, F0.5 increases by 1.88 percentage points. When extracting part-of-speech features, the proposed model can focus on the grammatical differences between the source sentence and the target sentence, and so better capture the grammatical features of sentences.
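The abstract says that part-of-speech vectors are joined to the word embeddings without changing the sequence length. A minimal sketch of that general idea follows, assuming plain per-token concatenation along the feature dimension. The function names, toy sentence, POS tags, and vector sizes are all hypothetical, and the sketch does not attempt to reproduce the paper's three specific variants, which the abstract does not define.

```python
import random


def make_table(symbols, dim, seed):
    """Build a toy lookup table: one random vector of length `dim` per symbol."""
    rng = random.Random(seed)
    return {sym: [rng.uniform(-1.0, 1.0) for _ in range(dim)]
            for sym in sorted(symbols)}


def embed_with_pos(tokens, pos_tags, word_table, pos_table):
    """Concatenate each token's word vector with its POS vector.

    The result has one vector per token, so the sequence length is unchanged;
    only the per-token feature dimension grows.
    """
    if len(tokens) != len(pos_tags):
        raise ValueError("tokens and pos_tags must be aligned one-to-one")
    return [word_table[tok] + pos_table[tag]
            for tok, tag in zip(tokens, pos_tags)]


# Hypothetical toy input: a segmented sentence with illustrative POS tags.
tokens = ["我", "喜欢", "学习"]
pos_tags = ["r", "v", "v"]

WORD_DIM, POS_DIM = 6, 2  # illustrative sizes, not the paper's settings
word_table = make_table(set(tokens), WORD_DIM, seed=0)
pos_table = make_table(set(pos_tags), POS_DIM, seed=1)

embedded = embed_with_pos(tokens, pos_tags, word_table, pos_table)
print(len(embedded), len(embedded[0]))  # sequence length, feature dimension
```

In this sketch the two tokens tagged `v` share the same trailing POS components while keeping distinct word components, which is how a shared grammatical signal could reach the encoder alongside the lexical one.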
Authors: SHANG Haiyi (尚海怡); HUANG Jifeng (黄继风); CHEN Haiguang (陈海光) (College of Information, Mechanical and Electrical Engineering, Shanghai Normal University, Shanghai 201418, China)
Source: Journal of Computer Applications (《计算机应用》), indexed in CSCD and the Peking University Core Journals list, 2022, Issue S02, pp. 25-30 (6 pages)
Funding: Shanghai Local Capacity Building Project (19070502900).
Keywords: Chinese Grammatical Error Correction (CGEC); linguistic knowledge; word embedding; Transformer model; decoder