Abstract
To address the problems of low running efficiency, excessive parameter counts, and high computational complexity in Transformer-based machine translation models, this paper proposes a separated-structure machine translation model that fuses CNN and Transformer. First, to improve running efficiency and reduce the number of parameters, the model separates the attention-computation module from the normalization module, which keeps the stacked multi-layer structure reusable. Second, the model fuses a convolutional computation module with the original self-attention module: the self-attention module captures global contextual semantic relations, while the convolutional module captures local contextual semantic relations, reducing the model's complexity. Experimental comparisons with other machine translation models on the same datasets show that the proposed model has the fewest parameters and outperforms the other models.
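The abstract does not spell out the layer-level design, so the following is only a minimal PyTorch sketch of how a block that fuses a local convolution branch with a global self-attention branch, with normalization kept as a separate reusable module, might look. Every name and hyperparameter here (FusedBlock, d_model=512, kernel_size=3, the residual fusion) is an illustrative assumption, not the authors' published implementation.

# Minimal sketch of the fused local/global block described in the abstract.
# All names and hyperparameters are assumptions for illustration only.
import torch
import torch.nn as nn

class FusedBlock(nn.Module):
    """Fuses global self-attention with a local depthwise-convolution branch."""
    def __init__(self, d_model: int = 512, n_heads: int = 8, kernel_size: int = 3):
        super().__init__()
        # Global branch: standard multi-head self-attention over the whole sequence.
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        # Local branch: depthwise 1-D convolution over neighboring tokens.
        self.conv = nn.Conv1d(d_model, d_model, kernel_size,
                              padding=kernel_size // 2, groups=d_model)
        # Normalization kept as its own module (the "separated structure"),
        # applied once and shared by both computation branches.
        self.norm = nn.LayerNorm(d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model)
        h = self.norm(x)
        global_ctx, _ = self.attn(h, h, h)                         # global semantic relations
        local_ctx = self.conv(h.transpose(1, 2)).transpose(1, 2)   # local semantic relations
        return x + global_ctx + local_ctx                          # residual fusion

# Toy forward pass: batch of 2 sequences of length 10.
block = FusedBlock()
out = block(torch.randn(2, 10, 512))
print(out.shape)  # torch.Size([2, 10, 512])

The depthwise convolution (groups=d_model) keeps the local branch cheap relative to full attention, which is one plausible reading of how fusing the two branches could lower overall complexity; the actual mechanism in the paper may differ.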
Authors
Ge Junwei; Tu Zhaohao; Fang Yiqiu (College of Software Engineering, Chongqing University of Posts & Telecommunications, Chongqing 400065, China)
Source
Application Research of Computers (《计算机应用研究》), 2022, No. 2, pp. 432-435 (4 pages). Indexed in CSCD and the Peking University Core Journals list.
Funding
National Natural Science Foundation of China, General Program (62072066).
Keywords
convolutional attention
module separation
machine translation