Abstract
Neural machine translation, a relatively new approach to machine translation, has become the mainstream of machine translation research. This paper proposes a scalable architecture for an online translation system based on deep neural networks. The backend is deployed with hybrid GPU and CPU decoding to increase the system's concurrency and reduce its latency. Experimental results show that, compared with architectures that use only GPUs or only CPUs, the proposed architecture supports higher concurrency with relatively low response latency. The architecture can also be extended readily to multiple servers, further improving overall system performance.
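The abstract does not say how requests are divided between GPU and CPU decoders. As a purely illustrative sketch of one possible policy, the Python below routes each request to the first backend that has a free decoding slot, so that CPUs absorb overflow when the GPUs are saturated. The names `Backend`, `HybridDispatcher`, and `route_burst`, the capacity values, and the GPU-first policy itself are all assumptions made for this example and are not taken from the paper.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Backend:
    """A hypothetical decoding backend (GPU or CPU) with a bounded number of slots."""
    name: str
    capacity: int        # assumed maximum number of requests decoded at once
    in_flight: int = 0   # requests currently being decoded

    def has_free_slot(self) -> bool:
        return self.in_flight < self.capacity


@dataclass
class HybridDispatcher:
    """Routes each request to the first backend with a free slot.

    Backends are tried in list order, so listing GPU backends first makes
    CPU backends act as overflow capacity. This policy is an assumption.
    """
    backends: List[Backend] = field(default_factory=list)

    def acquire(self) -> Optional[Backend]:
        for backend in self.backends:
            if backend.has_free_slot():
                backend.in_flight += 1
                return backend
        return None  # every backend is saturated; the caller may queue or reject

    def release(self, backend: Backend) -> None:
        if backend.in_flight <= 0:
            raise ValueError(f"{backend.name} has no request in flight")
        backend.in_flight -= 1


def route_burst(dispatcher: HybridDispatcher, n: int) -> List[Optional[str]]:
    """Route n simultaneous requests and return the chosen backend names."""
    routed = [dispatcher.acquire() for _ in range(n)]
    return [b.name if b is not None else None for b in routed]
```

With one GPU backend of capacity 2 and one CPU backend of capacity 1, `route_burst(HybridDispatcher([Backend("gpu0", 2), Backend("cpu0", 1)]), 4)` returns `["gpu0", "gpu0", "cpu0", None]`: the third request spills over to the CPU and the fourth finds no free slot. Under the same assumptions, extending to several servers amounts to appending more `Backend` entries to the list.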
Authors
张巍
林飞飞
梁镇爽
黄振
ZHANG Wei; LIN Feifei; LIANG Zhenshuang; HUANG Zhen (College of Information Science and Engineering, Ocean University of China, Qingdao 266100, China; Global Tone Communication Technology (Qingdao) Co., Ltd., Qingdao 266061, China)
Source
《厦门大学学报(自然科学版)》
CAS
CSCD
PKU Core Journals
2019, No. 2, pp. 184-188 (5 pages)
Journal of Xiamen University: Natural Science
Keywords
neural machine translation service
online translation
hybrid decoding