Culture-style aware e-commerce product translation based on unsupervised domain adaptation (cited by: 1)
Abstract: Training an e-commerce product translation system faces two main problems: in-domain training data is scarce, and e-commerce product descriptions differ considerably in cultural style. To address these problems, we collect a large amount of e-commerce product data as a training corpus and improve the translation system with two methods based on unsupervised domain adaptation: mixed training and culture-style awareness. First, a translation system trained on out-of-domain data is applied to monolingual e-commerce data to produce a pseudo-parallel corpus, and mixed training with this pseudo-parallel corpus yields a new model. Second, culture-style tags are added to the e-commerce data of each language, so that during training the model is told which category the current data belongs to; a culture-style feature vector is then obtained from the category information, which improves the accuracy of product translation in the e-commerce domain. Experimental results show that the method outperforms several e-commerce product translation methods based on monolingual corpora.
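The two steps in the abstract can be illustrated with a small data-preparation sketch. This is not the authors' implementation. The tag format `<style_xx>`, the function names `tag`, `pseudo_parallel`, and `mix`, placing the tag on the source side, and mixing the pseudo-parallel pairs with the original out-of-domain pairs are all assumptions made for illustration. The `translate` argument stands in for the out-of-domain translation model.

```python
import random
from typing import Callable, List, Tuple

Pair = Tuple[str, str]  # (source sentence, target sentence)


def tag(sentence: str, style: str) -> str:
    # Hypothetical tag format: a pseudo-token such as "<style_zh>"
    # prepended to the sentence to signal its culture-style category.
    return f"<style_{style}> {sentence}"


def pseudo_parallel(
    mono: List[str], translate: Callable[[str], str], style: str
) -> List[Pair]:
    # Translate in-domain monolingual sentences with the out-of-domain
    # model, and mark the source side with its style category (assumed).
    return [(tag(s, style), translate(s)) for s in mono]


def mix(out_domain: List[Pair], pseudo: List[Pair], seed: int = 0) -> List[Pair]:
    # Mixed training set: out-of-domain pairs plus pseudo in-domain pairs,
    # shuffled deterministically so that runs are reproducible.
    data = list(out_domain) + list(pseudo)
    random.Random(seed).shuffle(data)
    return data
```

In this sketch the tag only changes the input text. Turning the category into a feature vector, as the abstract describes, would happen inside the model (for example through the embedding of the tag token), which is outside the scope of this example.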
Authors: SHI Xiaojing (史小静); NING Qiuyi (宁秋怡); DUAN Xiangyu (段湘煜) (School of Computer Science and Technology, Soochow University, Suzhou 215006, China)
Source: Journal of Xiamen University: Natural Science (《厦门大学学报(自然科学版)》), 2021, No. 6, pp. 1011-1018 (8 pages). Indexed in CAS, CSCD, and the Peking University Core Journals list.
Funding: National Natural Science Foundation of China (61673289).
Keywords: machine translation; domain adaptation; unsupervised