A two-phase domain adaptation method for neural machine translation


Abstract: [Objective] To improve the transfer learning performance of neural machine translation (NMT) models, this work explores a domain adaptation method centered on language data. [Methods] Based on a quantitative analysis of two domain adaptation metrics, Kullback-Leibler divergence and maximum mean discrepancy, a two-phase decremental learning framework is proposed for scenarios with large-scale parallel sentences and small-scale domain texts. In the first phase, domain filtering, the domain texts are used to filter the parallel sentences, yielding domain parallel sentences, which are then used to train a domain NMT model. In the second phase, quality filtering, the trained domain NMT model translates the domain parallel sentences obtained in the first phase, the quality of each machine translation is compared with that of the corresponding human translation, and low-quality parallel sentences are deleted to obtain high-quality domain parallel sentences. Finally, an optimized domain NMT model is trained on the high-quality domain parallel sentences. [Results] Experiments on English-Chinese NMT adapted to the legal domain show that the proposed two-phase algorithm needs only about a quarter of the original training steps, yet improves the BLEU score by more than two points. [Conclusion] The experiments demonstrate that the decremental learning framework achieves the best performance while greatly reducing the time and space costs of training, enabling fast domain transfer of NMT models.
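The two filtering phases described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say how domain similarity or translation quality is scored, so the vocabulary-overlap score (`domain_score`), the token-level F1 measure (`token_f1`), both thresholds, and the toy data are assumptions chosen only for illustration. Model training is left out entirely; the `translate` argument is a hypothetical stand-in for the domain NMT model trained after phase 1.

```python
from collections import Counter
from typing import Callable, List, Tuple

Pair = Tuple[str, str]  # (source sentence, human translation)


def domain_score(sentence: str, domain_vocab: Counter) -> float:
    """Assumed score: fraction of tokens that also occur in the domain texts."""
    tokens = sentence.lower().split()
    if not tokens:
        return 0.0
    return sum(1 for t in tokens if t in domain_vocab) / len(tokens)


def domain_filter(pairs: List[Pair], domain_texts: List[str],
                  threshold: float) -> List[Pair]:
    """Phase 1: keep parallel sentences whose source side looks in-domain."""
    vocab = Counter(t for text in domain_texts for t in text.lower().split())
    return [p for p in pairs if domain_score(p[0], vocab) >= threshold]


def token_f1(hypothesis: str, reference: str) -> float:
    """Assumed quality measure: token-level F1 of MT output vs. human translation."""
    hyp, ref = Counter(hypothesis.split()), Counter(reference.split())
    overlap = sum((hyp & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(hyp.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def quality_filter(pairs: List[Pair], translate: Callable[[str], str],
                   threshold: float) -> List[Pair]:
    """Phase 2: drop pairs whose machine translation disagrees with the human one."""
    return [(s, r) for s, r in pairs if token_f1(translate(s), r) >= threshold]


# Toy data (invented for this sketch; target side is pre-tokenized).
pairs = [
    ("the court shall hear the case", "法院 应当 审理 该 案件"),
    ("the contract is void", "合同 无效"),
    ("i like green tea", "我 喜欢 绿茶"),                   # out of domain
    ("the court may dismiss the appeal", "天气 很 好"),     # noisy human side
]
domain_texts = [
    "the court shall hear the appeal",
    "the contract may be void",
    "the court may dismiss the case",
]

# Hypothetical stand-in for the domain NMT model trained after phase 1.
toy_outputs = {
    "the court shall hear the case": "法院 应 审理 该 案件",
    "the contract is void": "合同 无效",
    "the court may dismiss the appeal": "法院 可以 驳回 上诉",
}

domain_pairs = domain_filter(pairs, domain_texts, threshold=0.7)
clean_pairs = quality_filter(domain_pairs, toy_outputs.__getitem__, threshold=0.5)
print(len(pairs), len(domain_pairs), len(clean_pairs))
```

In this toy run, phase 1 discards the out-of-domain pair and phase 2 discards the pair whose human translation does not match the source, so each phase shrinks the training set. That shrinking is the sense in which the framework is "decremental".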

Authors: LIU Wuying; JIN Kai (Shandong Key Laboratory of Language Resources Development and Application, Ludong University, Yantai 264025, China; School of Foreign Languages, Qilu University of Technology, Jinan 250353, China)

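Affiliations: Shandong Key Laboratory of Language Resources Development and Application, Ludong University; School of Foreign Languages, Qilu University of Technology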

Source: Journal of Xiamen University (Natural Science) (indexed in CAS, CSCD, Peking University Core Journals), 2024, No. 6, pp. 1033-1041 (9 pages)

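Funding: New Liberal Arts Research and Reform Practice Project of the Ministry of Education (2021060049); Humanities and Social Sciences Research Youth Fund Project of the Ministry of Education (20YJC740062); Humanities and Social Sciences Research Planning Fund Project of the Ministry of Education (20YJAZH069); Shandong Provincial Graduate Education and Teaching Reform Research Project (SDYJG21185); Key Project of Shandong Provincial Undergraduate Teaching Reform Research (Z2021323); Shanghai Philosophy and Social Sciences "13th Five-Year Plan" Project (2019BYY028).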

Keywords: domain adaptation; domain adaptation metrics; decremental learning; neural machine translation; legal domain

Classification: TP391.1 [Automation and Computer Technology: Computer Application Technology]


