314 articles found
Neural Machine Translation by Fusing Key Information of Text
1
Authors: Shijie Hu, Xiaoyu Li, Jiayu Bai, Hang Lei, Weizhong Qian, Sunqiang Hu, Cong Zhang, Akpatsa Samuel Kofi, Qian Qiu, Yong Zhou, Shan Yang 《Computers, Materials & Continua》 SCIE EI, 2023, No. 2, pp. 2803-2815 (13 pages)
When Google proposed the Transformer in 2017, it was first used for machine translation tasks and achieved the state of the art at that time. Although current neural machine translation models can generate high-quality translations, mistranslations and omissions still occur when translating the key information of long sentences. On the other hand, the most important part of traditional translation tasks is the translation of key information: as long as the key information is translated accurately and completely, the quality of the final translation can still be guaranteed even if other parts are translated incorrectly. To address mistranslation and omission effectively, and to improve the accuracy and completeness of long-sentence translation, this paper proposes a Transformer-based neural machine translation model that fuses key information. The model extracts the keywords of the source-language text separately as an input to the encoder. After being encoded in the same way as the source-language text, the keyword encoding is fused with the encoder output for the source-language text; the fused key information is then processed and fed into the decoder. By incorporating keyword information from the source sentence, the model performs reliably on long-sentence translation. To verify the effectiveness of the proposed key-information fusion method, a series of experiments was carried out on the validation set. The experimental results show that the Bilingual Evaluation Understudy (BLEU) score of the proposed model on the Workshop on Machine Translation (WMT) 2017 test dataset is higher than that of Google's Transformer on the same dataset, demonstrating the advantages of the proposed model.
Keywords: key information; Transformer; fusion; neural machine translation
Neural Machine Translation Models with Attention-Based Dropout Layer
2
Authors: Huma Israr, Safdar Abbas Khan, Muhammad Ali Tahir, Muhammad Khuram Shahzad, Muneer Ahmad, Jasni Mohamad Zain 《Computers, Materials & Continua》 SCIE EI, 2023, No. 5, pp. 2981-3009 (29 pages)
In bilingual translation, attention-based Neural Machine Translation (NMT) models are used to achieve synchrony between input and output sequences and the notion of alignment. NMT models have obtained state-of-the-art performance for several language pairs. However, there has been little work exploring useful architectures for Urdu-to-English machine translation. We conducted extensive Urdu-to-English translation experiments using Long short-term memory (LSTM), Bidirectional recurrent neural network (Bi-RNN), Statistical recurrent unit (SRU), Gated recurrent unit (GRU), Convolutional neural network (CNN), and Transformer models. Experimental results show that Bi-RNN and LSTM with an attention mechanism, trained iteratively on a scalable data set, make precise predictions on unseen data. The trained models yielded competitive results, achieving 62.6% and 61% accuracy and 49.67 and 47.14 BLEU scores, respectively. From a qualitative perspective, the translations of the test sets were examined manually, and it was observed that the trained models tend to produce repetitive output. The attention scores of Bi-RNN and LSTM showed clear alignment, while GRU showed incorrect word translations, poor alignment, and a lack of clear structure. Therefore, we considered refining the attention-based models by defining an additional attention-based dropout layer. Attention dropout fixes alignment errors and minimizes translation errors at the word level. After empirical demonstration and comparison with their counterparts, we found an improvement in the quality of the resulting translation system and a decrease in perplexity and over-translation score. The proposed model was also evaluated on Arabic-English and Persian-English datasets. We empirically concluded that adding an attention-based dropout layer helps improve GRU, SRU, and Transformer translation and is considerably more efficient in translation quality and speed.
Keywords: natural language processing; neural machine translation; word embedding; attention; perplexity; selective dropout; regularization; Urdu; Persian; Arabic; BLEU
Research on system combination of machine translation based on Transformer
3
Authors: 刘文斌, HE Yanqing, LAN Tian, WU Zhenfeng 《High Technology Letters》 EI CAS, 2023, No. 3, pp. 310-317 (8 pages)
Influenced by their training corpora, different machine translation systems vary greatly in performance. To achieve higher-quality translations, system combination methods combine the translation results of multiple systems through statistical combination or neural network combination. This paper proposes a new multi-system translation combination method based on the Transformer architecture, which uses a multi-encoder to encode the source sentences and the translation results of each system in order to realize encoder combination and decoder combination. Experiments on the Chinese-English translation task show that this method gains 1.2-2.35 bilingual evaluation understudy (BLEU) points over the best single-system results, 0.71-3.12 BLEU points over the statistical combination method, and 0.14-0.62 BLEU points over the state-of-the-art neural network combination method. The experimental results demonstrate the effectiveness of the proposed Transformer-based system combination method.
Keywords: Transformer; system combination; neural machine translation (NMT); attention mechanism; multi-encoder
Corpus Augmentation for Improving Neural Machine Translation Cited by: 2
4
Authors: Zijian Li, Chengying Chi, Yunyun Zhan 《Computers, Materials & Continua》 SCIE EI, 2020, No. 7, pp. 637-650 (14 pages)
The translation quality of neural machine translation (NMT) systems depends largely on the quality of the large-scale bilingual parallel corpora available. Research shows that NMT performance drops sharply when resources are limited, and that a large amount of high-quality bilingual parallel data is needed to train a competitive translation model. However, not all languages have large-scale, high-quality bilingual corpus resources available. In these cases, improving the quality of the corpora has become the main way to increase the accuracy of NMT results. This paper proposes a new method that improves data quality through data cleaning, data expansion, and other measures, expanding the data at the word and sentence levels and thus enriching the bilingual data. A long short-term memory (LSTM) language model is also used to ensure the fluency of the constructed sentences. In addition, a variety of processing methods are used to improve the quality of the bilingual data. Experiments on three standard test sets are conducted to validate the proposed method; the fairseq-transformer NMT system is used for training. The results show that the proposed method improves the translation results. Compared with state-of-the-art methods, the proposed method raises the BLEU score by 2.34 over the baseline.
Keywords: neural machine translation; corpus augmentation; model improvement; deep learning; data cleaning
Dependency-Based Local Attention Approach to Neural Machine Translation Cited by: 2
5
Authors: Jing Qiu, Yan Liu, Yuhan Chai, Yaqi Si, Shen Su, Le Wang, Yue Wu 《Computers, Materials & Continua》 SCIE EI, 2019, No. 5, pp. 547-562 (16 pages)
Recently, dependency information has been used in different ways to improve neural machine translation. For example, dependency labels have been added to the hidden states of source words, or the contiguous information of a source word has been found according to the dependency tree, learned independently, and added to the Neural Machine Translation (NMT) model as a unit in various ways. However, these works are all limited to using dependency information to enrich the hidden states of source words. Since many works in Statistical Machine Translation (SMT) and NMT have proven the validity and potential of dependency information, we believe there are still many ways to apply it in the NMT structure. In this paper, we explore a new way to use dependency information to improve NMT. Based on the theory of the local attention mechanism, we present the Dependency-based Local Attention Approach (DLAA), a new attention mechanism that allows the NMT model to trace the dependency words related to the word currently being translated. Our work also indicates that dependency information can help supervise the attention mechanism. Experimental results on the shared training datasets of the WMT 17 Chinese-to-English translation task show that our model is effective and performs notably well on long-sentence translation.
Keywords: neural machine translation; attention mechanism; dependency parsing
A Novel Beam Search to Improve Neural Machine Translation for English-Chinese Cited by: 1
6
Authors: Xinyue Lin, Jin Liu, Jianming Zhang, Se-Jung Lim 《Computers, Materials & Continua》 SCIE EI, 2020, No. 10, pp. 387-404 (18 pages)
Neural Machine Translation (NMT) is an end-to-end learning approach for automated translation that overcomes the weaknesses of conventional phrase-based translation systems. Although NMT-based systems have gained popularity in commercial translation applications, there is still plenty of room for improvement. As the most popular search algorithm in NMT, beam search is vital to the translation result. However, traditional beam search can produce duplicate or missing translations because of its target sequence selection strategy. To alleviate this problem, this paper proposes improvements to neural machine translation based on a novel beam search evaluation function, and uses reinforcement learning to train a translation evaluation system that selects better candidate words for generating translations. Extensive experiments were conducted to evaluate the methods, using the CASIA corpus and the 1,000,000 bilingual sentence pairs of NiuTrans. The results show that the proposed methods effectively improve English-to-Chinese translation quality.
Keywords: neural machine translation; beam search; reinforcement learning
Improvements of Google Neural Machine Translation
7
Authors: 李瑞, 蒋美佳 《海外英语》 2017, No. 15, pp. 132-134 (3 pages)
Machine translation has played an important role in modern society thanks to its effectiveness and efficiency, but its great demand for corpora makes traditional machine translation systems difficult for users to use. To solve this problem and improve translation quality, Google introduced the Google Neural Machine Translation (GNMT) system in November 2016, which implements the latest techniques to achieve better outcomes. Its conspicuous achievement has been demonstrated by experiments that use BLEU scores to measure the performance of different systems. With GNMT, the gap between human and machine translation is narrowing.
Keywords: machine translation; machine translation improvement; translation; Google Neural Machine Translation; neural machine translation
Progress in Machine Translation Cited by: 1
8
Authors: Haifeng Wang, Hua Wu, Zhongjun He, Liang Huang, Kenneth Ward Church 《Engineering》 SCIE EI CAS, 2022, No. 11, pp. 143-153 (11 pages)
After more than 70 years of evolution, great achievements have been made in machine translation. Especially in recent years, translation quality has been greatly improved with the emergence of neural machine translation (NMT). In this article, we first review the history of machine translation from rule-based machine translation to example-based machine translation and statistical machine translation. We then introduce NMT in more detail, including the basic framework and the current dominant framework, Transformer, as well as multilingual translation models that deal with the data sparseness problem. In addition, we introduce cutting-edge simultaneous translation methods that achieve a balance between translation quality and latency. We then describe various products and applications of machine translation. At the end of this article, we briefly discuss challenges and future research directions in this field.
Keywords: machine translation; neural machine translation; simultaneous translation
Sentence-Level Paraphrasing for Machine Translation System Combination Cited by: 1
9
Authors: Junguo Zhu, Muyun Yang, Sheng Li, Tiejun Zhao 《国际计算机前沿大会会议论文集》 2016, No. 1, pp. 156-158 (3 pages)
In this paper, we propose to enhance machine translation system combination (MTSC) with a sentence-level paraphrasing model trained by a neural network. This work extends the number of candidates in MTSC by paraphrasing entire original MT output sentences. First, we train an encoder-decoder neural paraphrasing model and use it to paraphrase the MT system outputs, generating synonymous candidates in the semantic space. Then we merge all of them into a single improved translation with a state-of-the-art system combination approach (MEMT), adding some new paraphrasing features. Our experimental results show a significant improvement of 0.28 BLEU points on the WMT2011 test data, and of 0.41 BLEU points when out-of-vocabulary (OOV) words are not considered for the sentence-level paraphrasing model.
Keywords: machine translation; system combination; paraphrasing; neural network
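Evaluating the Degree of Token Imbalance in NMT Corpora
10
Authors: 王海波, 余丽丽, 王宏伟 《电子学报》 EI CAS CSCD PKU Core, 2023, No. 10, pp. 2884-2893 (10 pages)
Token imbalance is a common phenomenon in neural machine translation (NMT) corpora. Evaluating the degree of token imbalance in an NMT corpus is important for improving corpus quality and translation performance. To address the shortcomings of existing token-imbalance evaluation studies in both their algorithms and the range of tokenization they consider, this paper proposes the Dispersion of Token Distribution (DTD) algorithm to compute the degree of token imbalance, and widens the tokenization range by evaluating corpora at three granularities: character, subword, and word. Experimental results show that the algorithm improves considerably on previous work in accuracy, effectiveness, and robustness. The degree of token imbalance of a corpus differs greatly across tokenization granularities: it is largest at the character level, followed by the subword level, and smallest at the word level.
Keywords: neural machine translation; corpus; tokenization; granularity; degree of token imbalance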
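An Automatic Software Defect Repair Method Based on Self-Attention Neural Machine Translation
11
Authors: 曹鹤玲, 刘昱, 韩栋 《电子学报》 EI CAS CSCD PKU Core, 2024, No. 3, pp. 945-956 (12 pages)
Recurrent neural networks handle code sequence data well, and most patch generation models for software defect repair are built on them. However, patch generation models based on recurrent neural networks remain limited in handling long-range dependencies in code sequences, and their repair success rate and repair efficiency are low. To address this, an automatic software defect repair method based on self-attention neural machine translation is proposed (Self-attention Neural machine translation based automatic software Repair, SNRepair). First, to mitigate the out-of-vocabulary problem in source code, subword segmentation is applied to preprocess the dataset. Second, to handle the difficult long-range dependencies in source code while making fuller use of local information, a Transformer patch generation model that incorporates local modeling is built. Then, automatic fault localization is used to locate the defective statements, and the parameter-tuned Transformer patch generation model generates candidate patches. Finally, test cases are run to validate the candidate patches. In an evaluation on the Defects4J benchmark, which contains 395 real Java software defects, SNRepair achieves a higher repair success rate and repair efficiency than the compared methods.
Keywords: automatic software defect repair; neural machine translation; self-attention mechanism; subword segmentation; local modeling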
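A Neural Machine Translation Method Based on Sentiment-Semantics-Enhanced Encoding and Decoding
12
Authors: 万飞 《计算机技术与发展》 2024, No. 9, pp. 94-101 (8 pages)
Current neural machine translation models rely solely on parallel corpora for training and cannot fully exploit deep linguistic knowledge. To address this, a neural machine translation method based on sentiment-semantics-enhanced encoding and decoding is proposed, which aims to improve the model's understanding of deep linguistic information by introducing additional sentiment semantics. First, word2vec is used to obtain embeddings for all words in the corpus, which are fed into a fusion model for training. The fusion model combines GRU-based and document-embedding-based mechanisms to obtain word-level and document-level sentiment-semantic representations. Second, in the sentiment fusion stage, a weighted formula combines the word-level and document-level sentiment semantics into a more comprehensive sentiment-semantic representation. Finally, this representation is added element-wise to the contextual semantic representation to fully introduce sentiment information, and the result is passed as input to the encoder and decoder of the machine translation model. Experiments on several benchmark datasets show that, compared with the conventional Transformer model, the method significantly improves performance on the IWSLT datasets, with BLEU gains of 1.3 to 1.62. It also performs well on the WMT datasets, confirming the effectiveness of incorporating sentiment semantics in machine translation.
Keywords: sentiment semantics; enhanced encoding and decoding; neural machine translation; Transformer; parallel corpus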
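An XLNet-Transformer Optimization Method for Chinese-Malay Low-Resource Neural Machine Translation Based on Deep Encoding Attention Cited by: 1
13
Authors: 占思琦, 徐志展, 杨威, 谢抢来 《计算机应用研究》 CSCD PKU Core, 2024, No. 3, pp. 799-804, 810 (7 pages)
Neural machine translation (NMT) has achieved remarkable results in many application domains, and its superiority on large-scale corpora has been well demonstrated. When corpus resources are insufficient, however, there is still considerable room for improvement. The scarcity of Chinese-Malay parallel corpora directly leads to poor Chinese-Malay machine translation quality. To address this, a low-resource NMT method based on deep encoding attention and progressive unfreezing is proposed. First, the encoder is rebuilt with the pre-trained XLNet model, and an XLNet dynamic aggregation module replaces the output scheme of the conventional encoder layers, effectively compensating for the scarcity of Chinese-Malay data. Second, a parallel cross-attention module in the decoder improves on conventional encoder-decoder attention, strengthening the model's ability to capture latent relationships between source and target words. Finally, a progressive unfreezing training strategy is applied to the proposed model to release as much of its performance as possible. Experimental results show that the proposed method achieves a significant performance improvement on a small-scale Chinese-Malay dataset, verifying its effectiveness. Compared with other low-resource NMT methods, the proposed method has a more compact structure, improves both the encoder and the decoder, and yields a more pronounced gain in translation quality, offering an effective strategy and insights for low-resource machine translation.
Keywords: neural network; Chinese-Malay machine translation; low-resource; progressive unfreezing; pre-training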
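Neural Machine Translation Incorporating Coreference Resolution
14
Authors: 冯勤, 贡正仙, 李军辉, 周国栋 《中文信息学报》 CSCD PKU Core, 2024, No. 6, pp. 67-76 (10 pages)
The same entity in a document is often expressed in different ways, forming a series of complex coreference relations, which poses a major challenge for document-level translation. This paper explores how to integrate coreference resolution with document-level neural machine translation. It first designs coreference representations for coreference chains, and then uses two methods, soft constraints and hard constraints, to fuse coreference information into the translation system. The proposed method is evaluated on the English-German and Chinese-English language pairs. Experimental results show that, compared with the best contemporaneous sentence-level translation system, the method clearly improves translation performance. In addition, in a dedicated evaluation of pronoun translation quality for English-German, accuracy also improves significantly.
Keywords: coreference representation; neural machine translation; document-level machine translation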
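Document-Level Neural Machine Translation Incorporating Target-Side Context
15
Authors: 贾爱鑫, 李军辉, 贡正仙, 张民 《中文信息学报》 CSCD PKU Core, 2024, No. 4, pp. 59-68 (10 pages)
Neural machine translation has achieved impressive results on sentence-level translation tasks, but sentence-level translations suffer from document-level problems such as inconsistency and coreference errors. Document-level translation addresses these problems by exploiting contextual information. Unlike previous methods that model source-side context, this paper proposes document-level neural machine translation that incorporates target-side context. Specifically, drawing on the idea of deliberation networks, the source document is translated twice: the first pass is sentence-level translation, and the second pass refers to the first-pass translation of the entire document. Experimental results on the LDC Chinese-English document dataset and the WMT English-German document dataset show that the method significantly improves translation performance while introducing few additional parameters. Moreover, the method becomes more effective as the quality of the first-pass (sentence-level) translation improves.
Keywords: neural machine translation; deliberation network; document-level translation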
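A Neural Machine Translation Method for Low-Resource Scenarios
16
Authors: 胡朝东, 叶娜, 张桂平, 蔡东风 《中文信息学报》 CSCD PKU Core, 2024, No. 6, pp. 58-66 (9 pages)
Neural machine translation requires large-scale bilingual parallel corpora to build translation models with deep learning, but parallel sentence pairs are scarce in low-resource scenarios, so the trained models perform poorly. Unsupervised neural machine translation uses only monolingual data in the two languages, removing the dependence on large-scale bilingual parallel data. However, unsupervised neural machine translation has two problems: first, it lacks the ability to model syntax; second, the small amount of bilingual data that does exist in low-resource scenarios cannot be used for training, which wastes those resources. To solve these problems, this paper proposes incorporating syntactic knowledge into unsupervised neural machine translation so that the model can fully learn the syntactic information of sentences. It also introduces a small amount of bilingual parallel data to assist unsupervised training, so that the model directly learns the mapping between source-language and target-language words. Compared with the baseline model, BLEU improves by 1.65 and 1.79 on the English-French and German-English monolingual news datasets, respectively.
Keywords: unsupervised neural machine translation; syntactic knowledge; denoising autoencoder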
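A Survey of Uyghur Machine Translation Research
17
Authors: 哈里旦木·阿布都克里木, 侯钰涛, 姚登峰, 阿布都克力木·阿布力孜, 陈吉尚 《计算机工程》 CSCD PKU Core, 2024, No. 1, pp. 1-16 (16 pages)
Uyghur machine translation is one of the important tasks in China's low-resource machine translation research, and its development and application can better promote cultural exchange and trade between different regions and ethnic groups. However, as an agglutinative language, Uyghur poses problems for machine translation such as complex morphology and scarce corpora. In recent years, at different stages of the development of Uyghur machine translation, researchers have continually optimized and innovated algorithms and models in light of these characteristics and achieved some results, but a systematic survey has been lacking. This paper comprehensively reviews research on Uyghur machine translation and, by method, divides it into three types: rule-based and example-based, statistical, and neural-network-based Uyghur machine translation. It also summarizes related academic activities and corpus resources. To further explore the potential of Uyghur machine translation, a preliminary study of Uyghur-Chinese machine translation is conducted with the ChatGPT model. The experimental results show that in the few-shot setting, translation performance first rises and then falls as the number of examples increases, peaking at 10-shot. In addition, the chain-of-thought method does not show better translation ability on the Uyghur machine translation task. Finally, future research directions for Uyghur machine translation are discussed.
Keywords: Uyghur; rule-based and example-based machine translation; statistical machine translation; neural machine translation; large language model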
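A Survey of Neural Machine Translation Cited by: 2
18
Authors: 章钧津, 田永红, 宋哲煜, 郝宇峰 《计算机工程与应用》 CSCD PKU Core, 2024, No. 4, pp. 57-74 (18 pages)
Machine translation studies how to translate a source language into a target language and is of great significance for promoting communication between peoples. Neural machine translation has become the mainstream machine translation approach thanks to its translation speed and output quality. To better trace its development, this paper first examines the history and methods of machine translation, and compares and summarizes three approaches: rule-based, statistical, and deep-learning-based machine translation. It then introduces neural machine translation and explains its common types. Next, it focuses on six major research areas in neural machine translation: multimodal machine translation, non-autoregressive machine translation, document-level machine translation, multilingual machine translation, data augmentation techniques, and pre-trained models. Finally, it discusses the future of neural machine translation from four perspectives: low-resource languages, context-dependent translation, out-of-vocabulary words, and large models. This systematic overview is intended to give a better understanding of the current state of neural machine translation.
Keywords: machine translation; neural machine translation; document-level machine translation; data augmentation; preprocessing techniques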
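Chinese-Urdu Neural Machine Translation Incorporating Urdu Part-of-Speech Sequence Prediction
19
Authors: 陈欢欢, 王剑, Muhammad Naeem Ul Hassan 《计算机工程与科学》 CSCD PKU Core, 2024, No. 3, pp. 518-524 (7 pages)
Many research teams have studied machine translation for the less widely used languages of South and Southeast Asia in depth. For Urdu, the official language of Pakistan, however, dedicated research on Chinese-Urdu machine translation methods is very rare because of scarce data resources and the large gap between Urdu and Chinese. To address this, a Transformer-based Chinese-Urdu neural machine translation model that incorporates Urdu part-of-speech sequences is proposed. First, a Transformer predicts the part-of-speech sequence of the target language, Urdu. Then the prediction of the translation model and the prediction of the part-of-speech sequence model are combined for joint prediction, thereby integrating linguistic knowledge into the translation model. Experiments on an existing small-scale Chinese-Urdu dataset show that the proposed method improves the BLEU score by 0.13 over the baseline model, a fairly noticeable gain.
Keywords: Transformer; neural machine translation; Urdu; part-of-speech sequence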
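Chinese-Vietnamese Neural Machine Translation Based on Denoised Prototype Sequences
20
Authors: 杨汉清, 赖华, 于志强, 余正涛 《厦门大学学报(自然科学版)》 CAS CSCD PKU Core, 2024, No. 4, pp. 705-713 (9 pages)
[Objective] In the Chinese-Vietnamese low-resource setting, parallel corpora are scarce and prototype sequences contain a large amount of heterogeneous information; using them directly makes the translation model harder to train and may even introduce noise. Denoising strategies for prototype sequences are therefore studied. [Methods] First, prototype sequences are obtained by cross-lingual retrieval. Second, noisy information in the prototype sequences is masked based on an entity dictionary, and rare-word frequency and semantic similarity are then combined to obtain the reference value of each prototype sequence. Finally, an additional encoder receives the prototype sequence, and the decoder is allowed to attend to both encoders. [Results] Compared with the baseline model, the strategies of similarity scoring, rare-word frequency, entity-dictionary-based denoising, and the fusion of all three improve Chinese-Vietnamese neural machine translation performance by 0.24, 0.12, 0.29, and 0.69 BLEU points, respectively. [Conclusions] Prototype sequences processed with the denoising strategies improve Chinese-Vietnamese neural machine translation performance.
Keywords: Chinese-Vietnamese neural machine translation; low-resource; prototype sequence; denoising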