Journal Articles
5,781 articles found
Improving Phrase-Based Statistical Machine Translation Models by Incorporating Syntax-Based Language Models
1
Authors: 陈毅东, 史晓东 《Journal of Donghua University (English Edition)》 EI CAS, 2010, No. 2, pp. 185-188 (4 pages)
This paper proposes a method to incorporate syntax-based language models into phrase-based statistical machine translation (SMT) systems. The syntax-based language model used in this paper is based on link grammar, a highly lexicalized formalism. In order to apply language models based on link grammar in phrase-based models, the concept of linked phrases, an extension of the concept of traditional phrases in phrase-based models, is introduced. Experiments were conducted, and the results showed that the use of syntax-based language models can greatly improve the performance of phrase-based models.
Keywords: statistical machine translation phrase-based translation models syntax-based language models linkage grammar
Integrating Deep Learning and Machine Translation for Understanding Unrefined Languages
2
Authors: Hong Geun Ji, Soyoung Oh, Jina Kim, Seong Choi, Eunil Park 《Computers, Materials & Continua》 SCIE EI, 2022, No. 1, pp. 669-678 (10 pages)
In the field of natural language processing (NLP), the advancement of neural machine translation has paved the way for cross-lingual research. Yet, most studies in NLP have evaluated the proposed language models on well-refined datasets. We investigate whether a machine translation approach is suitable for multilingual analysis of unrefined datasets, particularly chat messages on Twitch. To address this, we collected a dataset comprising 7,066,854 and 3,365,569 chat messages from English and Korean streams, respectively. We employed several machine learning classifiers and neural networks with two different types of embedding: word-sequence embedding and the final layer of a pre-trained language model. The results of the employed models indicate that the accuracy difference between English and English-to-Korean was relatively high, ranging from 3% to 12%. For Korean data (Korean, and Korean-to-English), it ranged from 0% to 2%. The results therefore imply that translation from a low-resource language (e.g., Korean) into a high-resource language (e.g., English) yields higher performance than the reverse. Several implications and limitations of the presented results are also discussed. For instance, we suggest the feasibility of translating from resource-poor languages so that the tools of resource-rich languages can be used in further analysis.
Keywords: TWITCH MULTILINGUAL machine translation machine learning
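The two embedding strategies compared in this entry (averaged word vectors versus the final layer of a pre-trained language model) can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the model name, the mean pooling, and the logistic-regression classifier are assumptions made for the example.

```python
# Sketch: two embedding strategies for classifying chat messages (illustrative only).
import numpy as np
import torch
from sklearn.linear_model import LogisticRegression
from transformers import AutoTokenizer, AutoModel

def word_sequence_embedding(texts, vectors, dim=300):
    """Average pre-trained word vectors over the tokens of each message (assumed setup)."""
    out = []
    for t in texts:
        vecs = [vectors[w] for w in t.lower().split() if w in vectors]
        out.append(np.mean(vecs, axis=0) if vecs else np.zeros(dim))
    return np.vstack(out)

def lm_final_layer_embedding(texts, model_name="bert-base-multilingual-cased"):
    """Use the final hidden layer of a pre-trained language model, mean-pooled over tokens."""
    tok = AutoTokenizer.from_pretrained(model_name)
    model = AutoModel.from_pretrained(model_name)
    with torch.no_grad():
        enc = tok(texts, padding=True, truncation=True, return_tensors="pt")
        hidden = model(**enc).last_hidden_state   # (batch, seq, dim)
        return hidden.mean(dim=1).numpy()

# Either feature matrix can then be fed to a standard classifier, e.g.:
# clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
```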
Improving Low-Resource Machine Translation Using Reinforcement Learning from Human Feedback
3
Authors: Liqing Wang, Yiheng Xiao 《Intelligent Automation & Soft Computing》 2024, No. 4, pp. 619-631 (13 pages)
Neural Machine Translation is one of the key research directions in Natural Language Processing. However, limited by the scale and quality of parallel corpora, the translation quality of low-resource Neural Machine Translation has always been unsatisfactory. When Reinforcement Learning from Human Feedback (RLHF) is applied to low-resource machine translation, commonly encountered issues are the substandard quality of preference data and the high cost of collecting manual feedback. Therefore, a more cost-effective method for obtaining feedback data is proposed: first, the quality of the preference data is optimized through prompt engineering of a Large Language Model (LLM); then human feedback is combined to complete the evaluation. In this way, the reward model acquires more semantic information and human preferences during the training phase, thereby enhancing feedback efficiency and the quality of the results. Experimental results demonstrate that, compared with the traditional RLHF method, our method is effective on multiple datasets and exhibits a notable improvement of 1.07 BLEU. Meanwhile, it is also more favorably received in assessments conducted by human evaluators and GPT-4o.
Keywords: Low-resource neural machine translation RLHF prompt engineering LLM
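The reward-model training step that RLHF pipelines of this kind rely on is typically a pairwise ranking objective over preference pairs. The sketch below is a generic illustration of that objective, not the paper's implementation; the scoring network and argument names are assumptions.

```python
# Sketch: pairwise reward-model loss over preference pairs (generic RLHF step, not the paper's code).
import torch
import torch.nn as nn

class RewardModel(nn.Module):
    def __init__(self, encoder, hidden_dim):
        super().__init__()
        self.encoder = encoder                  # any sentence encoder returning (batch, hidden_dim)
        self.score = nn.Linear(hidden_dim, 1)   # scalar reward head

    def forward(self, batch):
        return self.score(self.encoder(batch)).squeeze(-1)

def preference_loss(reward_model, chosen_batch, rejected_batch):
    """Encourage the reward of the preferred translation to exceed that of the rejected one."""
    r_chosen = reward_model(chosen_batch)
    r_rejected = reward_model(rejected_batch)
    # -log sigmoid(r_chosen - r_rejected): the standard Bradley-Terry style ranking loss
    return -torch.nn.functional.logsigmoid(r_chosen - r_rejected).mean()
```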
Translation Strategies of Ambiguity in English Language and Literature
4
Authors: Ying Gao 《Journal of Contemporary Educational Research》 2024, No. 2, pp. 138-143 (6 pages)
The translation of English language and literary works has always been crucial for cross-cultural communication. However, a key challenge in translating such works is the accurate communication of ambiguity, which refers to expressions deliberately used in English texts with unclear meanings. These expressions are often poetic and carry deep symbols and implications, adding a unique charm to literary works. This article explores the manifestations of ambiguity in English language and literature and the translation strategies that can be employed to render such ambiguity effectively.
Keywords: ENGLISH language and literature AMBIGUITY translation
Translation of English Language into Urdu Language Using LSTM Model
5
Authors: Sajadul Hassan Kumhar, Syed Immamul Ansarullah, Akber Abid Gardezi, Shafiq Ahmad, Abdelaty Edrees Sayed, Muhammad Shafiq 《Computers, Materials & Continua》 SCIE EI, 2023, No. 2, pp. 3899-3912 (14 pages)
English-to-Urdu machine translation is still in its infancy and lacks simple translation methods that provide motivating and adequate English-to-Urdu translation. In order to make knowledge available to the masses, there should be mechanisms and tools in place to make content understandable by translating from the source language to the target language in an automated fashion. Machine translation has achieved this goal with encouraging results. When decoding the source text into the target language, the translator checks all the characteristics of the text. To achieve machine translation, rule-based, statistical, hybrid, and neural machine translation approaches have been proposed to automate the work. In this research work, a neural machine translation approach is employed to translate English text into Urdu. A Long Short-Term Memory (LSTM) encoder-decoder is used to translate English to Urdu. The steps required to perform the translation task include preprocessing, tokenization, grammar and sentence structure analysis, word embeddings, training data preparation, encoder-decoder modeling, and output text generation. The results show that the model used in this research work performs well in translation. The results were evaluated using bilingual evaluation metrics and showed that the test and training data yielded the highest-scoring sequences with an effective length of ten (10).
Keywords: machine translation Urdu language word embedding
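An LSTM encoder-decoder of the kind described above can be sketched in a few lines. This is a generic seq2seq skeleton under assumed vocabulary sizes and dimensions, not the model reported in the paper.

```python
# Sketch: minimal LSTM encoder-decoder for English-to-Urdu translation (illustrative skeleton).
import torch
import torch.nn as nn

class Seq2Seq(nn.Module):
    def __init__(self, src_vocab, tgt_vocab, emb_dim=256, hid_dim=512):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, emb_dim)
        self.tgt_emb = nn.Embedding(tgt_vocab, emb_dim)
        self.encoder = nn.LSTM(emb_dim, hid_dim, batch_first=True)
        self.decoder = nn.LSTM(emb_dim, hid_dim, batch_first=True)
        self.out = nn.Linear(hid_dim, tgt_vocab)

    def forward(self, src_ids, tgt_ids):
        # Encode the English sentence; the final (h, c) state summarizes it.
        _, state = self.encoder(self.src_emb(src_ids))
        # Decode the Urdu sentence conditioned on that state (teacher forcing).
        dec_out, _ = self.decoder(self.tgt_emb(tgt_ids), state)
        return self.out(dec_out)   # (batch, tgt_len, tgt_vocab) logits

# model = Seq2Seq(src_vocab=30000, tgt_vocab=30000)
# loss = nn.CrossEntropyLoss()(model(src, tgt_in).transpose(1, 2), tgt_out)
```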
Neural Machine Translation Models with Attention-Based Dropout Layer
6
Authors: Huma Israr, Safdar Abbas Khan, Muhammad Ali Tahir, Muhammad Khuram Shahzad, Muneer Ahmad, Jasni Mohamad Zain 《Computers, Materials & Continua》 SCIE EI, 2023, No. 5, pp. 2981-3009 (29 pages)
In bilingual translation, attention-based Neural Machine Translation (NMT) models are used to achieve synchrony between input and output sequences and the notion of alignment. NMT models have obtained state-of-the-art performance for several language pairs. However, there has been little work exploring useful architectures for Urdu-to-English machine translation. We conducted extensive Urdu-to-English translation experiments using Long short-term memory (LSTM), Bidirectional recurrent neural networks (Bi-RNN), Statistical recurrent units (SRU), Gated recurrent units (GRU), Convolutional neural networks (CNN), and the Transformer. Experimental results show that Bi-RNN and LSTM with an attention mechanism, trained iteratively with a scalable dataset, make precise predictions on unseen data. The trained models yielded competitive results, achieving 62.6% and 61% accuracy and 49.67 and 47.14 BLEU scores, respectively. From a qualitative perspective, the translations of the test sets were examined manually, and it was observed that the trained models tend to produce repetitive output. The attention scores produced by Bi-RNN and LSTM gave clear alignments, while GRU showed incorrect translations of words, poor alignment, and a lack of clear structure. We therefore refined the attention-based models by defining an additional attention-based dropout layer. Attention dropout fixes alignment errors and minimizes translation errors at the word level. After empirical demonstration and comparison with their counterparts, we found an improvement in the quality of the resulting translation system and a decrease in perplexity and the over-translation score. The ability of the proposed model was also evaluated using Arabic-English and Persian-English datasets. We empirically conclude that adding an attention-based dropout layer helps improve GRU, SRU, and Transformer translation and is considerably more efficient in translation quality and speed.
Keywords: Natural language processing neural machine translation word embedding ATTENTION PERPLEXITY selective dropout regularization URDU PERSIAN Arabic BLEU
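One common way to realize an attention-based dropout layer is to apply dropout to the attention weight matrix before the context vector is formed. The sketch below illustrates that idea generically; the dimensions, dropout rate, and placement of dropout are assumptions, not the paper's exact design.

```python
# Sketch: scaled dot-product attention with dropout applied to the attention weights.
import math
import torch
import torch.nn as nn

class AttentionWithDropout(nn.Module):
    def __init__(self, dim, p_drop=0.1):
        super().__init__()
        self.scale = math.sqrt(dim)
        self.attn_drop = nn.Dropout(p_drop)   # dropout on alignment weights, not on hidden states

    def forward(self, query, keys, values, mask=None):
        # query: (batch, tgt_len, dim); keys/values: (batch, src_len, dim)
        scores = torch.bmm(query, keys.transpose(1, 2)) / self.scale
        if mask is not None:
            scores = scores.masked_fill(mask == 0, float("-inf"))
        weights = torch.softmax(scores, dim=-1)
        weights = self.attn_drop(weights)      # randomly zeroes some alignments during training
        context = torch.bmm(weights, values)   # (batch, tgt_len, dim)
        return context, weights
```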
Alphabet-Level Indian Sign Language Translation to Text Using Hybrid-AO Thresholding with CNN
7
Authors: Seema Sabharwal, Priti Singla 《Intelligent Automation & Soft Computing》 SCIE, 2023, No. 9, pp. 2567-2582 (16 pages)
Sign language is used as a communication medium in trade, defence, and deaf-mute communities worldwide. Over the last few decades, research in the domain of sign language translation has grown and become more challenging. This necessitates the development of a Sign Language Translation System (SLTS) to provide effective communication in different research domains. In this paper, a novel Hybrid Adaptive Gaussian Thresholding with Otsu Algorithm (Hybrid-AO) for image segmentation is proposed for the translation of alphabet-level Indian Sign Language (ISLTS) with a 5-layer Convolutional Neural Network (CNN). The focus of this paper is to analyze various image segmentation methods (Canny edge detection, simple thresholding, and Hybrid-AO), pooling approaches (max, average, and global average pooling), and activation functions (ReLU, Leaky ReLU, and ELU). The 5-layer CNN with max pooling, the Leaky ReLU activation function, and Hybrid-AO (5MXLR-HAO) outperformed the other frameworks. An open-access dataset of ISL alphabets with approximately 31K images in 26 classes was used to train and test the model. The proposed framework has been developed for translating alphabet-level Indian Sign Language into text. It attains 98.95% training accuracy, 98.05% validation accuracy, 0.0721 training loss, and 0.1021 validation loss, and its performance outperforms other existing systems.
Keywords: Sign language translation CNN THRESHOLDING Indian sign language
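A hybrid of adaptive Gaussian thresholding and Otsu's method can be prototyped with OpenCV. The sketch below combines the two binary masks with a bitwise AND; the combination rule, block size, and constant are assumptions for illustration and not necessarily the paper's exact Hybrid-AO formulation.

```python
# Sketch: combining adaptive Gaussian thresholding with Otsu's method (illustrative Hybrid-AO-style step).
import cv2

def hybrid_threshold(gray, block_size=11, c=2):
    """gray: single-channel uint8 image of a hand gesture."""
    # Local (adaptive Gaussian) threshold: robust to uneven lighting.
    adaptive = cv2.adaptiveThreshold(gray, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                                     cv2.THRESH_BINARY, block_size, c)
    # Global Otsu threshold: picks a cut that separates foreground/background histograms.
    _, otsu = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    # Assumed fusion rule: keep pixels that both methods mark as foreground.
    return cv2.bitwise_and(adaptive, otsu)

# img = cv2.imread("sign.jpg", cv2.IMREAD_GRAYSCALE)
# mask = hybrid_threshold(img)   # segmented image fed to the 5-layer CNN
```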
Neural Machine Translation by Fusing Key Information of Text
8
Authors: Shijie Hu, Xiaoyu Li, Jiayu Bai, Hang Lei, Weizhong Qian, Sunqiang Hu, Cong Zhang, Akpatsa Samuel Kofi, Qian Qiu, Yong Zhou, Shan Yang 《Computers, Materials & Continua》 SCIE EI, 2023, No. 2, pp. 2803-2815 (13 pages)
When the Transformer was proposed by Google in 2017, it was first used for machine translation tasks and achieved the state of the art at that time. Although current neural machine translation models can generate high-quality translation results, there are still mistranslations and omissions in the translation of key information in long sentences. On the other hand, the most important part of traditional translation tasks is the translation of key information: as long as the key information is translated accurately and completely, the quality of the final translation results can still be guaranteed even if other parts are translated incorrectly. In order to solve the problems of mistranslation and missed translation effectively, and to improve the accuracy and completeness of long-sentence translation, this paper proposes a key-information-fused neural machine translation model based on the Transformer. The proposed model extracts the keywords of the source-language text separately as an input to the encoder. After being encoded in the same way as the source text, the keyword representation is fused with the encoder output of the source text, and the resulting key information is processed and fed into the decoder. By incorporating keyword information from the source sentence, the model performs reliably on long-sentence translation. To verify the effectiveness of the proposed key-information fusion method, a series of experiments was carried out on the validation set. The experimental results show that the Bilingual Evaluation Understudy (BLEU) score of the proposed model on the Workshop on Machine Translation (WMT) 2017 test dataset is higher than that of the Transformer proposed by Google on the same dataset, demonstrating the advantages of the proposed model.
Keywords: Key information TRANSFORMER FUSION neural machine translation
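One simple way to fuse a separately encoded keyword sequence with the source encoding is cross-attention from source states to keyword states plus a residual connection. The sketch below is one plausible realization of such a fusion block, not the paper's exact architecture; the use of nn.MultiheadAttention and the residual/normalization layout are assumptions.

```python
# Sketch: fusing keyword encodings into the source encoding before decoding (one plausible design).
import torch
import torch.nn as nn

class KeyInfoFusion(nn.Module):
    def __init__(self, d_model=512, n_heads=8):
        super().__init__()
        self.cross_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(d_model)

    def forward(self, src_enc, key_enc):
        # src_enc: (batch, src_len, d_model) from the source-sentence encoder
        # key_enc: (batch, key_len, d_model) from the same encoder applied to extracted keywords
        fused, _ = self.cross_attn(query=src_enc, key=key_enc, value=key_enc)
        return self.norm(src_enc + fused)   # residual: source states enriched with key information

# The fused memory would then replace the plain encoder output as the decoder's cross-attention input.
```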
Arabic Sign Language Gesture Classification Using Deer Hunting Optimization with Machine Learning Model
9
Authors: Badriyya B. Al-onazi, Mohamed K. Nour, Hussain Alshahran, Mohamed Ahmed Elfaki, Mrim M. Alnfiai, Radwa Marzouk, Mahmoud Othman, Mahir M. Sharif, Abdelwahed Motwakel 《Computers, Materials & Continua》 SCIE EI, 2023, No. 5, pp. 3413-3429 (17 pages)
Sign language uses the motion of the arms and hands to communicate with people with hearing disabilities. Several models are available in the literature for sign language detection and classification with enhanced outcomes, but the latest advancements in computer vision enable us to perform sign/gesture recognition using deep neural networks. This paper introduces an Arabic Sign Language Gesture Classification using Deer Hunting Optimization with Machine Learning (ASLGC-DHOML) model. The presented ASLGC-DHOML technique mainly concentrates on recognising and classifying sign language gestures. The ASLGC-DHOML model first pre-processes the input gesture images and generates feature vectors using the densely connected network (DenseNet169) model. For gesture recognition and classification, a multilayer perceptron (MLP) classifier is exploited to recognize and classify the presence of sign language gestures. Lastly, the DHO algorithm is utilized for parameter optimization of the MLP model. The experimental results of the ASLGC-DHOML model are tested and inspected under distinct aspects. The comparison analysis highlights that the ASLGC-DHOML method results in better gesture classification than other techniques, with a maximum accuracy of 92.88%.
Keywords: machine learning sign language recognition multilayer perceptron deer hunting optimization densenet
Research on system combination of machine translation based on Transformer
10
Authors: 刘文斌, HE Yanqing, LAN Tian, WU Zhenfeng 《High Technology Letters》 EI CAS, 2023, No. 3, pp. 310-317 (8 pages)
Influenced by their training corpora, the performance of different machine translation systems varies greatly. Aiming at higher-quality translations, system combination methods combine the translation results of multiple systems through statistical or neural-network combination. This paper proposes a new multi-system translation combination method based on the Transformer architecture, which uses a multi-encoder to encode the source sentences and the translation results of each system in order to realize encoder combination and decoder combination. Experimental verification on a Chinese-English translation task shows that this method gains 1.2-2.35 bilingual evaluation understudy (BLEU) points over the best single-system results, 0.71-3.12 BLEU points over the statistical combination method, and 0.14-0.62 BLEU points over the state-of-the-art neural-network combination method. The experimental results demonstrate the effectiveness of the proposed Transformer-based system combination method.
Keywords: TRANSFORMER system combination neural machine translation(NMT) attention mechanism multi-encoder
Short-Term Memory Capacity across Time and Language Estimated from Ancient and Modern Literary Texts. Study-Case: New Testament Translations
11
Authors: Emilio Matricciani 《Open Journal of Statistics》 2023, No. 3, pp. 379-403 (25 pages)
We study the short-term memory capacity of ancient readers of the original New Testament written in Greek, and of its translations to Latin and to modern languages. To model it, we consider the number of words between any two contiguous interpunctions, I_p, because this parameter can model how the human mind memorizes "chunks" of information. Since I_p can be calculated for any alphabetical text, we can perform experiments (otherwise impossible) with ancient readers by studying the literary works they used to read. The "experiments" compare the I_p of texts of one language/translation to those of another language/translation by measuring the minimum average probability of finding joint readers (those who can read both texts because of similar short-term memory capacity) and by defining an "overlap index". We also define the population of universal readers, people who can read any New Testament text in any language. Future work is vast, with many research tracks, because alphabetical literatures are very large and allow many experiments, such as comparing authors, translations, or even texts written by artificial intelligence tools.
Keywords: Alphabetical languages Artificial Intelligence Writing GREEK LATIN New Testament Readers Overlap Probability Short-Term Memory Capacity TEXTS translation Words Interval
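The word-interval parameter I_p described above (the number of words between two contiguous interpunctions) is straightforward to compute for any alphabetical text. The sketch below is a minimal illustration under the assumption that interpunctions are the common punctuation marks . , ; : ! ? and that words are whitespace-separated tokens.

```python
# Sketch: words per interpunction interval (I_p) for an alphabetical text (assumed punctuation set).
import re

INTERPUNCTIONS = r"[.,;:!?]"

def word_intervals(text):
    """Return the list of word counts between contiguous interpunctions."""
    chunks = re.split(INTERPUNCTIONS, text)
    return [len(c.split()) for c in chunks if c.strip()]

def mean_ip(text):
    intervals = word_intervals(text)
    return sum(intervals) / len(intervals) if intervals else 0.0

# Example: average number of words per "chunk" a reader must hold in short-term memory
# mean_ip("In the beginning was the Word, and the Word was with God, and the Word was God.")
```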
Improvements of Google Neural Machine Translation
12
Authors: 李瑞, 蒋美佳 《海外英语》 2017, No. 15, pp. 132-134 (3 pages)
Machine translation has been playing an important role in modern society due to its effectiveness and efficiency, but the great demand for corpora makes it difficult for users to use traditional machine translation systems. To solve this problem and improve translation quality, in November 2016 Google introduced the Google Neural Machine Translation system, which implements the latest techniques to achieve better outcomes. This conspicuous achievement has been confirmed by experiments using the BLEU score to measure the performance of different systems. With GNMT, the gap between human and machine translation is narrowing.
Keywords: machine translation machine translation improvement translation google neural machine translation neural machine translation
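BLEU, the metric used in the comparison above, can be computed with standard NLP tooling. The sketch below uses NLTK's corpus_bleu as one common implementation; the example sentences are made up for illustration.

```python
# Sketch: measuring translation quality with corpus-level BLEU (illustrative example data).
from nltk.translate.bleu_score import corpus_bleu

# Tokenized references (one or more per hypothesis) and system outputs.
references = [
    [["the", "cat", "is", "sitting", "on", "the", "mat"]],
    [["translation", "quality", "has", "improved", "greatly", "in", "recent", "years"]],
]
hypotheses = [
    ["the", "cat", "is", "sitting", "on", "the", "mat", "today"],
    ["translation", "quality", "has", "improved", "a", "lot", "in", "recent", "years"],
]

score = corpus_bleu(references, hypotheses)  # modified n-gram precisions x brevity penalty
print(f"corpus BLEU: {score:.4f}")
```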
A Robust Model for Translating Arabic Sign Language into Spoken Arabic Using Deep Learning
13
Authors: Khalid M. O. Nahar, Ammar Almomani, Nahlah Shatnawi, Mohammad Alauthman 《Intelligent Automation & Soft Computing》 SCIE, 2023, No. 8, pp. 2037-2057 (21 pages)
This study presents a novel and innovative approach to automatically translating Arabic Sign Language (ATSL) into spoken Arabic. The proposed solution utilizes a deep learning-based classification approach and the transfer learning technique to retrain 12 image recognition models. The image-based translation method maps sign language gestures to corresponding letters or words using distance measures and classification as a machine learning technique. The results show that the proposed model is more accurate and faster than traditional image-based models in classifying Arabic-language signs, with a translation accuracy of 93.7%. This research makes a significant contribution to the field of ATSL. It offers a practical solution for improving communication for individuals with special needs, such as the deaf and mute community. This work demonstrates the potential of deep learning techniques in translating sign language into natural language and highlights the importance of ATSL in facilitating communication for individuals with disabilities.
Keywords: Sign language deep learning transfer learning machine learning automatic translation of sign language natural language processing Arabic sign language
A Real-Time Automatic Translation of Text to Sign Language
14
Authors: Muhammad Sanaullah, Babar Ahmad, Muhammad Kashif, Tauqeer Safdar, Mehdi Hassan, Mohd Hilmi Hasan, Norshakirah Aziz 《Computers, Materials & Continua》 SCIE EI, 2022, No. 2, pp. 2471-2488 (18 pages)
Communication is a basic need of every human being; through it, people learn, express their feelings, and exchange ideas, but deaf people cannot listen and speak. For communication, they use various hand gestures, known as Sign Language (SL), which they learn in special schools. Since hearing people have typically not taken SL classes, they are unable to perform the signs of daily-routine sentences (e.g., "What are the specifications of this mobile phone?"). A technological solution can help overcome this communication gap so that hearing people can communicate with deaf people. This paper presents the architecture of an application named Sign4PSL that translates sentences into Pakistan Sign Language (PSL) for deaf people, with visual representation using a virtual signing character. This research aims to develop a generic, independent application that is lightweight and reusable on any platform, including web and mobile, with the ability to perform offline text translation. Sign4PSL relies on a knowledge base that stores a corpus of PSL words and their coded form in the notation system. Sign4PSL takes English text as input, performs the translation to PSL through sign language notation, and displays the gestures to the user using a virtual character. The system was tested on deaf students at a special school. The results show that the students were able to understand the story presented to them appropriately.
Keywords: Sign language sign markup language deaf communication hamburg notations machine translation
Progress in Machine Translation (Cited by 1)
15
Authors: Haifeng Wang, Hua Wu, Zhongjun He, Liang Huang, Kenneth Ward Church 《Engineering》 SCIE EI CAS, 2022, No. 11, pp. 143-153 (11 pages)
After more than 70 years of evolution, great achievements have been made in machine translation. Especially in recent years, translation quality has been greatly improved with the emergence of neural machine translation (NMT). In this article, we first review the history of machine translation from rule-based machine translation to example-based machine translation and statistical machine translation. We then introduce NMT in more detail, including the basic framework and the current dominant framework, the Transformer, as well as multilingual translation models that deal with the data sparseness problem. In addition, we introduce cutting-edge simultaneous translation methods that achieve a balance between translation quality and latency. We then describe various products and applications of machine translation. At the end of this article, we briefly discuss challenges and future research directions in this field.
Keywords: machine translation Neural machine translation Simultaneous translation
Improving Parallel Corpus Quality for Chinese-Vietnamese Statistical Machine Translation
16
Authors: Huu-anh Tran, Yuhang Guo, Ping Jian, Shumin Shi, Heyan Huang 《Journal of Beijing Institute of Technology》 EI CAS, 2018, No. 1, pp. 127-136 (10 pages)
The performance of a machine translation system depends heavily on the quantity and quality of the bilingual language resource. However, obtaining a parallel corpus that is both large-scale and of high quality is a very difficult task, especially for low-resource language pairs such as Chinese-Vietnamese. Fortunately, multilingual user-generated content (UGC), such as bilingual movie subtitles, gives us access to automatic construction of parallel corpora. Although the amount of UGC parallel data can be considerable, the original corpus is not suitable for statistical machine translation (SMT) systems: it may contain translation errors, sentence mismatches, free translations, and so on. To improve the quality of the bilingual corpus for SMT systems, three filtering methods are proposed: sentence length difference, the semantics of sentence pairs, and machine learning. Experiments are conducted on a Chinese-to-Vietnamese translation corpus. Experimental results demonstrate that all three methods effectively improve corpus quality, and machine translation performance (BLEU score) is improved by 1.32.
Keywords: parallel corpus filtering low resource languages bilingual movie subtitles machine translation Chinese-Vietnamese translation
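The first of the three filters above, sentence length difference, is easy to illustrate. The sketch below discards sentence pairs whose length ratio falls outside a band; the threshold value and whitespace tokenization are assumptions for the example, and the semantic and machine-learning filters would be applied on top of this.

```python
# Sketch: length-ratio filtering of a noisy parallel corpus (one of several possible filters).
def length_ratio_filter(pairs, max_ratio=2.0, min_len=1):
    """pairs: iterable of (source_sentence, target_sentence) strings.
    Keeps pairs whose token counts are non-trivial and whose length ratio is within max_ratio."""
    kept = []
    for src, tgt in pairs:
        n_src, n_tgt = len(src.split()), len(tgt.split())
        if n_src < min_len or n_tgt < min_len:
            continue                      # drop empty or degenerate lines
        ratio = max(n_src, n_tgt) / min(n_src, n_tgt)
        if ratio <= max_ratio:            # mismatched subtitles tend to have extreme ratios
            kept.append((src, tgt))
    return kept

# clean_corpus = length_ratio_filter(noisy_subtitle_pairs, max_ratio=2.0)
```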
On Foreign Language Creation and Rootless Back Translation - A Case Study of Snow Flower and the Secret Fan
17
Authors: GUO Ting 《Journal of Literature and Art Studies》 2017, No. 10, pp. 1354-1364 (11 pages)
With Adaptation Theory as its theoretical basis, this research compares Lisa See's original version of Snow Flower and the Secret Fan with Xin Yuanjie's Chinese translation, based on a self-compiled, manually annotated English-Chinese bilingual corpus. It aims to explore the choice of language and the choice of translation method in the foreign language creation and rootless back-translation of Snow Flower and the Secret Fan, to discuss whether these choices adapt to the source language or the target language, and to identify the communicative effects they achieve. The preliminary results show that the choice of language and translation method in foreign language creation mainly adapts to the target language, to make it easier for target English readers to understand, and partly adapts to the source language to keep local flavor; the choice of language and translation method in rootless back-translation mainly adapts to the target language to make the translation authentic, accurate, and smooth, while adaptation to the source language leads to translationese. This research sheds new light on this special kind of writing and translation.
Keywords: foreign language creation rootless back translation self-compiled corpora choice of language choice of translation method
Graph-based Lexicalized Reordering Models for Statistical Machine Translation
18
Authors: SU Jinsong, LIU Yang, LIU Qun, DONG Huailin 《China Communications》 SCIE CSCD, 2014, No. 5, pp. 71-82 (12 pages)
Lexicalized reordering models are very important components of phrase-based translation systems. By examining the reordering relationships between adjacent phrases, conventional methods learn these models from the word-aligned bilingual corpus while ignoring the effect of the number of adjacent bilingual phrases. In this paper, we propose a method that takes the number of adjacent phrases into account for better estimation of reordering models. Instead of just checking whether there is one phrase adjacent to a given phrase, our method first uses a compact structure named a reordering graph to represent all phrase segmentations of a parallel sentence; the effect of the adjacent-phrase number can then be quantified in a forward-backward fashion and finally incorporated into the estimation of the reordering models. Experimental results on the NIST Chinese-English and WMT French-Spanish data sets show that our approach significantly outperforms the baseline method.
Keywords: natural language processing statistical machine translation lexicalized reordering model reordering graph
Dependency-Based Local Attention Approach to Neural Machine Translation (Cited by 2)
19
Authors: Jing Qiu, Yan Liu, Yuhan Chai, Yaqi Si, Shen Su, Le Wang, Yue Wu 《Computers, Materials & Continua》 SCIE EI, 2019, No. 5, pp. 547-562 (16 pages)
Recently, dependency information has been used in different ways to improve neural machine translation, for example, by adding dependency labels to the hidden states of source words, or by finding the contiguous information of a source word according to the dependency tree, learning it independently, and adding it to the Neural Machine Translation (NMT) model as a unit in various ways. However, these works are all limited to using dependency information to enrich the hidden states of source words. Since many works in Statistical Machine Translation (SMT) and NMT have proven the validity and potential of dependency information, we believe that there are still many ways to apply it within the NMT architecture. In this paper, we explore a new way to use dependency information to improve NMT. Based on the theory of the local attention mechanism, we present the Dependency-based Local Attention Approach (DLAA), a new attention mechanism that allows the NMT model to trace the dependency words related to the word currently being translated. Our work also indicates that dependency information can help supervise the attention mechanism. Experimental results on the WMT 17 Chinese-to-English translation task's shared training datasets show that our model is effective and performs distinctively well on long-sentence translation.
Keywords: Neural machine translation attention mechanism dependency parsing
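The core idea of restricting attention to dependency-related source words can be illustrated with a mask built from a dependency parse. The sketch below is a generic illustration, not the DLAA model itself; the choice of letting each word attend to itself, its head, and its children is an assumption made for the example.

```python
# Sketch: building a dependency-based attention mask for source words (generic illustration).
import torch

def dependency_mask(heads):
    """heads[i] is the index of word i's dependency head (-1 for the root).
    Returns a boolean (n, n) mask where True marks positions attention may focus on."""
    n = len(heads)
    mask = torch.eye(n, dtype=torch.bool)   # each word may attend to itself
    for i, h in enumerate(heads):
        if h >= 0:
            mask[i, h] = True               # ... to its head
            mask[h, i] = True               # ... and to its children
    return mask

def masked_attention(scores, mask):
    """scores: (n, n) raw attention logits; positions outside the mask are suppressed."""
    scores = scores.masked_fill(~mask, float("-inf"))
    return torch.softmax(scores, dim=-1)

# heads = [1, -1, 1]  # e.g., a 3-word sentence whose root is word 1
# weights = masked_attention(torch.randn(3, 3), dependency_mask(heads))
```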
Corpus Augmentation for Improving Neural Machine Translation (Cited by 2)
20
Authors: Zijian Li, Chengying Chi, Yunyun Zhan 《Computers, Materials & Continua》 SCIE EI, 2020, No. 7, pp. 637-650 (14 pages)
The translation quality of neural machine translation (NMT) systems depends largely on the quality of the large-scale bilingual parallel corpora available. Research shows that under limited-resource conditions the performance of NMT drops greatly, and a large amount of high-quality bilingual parallel data is needed to train a competitive translation model. However, not all languages have large-scale, high-quality bilingual corpus resources available. In these cases, improving the quality of the corpora becomes the main way to increase the accuracy of the NMT results. This paper proposes a new method to improve data quality by using data cleaning, data expansion, and other measures to augment the data at the word and sentence level, thus improving the richness of the bilingual data. A long short-term memory (LSTM) language model is also used to ensure the fluency of the constructed sentences. At the same time, a variety of processing methods are used to improve the quality of the bilingual data. Experiments using three standard test sets are conducted to validate the proposed method, with the state-of-the-art fairseq Transformer NMT system used for training. The results show that the proposed method improves the translation results: compared with the baseline, the BLEU score of our method is increased by 2.34.
Keywords: Neural machine translation corpus augmentation model improvement deep learning data cleaning
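Word-level augmentation with a language-model fluency check, as described above, can be sketched as follows. This is an illustrative skeleton: the synonym dictionary, the `sentence_score` callback (any language model returning a fluency score, e.g., negative perplexity from an LSTM LM), and the threshold are all assumptions, not the paper's pipeline.

```python
# Sketch: word-level corpus augmentation with an LM-based fluency filter (illustrative skeleton).
import random

def augment_sentence(tokens, synonyms, n_swaps=1):
    """Replace up to n_swaps tokens with synonyms from an assumed synonym dictionary."""
    new = list(tokens)
    candidates = [i for i, w in enumerate(new) if w in synonyms]
    for i in random.sample(candidates, min(n_swaps, len(candidates))):
        new[i] = random.choice(synonyms[new[i]])
    return new

def augment_corpus(pairs, synonyms, sentence_score, min_score):
    """pairs: list of (src_tokens, tgt_tokens). sentence_score: caller-provided LM scorer
    (e.g., negative perplexity); only candidates the LM finds fluent are kept."""
    augmented = []
    for src, tgt in pairs:
        new_src = augment_sentence(src, synonyms)
        if sentence_score(new_src) >= min_score:
            augmented.append((new_src, tgt))
    return pairs + augmented
```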