A Substitution-Translation-Restoration Framework for Handling Unknown Words in Statistical Machine Translation (cited by: 2)
Authors: 张家俊, 翟飞飞, 宗成庆. Journal of Computer Science & Technology (SCIE, EI, CSCD), 2013, No. 5, pp. 907-918 (12 pages).
Unknown words are one of the key factors that greatly affect translation quality. Traditionally, nearly all related research has focused on obtaining translations of the unknown words. However, these approaches have two disadvantages. On the one hand, they usually rely on many additional resources such as bilingual web data; on the other hand, they cannot guarantee good reordering and lexical selection of surrounding words. This paper offers a new perspective on handling unknown words in statistical machine translation (SMT). Instead of making great efforts to find the translation of unknown words, we focus on determining the semantic function of the unknown word in the test sentence and keeping that semantic function unchanged in the translation process. In this way, unknown words can help the phrase reordering and lexical selection of their surrounding words even though they remain untranslated. To determine the semantic function of an unknown word, we employ a distributional semantic model and a bidirectional language model. Extensive experiments on both phrase-based and linguistically syntax-based SMT models in Chinese-to-English translation show that our method can substantially improve translation quality.
Keywords: statistical machine translation; distributional semantics; bidirectional language model