Statistical machine translation for low-resource languages suffers from a scarcity of training corpora. Several methods have been proposed to address this, such as using a pivot language as a bridge between the source and target languages. However, errors accumulate along such extended translation pipelines. In this paper, we propose an approach to low-resource language translation that exploits the pronunciation correlations between languages. We find that pronunciation features improve translation quality in both the Chinese-Vietnamese and Vietnamese-Chinese directions. Experimental results show that the proposed model improves translation performance, with a maximum gain of 1.03 in bilingual evaluation understudy (BLEU) score.
Funding: Supported by the National Key Basic Research and Development (973) Program of China (No. 2013CB329303), the National Natural Science Foundation of China (Nos. 61502035, 61132009, and 61671064), and the Beijing Advanced Innovation Center for Imaging Technology (No. BAICIT-2016007).