This paper describes experiments with Korean-to-Vietnamese statistical machine translation (SMT). Korean is a morphologically complex language without clear optimal word boundaries, which poses a major problem for translation into or from Korean. To address this problem, we present a method for Korean morphological analysis that uses a pre-analyzed partial word-phrase dictionary (PWD). In addition, we build a Korean-Vietnamese parallel corpus for training SMT models by collecting text from multilingual magazines. We then apply this morphological analysis to the Korean sentences in the collected parallel corpus as a preprocessing step. The experimental results demonstrate a remarkable improvement in Korean-to-Vietnamese translation quality in terms of bilingual evaluation understudy (BLEU).
Companies such as Google, MSN, and Yahoo provide translation services on their websites, generating translations based on statistical analysis of bilingual text corpora. Human translation may seem inferior in the face of the huge amount of information and the rapid development of computer science. Yet despite the functionality and versatility of statistical machine translation, it may never take the place of human effort. Teachers should guide students in using online translation systems.
Funding: Supported by the Institute for Information & communications Technology Promotion under Grant No. R0101-16-0176, the Project of Core Technology Development for Human-Like Self-Taught Learning Based on Symbolic Approach.