Alignment of the Polish-English Parallel Text for a Statistical Machine Translation
Authors: Krzysztof Wolk, Krzysztof Marasek. Computer Technology and Application, 2013, No. 11, pp. 575-583 (9 pages)
Text alignment is crucial to the accuracy of MT (Machine Translation) systems, some NLP (Natural Language Processing) tools, and any other text processing tasks requiring bilingual data. This research proposes a language-independent sentence alignment approach based on Polish (not a position-sensitive language) to English experiments. The alignment approach was developed on the TED (Translanguage English Database) talks corpus, but can be used for any text domain or language pair. The proposed approach implements various heuristics for sentence recognition. Some of them take synonyms and semantic text structure analysis into account as additional information. Minimization of data loss was ensured. The solution is compared to other sentence alignment implementations. An improvement in MT system score for text processed with the described tool is also shown.
Keywords: text alignment; NLP tools; machine learning; text corpora processing
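The abstract does not describe the paper's heuristics in detail, so the sketch below is not the authors' method. It only illustrates what sentence alignment means in general, using a much simpler, classic idea: dynamic programming over sentence character lengths. The function names, the set of allowed alignment patterns, and the penalty values are all assumptions made for illustration.

```python
from typing import List, Tuple

# Allowed alignment patterns: (source sentences consumed, target sentences consumed).
BEADS = [(1, 1), (1, 0), (0, 1), (2, 1), (1, 2)]
# Hypothetical fixed penalties; a real aligner would estimate these from data.
PENALTY = {(1, 1): 0.0, (1, 0): 3.0, (0, 1): 3.0, (2, 1): 1.0, (1, 2): 1.0}


def length_cost(src_len: int, tgt_len: int) -> float:
    """Relative character-length mismatch between two spans (0.0 means equal)."""
    longest = max(src_len, tgt_len)
    return 0.0 if longest == 0 else abs(src_len - tgt_len) / longest


def align(src: List[str], tgt: List[str]) -> List[Tuple[List[str], List[str]]]:
    """Return the cheapest sequence of aligned sentence groups."""
    n, m = len(src), len(tgt)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    # Forward pass: extend every reachable state by each allowed pattern.
    for i in range(n + 1):
        for j in range(m + 1):
            if cost[i][j] == inf:
                continue
            for di, dj in BEADS:
                ni, nj = i + di, j + dj
                if ni > n or nj > m:
                    continue
                s_len = sum(len(s) for s in src[i:ni])
                t_len = sum(len(t) for t in tgt[j:nj])
                c = cost[i][j] + PENALTY[(di, dj)] + length_cost(s_len, t_len)
                if c < cost[ni][nj]:
                    cost[ni][nj] = c
                    back[ni][nj] = (di, dj)
    # Backward pass: follow the stored steps from the end to the start.
    result = []
    i, j = n, m
    while i > 0 or j > 0:
        di, dj = back[i][j]
        result.append((src[i - di:i], tgt[j - dj:j]))
        i, j = i - di, j - dj
    result.reverse()
    return result


if __name__ == "__main__":
    polish = ["Dzień dobry.", "Nazywam się Jan i mieszkam w Warszawie."]
    english = ["Good morning.", "My name is Jan.", "I live in Warsaw."]
    for src_group, tgt_group in align(polish, english):
        print(src_group, "<->", tgt_group)
```

Length alone is a weak signal, which is presumably why the paper adds information such as synonyms and semantic structure; this sketch uses none of that.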