Abstract
This paper proposes an improved phrase extraction algorithm. The algorithm first handles the case in the word alignment matrix where one Chinese word is aligned to multiple Uyghur words (including non-contiguous ones), and then applies Och's method to test the candidate. If the conditions are met, the phrase pair is extracted. Experimental results show that the improved algorithm extracts more Chinese-Uyghur phrase pairs and improves the extraction of phrase translation pairs.
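The abstract does not spell out the paper's procedure, so the following is only a minimal sketch of the standard alignment-consistency test usually attributed to Och, on which the method builds. It assumes the alignment is given as a set of (source index, target index) links; the function name `extract_phrase_pairs` and the `max_len` parameter are hypothetical, and the sketch does not extend phrases over unaligned boundary words. A source word linked to non-contiguous target words is handled here simply by taking the smallest target span that covers all of its links, which is an assumption and not necessarily the paper's rule.

```python
def extract_phrase_pairs(alignment, src_len, max_len=7):
    """Return phrase pairs consistent with a word alignment.

    alignment: set of (s, t) links, 0-based source/target word indices.
    src_len:   number of source words.
    max_len:   maximum phrase length on either side (hypothetical default).
    Each result is ((s_start, s_end), (t_start, t_end)), inclusive spans.
    """
    pairs = set()
    for s_start in range(src_len):
        for s_end in range(s_start, min(s_start + max_len, src_len)):
            # Target positions linked to any word in the source span.
            linked = [t for (s, t) in alignment if s_start <= s <= s_end]
            if not linked:
                continue
            # Smallest target span covering all links; gaps left by
            # non-contiguous links fall inside this span.
            t_start, t_end = min(linked), max(linked)
            if t_end - t_start + 1 > max_len:
                continue
            # Consistency: no target word inside the span may be linked
            # to a source word outside the source span.
            violated = any(
                t_start <= t <= t_end and not (s_start <= s <= s_end)
                for (s, t) in alignment
            )
            if not violated:
                pairs.add(((s_start, s_end), (t_start, t_end)))
    return pairs


# Source word 1 is linked to target words 1 and 3 (non-contiguous);
# target word 2 in the gap belongs to source word 2.
example = {(0, 0), (1, 1), (1, 3), (2, 2)}
print(sorted(extract_phrase_pairs(example, src_len=3)))
```

In this example, source word 1 alone is not extracted because the gap in its target span is filled by a word linked elsewhere; it is only extracted together with source word 2. When the gap word is unaligned, the one-to-many link is extracted as a single pair.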
Source
Modern Computer (《现代计算机》)
2010, No. 5, pp. 9-11 (3 pages)
Funding
National Natural Science Foundation of China (Nos. 60663006, 60763006)
Keywords
Phrase-Based Statistical Machine Translation
Phrase Extraction
Chinese-Uyghur Phrase Pairs
Translation Model