Research on Chinese-Uyghur Phrase Table Filtering Integrating Deep Learning Features
Original title: 融合深度学习特征的汉维短语表过滤研究
Cited by: 1
Abstract: Chinese-Uyghur machine translation faces several challenges: large differences in word formation and word order between the two languages, phrase tables containing much redundant and unreasonable information, scarce bilingual resources, and poorly performing morphological analysis tools. Together these seriously degrade translation quality. To address the many unreasonable phrase pairs in Chinese-Uyghur phrase tables, which hurt both translation performance and decoding efficiency, this paper proposes a phrase table filtering model that combines deep learning features (a recurrent neural network feature and a context feature of each Chinese-Uyghur phrase pair) with a shallow feature (the average word co-occurrence of the phrase pair). The probabilities of these three features, together with the training instances, are fed into a filtering model based on a naive Bayes classifier for training. The model thereby draws on richer semantic and contextual information between candidate Chinese and Uyghur phrases. Experimental results show that the proposed method effectively removes unreasonable phrase pairs from the Chinese-Uyghur phrase table and improves both translation performance and decoding efficiency.
Author: ZHU Shun-le (朱顺乐), Zhejiang Ocean University, Zhoushan 316000, China
Affiliation: Zhejiang Ocean University (浙江海洋大学)
Source: Computer Technology and Development (《计算机技术与发展》), 2018, No. 7, pp. 149-154 (6 pages)
Funding: Natural Science Foundation of Zhejiang Province (LY16F020014); Natural Science Foundation of Zhejiang Province, Youth Science Fund (LQ16A010003)
Keywords: recurrent neural network; naive Bayes; skip-gram; phrase table filtering; Chinese-Uyghur translation
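
Illustrative sketch (not from the paper): the abstract describes feeding three per-phrase-pair feature scores into a naive Bayes classifier that decides whether a phrase pair is kept. The Python below is a minimal sketch of that idea, assuming a Gaussian likelihood for each feature. The Gaussian form, the function names, the feature values, and the labels are all hypothetical and chosen only for illustration; the paper's actual feature extraction (RNN, context, and co-occurrence scoring) is not reproduced here.

```python
import math
from collections import defaultdict

def fit_gaussian_nb(rows, labels):
    """Estimate a class prior and per-feature Gaussian parameters for each label."""
    by_class = defaultdict(list)
    for row, label in zip(rows, labels):
        by_class[label].append(row)
    model = {}
    for label, items in by_class.items():
        n = len(items)
        means = [sum(col) / n for col in zip(*items)]
        # A small variance floor avoids division by zero on tiny toy data.
        variances = [
            max(sum((x - m) ** 2 for x in col) / n, 1e-6)
            for col, m in zip(zip(*items), means)
        ]
        model[label] = (n / len(rows), means, variances)
    return model

def log_posterior(model, row, label):
    """Unnormalised log posterior under the naive (independent-feature) assumption."""
    prior, means, variances = model[label]
    total = math.log(prior)
    for x, m, v in zip(row, means, variances):
        total += -0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
    return total

def keep_phrase_pair(model, row):
    """Keep the pair if class 1 (reasonable) is more probable than class 0."""
    return log_posterior(model, row, 1) > log_posterior(model, row, 0)

# Hypothetical toy data: (rnn_score, context_score, avg_word_cooccurrence).
train_rows = [
    (0.90, 0.80, 0.70), (0.80, 0.90, 0.60), (0.85, 0.75, 0.80),  # reasonable pairs
    (0.20, 0.30, 0.10), (0.10, 0.20, 0.20), (0.30, 0.10, 0.15),  # unreasonable pairs
]
train_labels = [1, 1, 1, 0, 0, 0]

model = fit_gaussian_nb(train_rows, train_labels)

# Hypothetical candidate phrase pairs, identified by placeholder names.
candidates = {"pair_a": (0.88, 0.82, 0.65), "pair_b": (0.15, 0.25, 0.12)}
filtered = {
    name: feats for name, feats in candidates.items()
    if keep_phrase_pair(model, feats)
}
```

In this toy setup, a pair whose three scores resemble the "reasonable" training examples is kept and the rest are dropped from the table. A real system would learn the classifier from labeled phrase pairs extracted from the actual phrase table.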