
Bilingual Text Error Correction Based on a Topic Translation Model (cited by: 1)

TOPICS TRANSLATION MODEL-BASED BILINGUAL TEXT ERRORS CORRECTION
Abstract: With the globalisation of information in recent years, the mixing of multiple languages has become increasingly common in social network text, and Chinese text interspersed with other languages is now routine. Most existing natural language processing algorithms assume a single language and handle mixed-language text poorly, so preprocessing such text before other natural language processing tasks is particularly important. To address the scarcity of bilingually aligned corpora in the semantic space of network text, this paper proposes a method based on a topic translation model: it uses corpora from different semantic spaces to compute bilingual alignment probabilities for the network text semantic space, then combines these with a neural network language model to translate the English in mixed network text into the corresponding Chinese. Experiments on a manually annotated test corpus, including several comparison experiments, show that the method is effective and improves translation accuracy.
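The abstract's idea of combining topic-dependent alignment probabilities with a language model can be illustrated with a minimal sketch. This is not the authors' formulation, which the record does not give: the mixture p(zh | en) = Σ_z p(zh | en, z) · p(z | context), the log-linear combination with a language model score, the toy probability table, and all names (`TRANSLATION`, `topic_mixture`, `best_translation`, `lm_logprob`, `lm_weight`) are assumptions made purely for illustration.

```python
import math

# Hypothetical toy table, not taken from the paper:
# p(zh | en, topic), as might be estimated from an out-of-domain aligned corpus.
TRANSLATION = {
    ("apple", "tech"): {"苹果公司": 0.8, "苹果": 0.2},
    ("apple", "food"): {"苹果公司": 0.1, "苹果": 0.9},
}


def topic_mixture(word, topic_posterior):
    """Assumed mixture: p(zh | en) = sum_z p(zh | en, z) * p(z | context)."""
    scores = {}
    for topic, p_topic in topic_posterior.items():
        for zh, p in TRANSLATION.get((word, topic), {}).items():
            scores[zh] = scores.get(zh, 0.0) + p * p_topic
    return scores


def best_translation(word, topic_posterior, lm_logprob, lm_weight=1.0):
    """Pick the candidate maximising log p(zh | en) + lm_weight * LM score.

    lm_logprob is any callable returning a log-probability for a candidate;
    it stands in for the neural network language model named in the abstract.
    """
    # Drop zero-probability candidates so math.log is always defined.
    candidates = {
        zh: p for zh, p in topic_mixture(word, topic_posterior).items() if p > 0
    }
    if not candidates:
        return None
    return max(
        candidates,
        key=lambda zh: math.log(candidates[zh]) + lm_weight * lm_logprob(zh),
    )


if __name__ == "__main__":
    uniform_lm = lambda zh: 0.0  # placeholder language model with no preference
    print(best_translation("apple", {"tech": 0.9, "food": 0.1}, uniform_lm))
    print(best_translation("apple", {"tech": 0.1, "food": 0.9}, uniform_lm))
```

In this sketch the topic posterior of the surrounding message decides which Chinese rendering of the English word wins; a real language model passed as `lm_logprob` would further favour candidates that fit the Chinese context.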
Authors: 陈欢 (Chen Huan), 张奇 (Zhang Qi)
Source: Computer Applications and Software (《计算机应用与软件》), CSCD, 2016, No. 3, pp. 284-287 (4 pages)
Keywords: network text; topic translation model; neural network language model

References (10)

  • [1] Aw A T, Zhang M, Xiao J, et al. A phrase-based statistical model for SMS text normalization[C]//Proceedings of the COLING/ACL on Main conference poster sessions. Association for Computational Linguistics, 2006: 33-40.
  • [2] Kobus C, Yvon F, Damnati G. Normalizing SMS: are two metaphors better than one?[C]//Proceedings of the 22nd International Conference on Computational Linguistics-Volume 1. Association for Computational Linguistics, 2008: 441-448.
  • [3] Han B, Baldwin T. Lexical normalisation of short text messages: Makn sens a #twitter[C]//Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies-Volume 1. Association for Computational Linguistics, 2011: 368-378.
  • [4] Liu F, Weng F, Jiang X. A broad-coverage normalization system for social media language[C]//Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Long Papers-Volume 1. Association for Computational Linguistics, 2012: 1035-1044.
  • [5] Han B, Cook P, Baldwin T. Automatically constructing a normalisation dictionary for microblogs[C]//Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning. Association for Computational Linguistics, 2012: 421-432.
  • [6] Wang P, Ng H T. A beam-search decoder for normalization of social media text with application to machine translation[C]//Proceedings of NAACL-HLT, 2013: 471-481.
  • [7] Su J, Wu H, Wang H, et al. Translation model adaptation for statistical machine translation with monolingual topic information[C]//Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Long Papers-Volume 1. Association for Computational Linguistics, 2012: 459-468.
  • [8] Huang E H, Socher R, Manning C D, et al. Improving word representations via global context and multiple word prototypes[C]//Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Long Papers-Volume 1. Association for Computational Linguistics, 2012: 873-882.
  • [9] Gruber A, Weiss Y, Rosen-Zvi M. Hidden topic Markov models[C]//International Conference on Artificial Intelligence and Statistics, 2007: 163-170.
  • [10] Zhang Q, Chen H, Huang X. Chinese-English mixed text normalization[C]//Proceedings of the 7th ACM international conference on Web search and data mining. ACM, 2014: 433-442.
