A Personal Name Translation System Based on Multi-Model Fusion (cited by: 2)

Chinese-English Back-Transliteration of Human Name Based on Multiple Models

Abstract: This paper proposes a framework for Chinese-English back-transliteration of personal names that fuses multiple models using weighted finite-state transducers (WFSTs). Two grapheme-based (character-based) conversion models and two phoneme-based (pronunciation-based) conversion models form the core of the framework, and a WFST combines them into a single system for translating names. Compared with a single model, the advantage of the method is that it combines information from different sources and makes the most of the available data. Experiments show that the multi-model fusion approach reduces the name-translation error rate by 7.14% compared with a single model.
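The fusion idea in the abstract can be illustrated with a minimal sketch. It does not reproduce the paper's WFST construction: it swaps in a plain log-linear combination over candidate lists, where each model's probability becomes a negative-log cost (the kind of weight a tropical-semiring WFST would carry) and the costs are summed with per-model weights. The function name `combine_scores`, the `floor` smoothing value, the equal weights, and all candidate names and probabilities are hypothetical, chosen only for illustration.

```python
import math


def combine_scores(model_outputs, weights, floor=1e-6):
    """Rank candidates by a weighted sum of negative log probabilities.

    model_outputs: list of dicts, one per model, mapping candidate -> probability.
    weights: list of non-negative weights, one per model.
    floor: probability assumed when a model does not propose a candidate
           (an illustrative smoothing choice, not taken from the paper).
    Returns a list of (candidate, cost) pairs, lowest cost first.
    """
    if len(model_outputs) != len(weights):
        raise ValueError("need exactly one weight per model")
    candidates = set().union(*model_outputs)
    costs = {}
    for cand in candidates:
        cost = 0.0
        for probs, weight in zip(model_outputs, weights):
            cost += weight * -math.log(probs.get(cand, floor))
        costs[cand] = cost
    return sorted(costs.items(), key=lambda item: item[1])


# Toy, made-up outputs standing in for two grapheme-based and two
# phoneme-based models; the numbers are not experimental results.
grapheme_a = {"Clinton": 0.5, "Klinton": 0.4, "Clindon": 0.1}
grapheme_b = {"Clinton": 0.45, "Klinton": 0.45}
phoneme_a = {"Klinton": 0.5, "Clinton": 0.3, "Clindon": 0.2}
phoneme_b = {"Clinton": 0.6, "Clindon": 0.3}

ranked = combine_scores(
    [grapheme_a, grapheme_b, phoneme_a, phoneme_b],
    weights=[0.25, 0.25, 0.25, 0.25],
)
best_candidate, best_cost = ranked[0]
print(best_candidate, round(best_cost, 3))
```

In this toy setting a candidate that every model supports moderately well outranks one that a single model prefers strongly, which is the intuition behind drawing on several information sources at once.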

Authors: Pang Wei (庞薇), Xu Bo (徐波)

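Affiliations: Digital Content Technology Research Center, Institute of Automation, Chinese Academy of Sciences; National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences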

Source: Journal of Chinese Information Processing (《中文信息学报》), CSCD, Peking University Core Journal, 2009, No. 1, pp. 44-49 (6 pages)

Funding: National 863 Program of China (2006AA01Z194)

Keywords: computer application; Chinese information processing; multiple model combination; transliteration; named entity; weighted finite-state transducer (WFST)

Classification: TP391 [Automation and Computer Technology: Computer Application Technology]
