
Hybrid algorithm of polyphonic word disambiguation in Uyghur language (original title: 维吾尔语多音词消歧混合方法)
Abstract Polyphonic words in Uyghur, that is, words that are spelled identically but pronounced differently, are one of the main factors limiting the intelligibility of Uyghur speech synthesis systems. A Uyghur word consists of a stem and affixes. Although the number of polyphonic stems is small, attaching various affixes to them produces a large number of polyphonic words. This paper studies 16 polyphonic stems that are frequently mispronounced in Uyghur. Starting from the differing characteristics of these words, it applies a different rule to each kind and uses a maximum entropy model to disambiguate the polyphonic words that the rules do not cover. A log-likelihood ratio method is used to select keywords, and a greedy algorithm is used to select the best feature templates. In performance tests, the algorithm reaches an average polyphonic word disambiguation accuracy of 87.7%.
Source Computer Engineering and Applications (《计算机工程与应用》), indexed in CSCD and the Peking University Core Journals list, 2011, No. 35, pp. 158-160, 170 (4 pages)
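Funding National Natural Science Foundation of China (No. 61065005, No. 61062008); Open Project of the Xinjiang Uyghur Autonomous Region Multilingual Information Technology Laboratory (No. XJDX0905)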
Keywords Uyghur language; polyphonic word; maximum entropy model
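
The abstract names two selection steps, keyword extraction by log-likelihood ratio and greedy choice of feature templates, without giving details. The following is a minimal Python sketch of both under stated assumptions, not the authors' implementation: the 2x2 contingency-table form of the ratio (Dunning's G^2 statistic), the forward-selection stopping rule, the template names, and the `score` callback are all hypothetical. In practice `score` would train a maximum entropy classifier with the given templates and return its held-out accuracy.

```python
import math
from typing import Callable, List, Sequence


def _xlogx(k: float) -> float:
    """Return k * ln(k), with the usual convention 0 * ln(0) = 0."""
    return k * math.log(k) if k > 0 else 0.0


def log_likelihood_ratio(k11: int, k12: int, k21: int, k22: int) -> float:
    """G^2 statistic for a 2x2 table of context-word counts.

    k11: contexts of pronunciation A that contain the candidate word
    k12: contexts of pronunciation A that do not contain it
    k21: contexts of the other pronunciations that contain it
    k22: contexts of the other pronunciations that do not contain it
    A larger value means the word is more strongly tied to one pronunciation.
    """
    n = k11 + k12 + k21 + k22
    rows = _xlogx(k11 + k12) + _xlogx(k21 + k22)
    cols = _xlogx(k11 + k21) + _xlogx(k12 + k22)
    cells = _xlogx(k11) + _xlogx(k12) + _xlogx(k21) + _xlogx(k22)
    # Clamp at zero to absorb tiny negative values from rounding.
    return max(0.0, 2.0 * (cells + _xlogx(n) - rows - cols))


def greedy_select(
    candidates: Sequence[str],
    score: Callable[[List[str]], float],
) -> List[str]:
    """Forward selection: add the template that helps most until none helps."""
    selected: List[str] = []
    remaining = list(candidates)
    best = score(selected)
    while remaining:
        top_score, top = max((score(selected + [c]), c) for c in remaining)
        if top_score <= best:
            break  # no remaining template improves the score
        selected.append(top)
        remaining.remove(top)
        best = top_score
    return selected


if __name__ == "__main__":
    # A word spread evenly over both pronunciations carries no signal.
    print(log_likelihood_ratio(10, 10, 10, 10))
    # A word seen with only one pronunciation scores high.
    print(log_likelihood_ratio(20, 0, 0, 20))

    # Toy stand-in for held-out accuracy; the gains are invented for the demo.
    toy_gain = {"prev_word": 0.5, "next_word": 0.3, "prev_pos": -0.1}
    print(greedy_select(list(toy_gain), lambda ts: sum(toy_gain[t] for t in ts)))
```

Words whose ratio exceeds a chosen threshold would be kept as keyword features for the classifier. The greedy loop trades optimality for cost: it evaluates on the order of the square of the number of templates rather than every subset.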

