
English grapheme segmentation rules learning based on the finite generalization algorithm in English speech synthesis

Cited by: 2
Abstract: Letter-to-phoneme (L2P) conversion is a very important component of an English speech synthesis system. English has a virtually unlimited vocabulary, so it is impossible to build a lexicon that contains every word. For English words missing from the lexicon, automatically generating the phonetic transcription with an L2P algorithm is the best solution, and the first task of L2P is grapheme segmentation. This paper presents a machine learning method, the Finite Generalization Algorithm (FGA), for learning English grapheme segmentation rules. The dictionary used for learning contains 27,040 words, of which 90% are used for rule learning and the remaining 10% for testing. Over 10 rounds of cross-validation, the average instance segmentation accuracy was 99.84% on the training sets and 97.88% on the test sets, and the average word segmentation accuracy was 99.72% and 96.35% respectively. The average number of rules is 472, about 1 rule per 52 words.
Source: Journal of Computer Applications (《计算机应用》), indexed in CSCD and the Peking University Core Journals list (北大核心), 2005, No. 9, pp. 2010-2014 (5 pages)
Keywords: speech synthesis; letter-to-phoneme conversion (L2P); machine learning; finite generalization
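The evaluation figures in the abstract can be checked with a little arithmetic. The sketch below is not the authors' code: it assumes a plain contiguous 10-fold split (the abstract does not say how the folds were drawn), and the function name `ten_fold_splits` is hypothetical. It shows that a 90%/10% split of 27,040 words gives 24,336 training words per fold, which is consistent with the stated ratio of about 1 rule per 52 words if that ratio is taken over the training words.

```python
def ten_fold_splits(n_items, k=10):
    """Yield (train_indices, test_indices) for k contiguous folds.

    Assumption: contiguous folds; the abstract does not describe
    how the words were assigned to folds.
    """
    fold_size, remainder = divmod(n_items, k)
    start = 0
    for fold in range(k):
        # Spread any remainder over the first folds.
        size = fold_size + (1 if fold < remainder else 0)
        end = start + size
        test = list(range(start, end))
        train = list(range(0, start)) + list(range(end, n_items))
        yield train, test
        start = end


TOTAL_WORDS = 27040  # dictionary size stated in the abstract
AVG_RULES = 472      # average number of learned rules stated in the abstract

splits = list(ten_fold_splits(TOTAL_WORDS))
train_size = len(splits[0][0])
test_size = len(splits[0][1])
words_per_rule = train_size / AVG_RULES

print(train_size, test_size, round(words_per_rule))  # 24336 2704 52
```

Dividing by the full dictionary instead (27,040 / 472) gives about 57, so the "1 rule per 52 words" figure appears to refer to the training portion only.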

References (6)

  • 1 MORRIS I. The American Heritage Dictionary[M]. Boston, MA: Houghton Mifflin, 1991.
  • 2 MACKAY I R A. Phonetics: The Science of Speech Production[M]. Austin, Texas: Pro-Ed, 1987.
  • 3 PAGEL V, LENZO K, BLACK A. Letter to sound rules for accented lexicon compression[A]. ICSLP98[C], Sydney, Australia, 1998.
  • 4 ZHANG J, HAMILTON H J, GALLOWAY B. English Graphemes and their Corresponding Sound Units[A]. Proceedings of Pacific Association for Computational Linguistics[C], Ohme, Japan, September 1997. 351-362.
  • 5 LI Zhiqiang. Syllable theory in generative phonology (生成音系学的音节理论)[J]. Foreign Language Teaching and Research (外语教学与研究), 1997, 29(4): 5-12. Cited by: 18
  • 6 MITCHELL T. Machine Learning[M]. McGraw Hill, 1997.
  • STONE M. Cross-validation choice and assessment of statistical predictions[J]. Journal of the Royal Statistical Society, 1974, B36: 111-147.
  • SHAO Fengjing, YU Zhongqing. Principles and Algorithms of Data Mining (数据挖掘原理与算法)[M]. Beijing: China Water & Power Press (中国水利水电出版社), 2003.8.
  • HAMILTON H J, ZHANG J. The Iterated Version Space Algorithm[A]. Proc. of Ninth Florida Artificial Intelligence Research Symposium (FLAIRS'96)[C]. Daytona Beach, Florida, 1996. 209-213.

Co-citing literature (17)

Co-cited literature (8)

  • 1 K Torkkola. An efficient way to learn English grapheme-to-phoneme rules automatically. ICASSP, Minneapolis, 1993; 2: 199-202
  • 2 V Pagel, K Lenzo, A Black. Letter to sound rules for accented lexicon compression[C]. In: ICSLP98, Sydney, Australia, 1998
  • 3 I R A Mackay. Phonetics: The Science of Speech Production[M]. Pro-Ed, Austin, Texas, 1987
  • 4 T Mitchell. Machine Learning[M]. McGraw Hill, 1997
  • 5 Stone M. Cross-validation choice and assessment of statistical predictions[J]. Journal of the Royal Statistical Society, B-36, 1974: 111-147
  • 6 Zhang J, Hamilton H J. Learning English Pronunciation Rules: A Machine Learning Approach[C]. In: PWTL, IEICE Japan, Iizuka, Fukuoka, Japan, 1997: 104-115
  • 7 Walter Daelemans, Antal Van Den Bosch, Ton Weijters. IGTree: Using Trees for Compression and Classification in Lazy Learning Algorithms[J]. Artificial Intelligence Review, 1997(1-5): 407-423
  • 8 David W. Aha, Dennis Kibler, Marc K. Albert. Instance-Based Learning Algorithms[J]. Machine Learning, 1991(1): 37-66

Citing literature (2)

Second-level citing literature (3)
