A High-Accuracy Approach for Predicting the Pronunciation of Out-of-Vocabulary English Words Based on Exemplar Learning

Original title: 一种基于实例学习的高精度英文未登录词发音的自动预测方法

Abstract: In English text-to-speech (TTS) systems, the pronunciation of each word in the text is needed to synthesize speech. Because no dictionary, however large, can contain every word that occurs in real-world text, an algorithm is needed to predict the pronunciation of words that are missing from the lexicon (out-of-vocabulary words). This paper introduces an approach based on exemplar learning and evaluates its performance on a large-scale English dictionary. Experimental results show that the method achieves a word-level pronunciation accuracy of 70.1%, clearly higher than previously published approaches.
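The abstract does not spell out the algorithm, so the following is only a generic sketch of how exemplar (memory-based) learning can be applied to letter-to-phoneme prediction, not the authors' method. It assumes a toy lexicon in which every letter is pre-aligned with exactly one phoneme symbol, a context window of one letter on each side, and a simple overlap similarity that counts the focus letter twice. The names `ALIGNED_LEXICON`, `WINDOW`, `build_memory`, `similarity`, and `predict` are all hypothetical.

```python
from collections import Counter

# Hypothetical toy lexicon: each letter is pre-aligned with one phoneme symbol.
# (A real lexicon would need an alignment step and a marker for silent letters.)
ALIGNED_LEXICON = [
    ("cat", ["K", "AE", "T"]),
    ("cab", ["K", "AE", "B"]),
    ("bat", ["B", "AE", "T"]),
    ("bit", ["B", "IH", "T"]),
    ("kit", ["K", "IH", "T"]),
]

WINDOW = 1  # letters of context on each side of the focus letter (assumed)
PAD = "#"   # padding symbol for word boundaries


def windows(word, k=WINDOW):
    """Return one letter window per letter of `word`, padded at the edges."""
    padded = PAD * k + word + PAD * k
    return [tuple(padded[i:i + 2 * k + 1]) for i in range(len(word))]


def build_memory(lexicon):
    """Store every (letter window, phoneme) pair as an exemplar."""
    memory = []
    for word, phonemes in lexicon:
        if len(word) != len(phonemes):
            raise ValueError(f"{word!r} is not aligned one letter per phoneme")
        memory.extend(zip(windows(word), phonemes))
    return memory


def similarity(a, b):
    """Count matching positions; the focus (middle) letter counts twice."""
    focus = len(a) // 2
    matches = sum(x == y for x, y in zip(a, b))
    return matches + (a[focus] == b[focus])


def predict(word, memory):
    """Predict one phoneme per letter by voting among the closest exemplars."""
    result = []
    for w in windows(word):
        best = max(similarity(w, ex) for ex, _ in memory)
        votes = Counter(ph for ex, ph in memory if similarity(w, ex) == best)
        result.append(votes.most_common(1)[0][0])
    # Drop silent-letter markers, if the lexicon uses "_" for them.
    return [ph for ph in result if ph != "_"]


if __name__ == "__main__":
    memory = build_memory(ALIGNED_LEXICON)
    print(predict("kat", memory))  # "kat" is not in the toy lexicon
```

With this toy lexicon, the unseen spelling "kat" is mapped to `['K', 'AE', 'T']` because its letter windows are closest to stored windows from "kit", "cat", and "bat". A window this narrow cannot capture longer-range spelling effects such as a final silent "e", which is one reason practical systems use wider contexts and more careful similarity measures.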

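Authors: Wang Hao (王浩), Chen Guilin (陈桂林), Xu Liangxian (徐良贤)

Affiliations: Department of Computer Science and Technology, Shanghai Jiao Tong University; Motorola China Research Center

Source: Journal of Computer Research and Development (《计算机研究与发展》), indexed in EI, CSCD, and the Peking University Core Journals list; 2004, No. 5, pp. 796-801 (6 pages)

Keywords: machine learning; exemplar learning

Classification: TP181 [Automation and Computer Technology: Control Theory and Control Engineering]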

References (8)

[1] M Dedina, H Nusbaum. PRONOUNCE: A program for pronunciation by analogy. Computer Speech and Language, 1991, 5(1): 55-64
[2] R I Damper, Y Marchand, M J Adamson et al. A comparison of letter-to-sound conversion techniques for English text-to-speech synthesis. Proceedings of the Institute of Acoustics, 1998, 20(6): 245-254
[3] Y Marchand, R I Damper. A multi-strategy approach to improving pronunciation by analogy. Computational Linguistics, 2000, 26(2): 195-219
[4] V Pagel, K Lenzo, A Black. Letter to sound rules for accented lexicon compression. In: Robert H Mannell, Jordi Robert-Ribes eds. Proc of the 5th Int'l Conf on Spoken Language Processing, v91.5. Sydney, Australia: Australian Speech and Technology Association, Incorporated (ASSTA), 1998. 2015-2018
[5] H S Elovitz, R Johnson, A McHugh et al. Letter-to-sound rules for automatic translation of English text to phonetics. IEEE Trans on Acoustics, Speech and Signal Processing, 1976, 24(6): 446-459
[6] N McCulloch, M Bedworth, J Bridle. NETspeak: A re-implementation of NETtalk. Computer Speech and Language, 1987, (2): 284-301
[7] C Stanfill, D Waltz. Toward memory-based reasoning. Communications of the ACM, 1986, 29(12): 1212-1228
[8] T G Dietterich, G Bakiri. Error-correcting output codes: A general method for improving multiclass inductive learning programs. In: Dannenberg ed. Proc of the 9th National Conf on Artificial Intelligence, Vol 2. Anaheim, California: AAAI Press, 1991. 572-577

Note: The phoneme set used by Elovitz's rule set differs from that of the CMU dictionary. We converted its pronunciations to the phoneme set used by CMU, which may have some effect on the final accuracy results.
