Assigning Break Indices for Unrestricted Texts in Mandarin Text to Speech System (汉语文语转换系统中停顿指数的自动标注)

Cited by: 6
Abstract: This paper uses a corpus annotated with break indices based on C-ToBI and applies supervised learning to explore automatic break index annotation. Three approaches are implemented: a basic Markov model, a Markov model that incorporates word-length information, and the word-length Markov model combined with transformation-based error-driven learning. In open tests on 3,000 sentences of real text, with the basic Markov model as the baseline, the results improve with each approach. The last approach achieves the highest accuracy, 78.5% (78.6% in the Chinese abstract), and reduces the error cost by 14.5% relative to the baseline.
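The first two approaches named in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes a hidden Markov formulation in which the states are break indices, the observation for each word is its part-of-speech tag (basic model) or a (tag, word length) pair (word-length model), probabilities are add-alpha smoothed counts, and decoding uses the Viterbi algorithm. The names `observe`, `train`, and `viterbi`, the tag set, and the toy data are all hypothetical. The transformation-based error-driven step is not shown.

```python
import math
from collections import defaultdict

def observe(pos, word, use_length=True):
    """Observation for one word: POS tag alone, or (POS, capped word length). Assumed features."""
    return (pos, min(len(word), 4)) if use_length else pos

def train(tagged_sentences, alpha=1.0):
    """Count transitions and emissions from sentences of (observation, break_index) pairs."""
    trans = defaultdict(lambda: defaultdict(float))
    emit = defaultdict(lambda: defaultdict(float))
    states, vocab = set(), set()
    for sent in tagged_sentences:
        prev = "<s>"  # sentence-start pseudo state
        for obs, brk in sent:
            trans[prev][brk] += 1
            emit[brk][obs] += 1
            states.add(brk)
            vocab.add(obs)
            prev = brk
    return {"trans": trans, "emit": emit, "states": sorted(states),
            "vocab": vocab, "alpha": alpha}

def log_prob(counts, key, n_outcomes, alpha):
    """Add-alpha smoothed log probability of `key` under a count table."""
    total = sum(counts.values())
    return math.log((counts.get(key, 0.0) + alpha) / (total + alpha * n_outcomes))

def viterbi(model, observations):
    """Return the most probable break-index sequence for the observations."""
    if not observations:
        return []
    states, alpha = model["states"], model["alpha"]
    n_states = len(states)
    n_obs = len(model["vocab"]) + 1  # one extra slot for unseen observations

    def t(prev, cur):
        return log_prob(model["trans"][prev], cur, n_states, alpha)

    def e(state, obs):
        return log_prob(model["emit"][state], obs, n_obs, alpha)

    best = {s: t("<s>", s) + e(s, observations[0]) for s in states}
    back = []
    for obs in observations[1:]:
        scores, pointers = {}, {}
        for s in states:
            prev = max(states, key=lambda p: best[p] + t(p, s))
            scores[s] = best[prev] + t(prev, s) + e(s, obs)
            pointers[s] = prev
        best = scores
        back.append(pointers)
    path = [max(states, key=lambda s: best[s])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    return path[::-1]

# Toy data (made up): (word, POS, break index after the word).
raw = [[("我们", "r", 1), ("去", "v", 0), ("学校", "n", 3)]] * 2
corpus = [[(observe(pos, w), b) for w, pos, b in sent] for sent in raw]
model = train(corpus)
print(viterbi(model, [observe("r", "我们"), observe("v", "去"), observe("n", "学校")]))
```

Passing `use_length=False` to `observe` gives the observation set of the basic model, so the same training and decoding code covers both variants under these assumptions.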
Source: Journal of Chinese Information Processing (《中文信息学报》), CSCD, Peking University Core Journal, 2004, No. 5, pp. 48-55 (8 pages)
Funding: National Natural Science Foundation of China (grant 60203020)
Keywords: computer application; Chinese information processing; text to speech; break indices; Markov model; transformation-based error-driven learning