The text design for continuous speech database of standard Chinese
1
Author: ZU Yiqing (Institute of Linguistics, Chinese Academy of Social Sciences, Beijing 100732). Chinese Journal of Acoustics, 1999, No. 1, pp. 56-69 (14 pages).
Well-developed continuous speech recognition and synthesis systems demand a high-quality continuous speech database that is compact and valid, and whose scientific design would benefit from incorporating linguistic and phonetic knowledge. It is argued that at the present stage the database should be limited to read speech. To describe the complex variability of continuous speech, the following speech units are proposed: (1) 401 syllables without tone; (2) 415 inter-syllabic diphones; (3) 3035 inter-syllabic triphones; (4) 781 inter-syllabic final-initial structures. The 17 basic sentence patterns of standard Chinese are summarized to cover the most important prosodic phenomena. Using an automatic method, 2393 sentences and 388 phrases are selected according to the above phonetic rules from a large corpus, which includes People's Daily in recent years, TV play scripts and dictionary entries, to serve as the reading text of a continuous speech recognition database in standard Chinese. This set of sentences and phrases covers 99.8% of syllables (not counting tones), 100% of inter-syllable diphones, 99.6% of inter-syllable triphones and 100% of sentence patterns.
Keywords: The text design for continuous speech database of standard Chinese
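The abstract does not say which automatic method was used to pick the reading text. One common way to select a compact set of sentences that covers a target inventory of phonetic units is a greedy set-cover heuristic, sketched below. The function names, the toy corpus, and the bigram-based unit extractor are all hypothetical and stand in for the real syllable, diphone and triphone inventories.

```python
def greedy_select(sentences, units_of):
    """Greedy set cover: repeatedly take the sentence adding the most uncovered units.

    This is an illustrative heuristic, not the method reported in the paper.
    """
    unit_sets = [set(units_of(s)) for s in sentences]
    covered = set()
    chosen = []
    remaining = set(range(len(sentences)))
    while remaining:
        # Ties are broken by the lowest sentence index (max returns the first maximum).
        best = max(sorted(remaining), key=lambda i: len(unit_sets[i] - covered))
        gain = unit_sets[best] - covered
        if not gain:
            break  # no remaining sentence contributes a new unit
        chosen.append(sentences[best])
        covered |= gain
        remaining.remove(best)
    return chosen, covered


def syllable_bigrams(sentence):
    """Hypothetical unit extractor: adjacent pairs of space-separated syllables."""
    syllables = sentence.split()
    return list(zip(syllables, syllables[1:]))


# Toy corpus of toneless pinyin syllables (made up for illustration).
corpus = ["ni hao ma", "ni hao", "hao ma", "wo hen hao"]
selected, covered = greedy_select(corpus, syllable_bigrams)
all_units = {u for s in corpus for u in syllable_bigrams(s)}
coverage = len(covered) / len(all_units)
```

Here `coverage` plays the role of the percentages quoted in the abstract: the share of the unit inventory that the selected text contains. A real design would run the same loop with separate inventories for each unit type and might weight rare units more heavily.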
A robust feature extraction approach based on an auditory model for classification of speech and expressiveness (cited by: 5)
2
Authors: 孙颖, V. Werner, 张雪英. Journal of Central South University (SCIE, EI, CAS), 2012, No. 2, pp. 504-510 (7 pages).
Based on an auditory model, the zero-crossings with maximal Teager energy operator (ZCMT) feature extraction approach is described and then applied to speech and emotion recognition. Three kinds of experiments were carried out. The first kind consists of isolated word recognition experiments on neutral (non-emotional) speech. The results show that the ZCMT approach improves recognition accuracy by 3.47% on average compared with the Teager energy operator (TEO). Thus, the ZCMT feature can be considered a noise-robust feature for speech recognition. The second kind consists of mono-lingual emotion recognition experiments using the Taiyuan University of Technology (TYUT) and Berlin databases. The average recognition rate of the ZCMT approach is 82.19%, which indicates that the ZCMT features characterize speech emotions effectively. The third kind consists of cross-lingual experiments with three languages. The accuracy of the ZCMT approach drops by only 1.45%, which indicates that the ZCMT features characterize emotions in a language-independent way.
Keywords: speech recognition; emotion recognition; zero-crossings; Teager energy operator; speech database
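The abstract names the two ingredients of ZCMT, zero-crossings and the Teager energy operator, but does not spell out how they are combined. The sketch below only illustrates those ingredients: the standard discrete Teager energy operator, psi[n] = x[n]^2 - x[n-1]*x[n+1], and upward zero-crossing detection. Taking the maximum Teager energy between successive upward crossings is an assumption made for illustration, and all function names are hypothetical. It is not the authors' full feature extraction pipeline.

```python
import math


def teager_energy(x):
    """Discrete Teager energy psi[n] = x[n]^2 - x[n-1]*x[n+1] for interior samples.

    The returned list is one sample shorter at each end: psi[k] belongs to x[k + 1].
    """
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


def upward_zero_crossings(x):
    """Indices n where the signal rises from negative (at n-1) to non-negative (at n)."""
    return [n for n in range(1, len(x)) if x[n - 1] < 0 <= x[n]]


def max_teager_per_interval(x):
    """Peak Teager energy between successive upward zero crossings (assumed pairing)."""
    psi = teager_energy(x)
    crossings = upward_zero_crossings(x)
    peaks = []
    for start, end in zip(crossings, crossings[1:]):
        lo, hi = max(start, 1), min(end, len(x) - 1)
        segment = psi[lo - 1:hi - 1]  # shift by one: psi[k] belongs to sample k + 1
        if segment:
            peaks.append(max(segment))
    return crossings, peaks


# Pure sinusoid: the Teager energy is constant and equals A^2 * sin(omega)^2.
amplitude, omega = 2.0, 0.3
signal = [amplitude * math.sin(omega * n) for n in range(100)]
crossings, peaks = max_teager_per_interval(signal)
expected_energy = amplitude ** 2 * math.sin(omega) ** 2
```

For a pure tone every entry of `peaks` matches `expected_energy` up to rounding, which shows why the operator is often read as a joint measure of amplitude and frequency. The spacing of `crossings` gives the period of the tone in samples.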