Realtime Extraction and 3-Dimension Display of a Speaker's Features from Text-Unconfined Continuous Speech

Abstract: A method is presented for real-time extraction and three-dimensional display of a speaker's features from text-unconfined continuous speech. The speaker feature is the long-term average of the sequence of linear prediction (LPC) reflection coefficient vectors estimated from multiple voiced frames by the maximum entropy spectral estimation method. The average within-speaker difference function, the average between-speaker difference function, and the ratio of the average between-speaker difference to the average within-speaker difference are defined and used to analyze how well the features of different speakers can be distinguished. The three-dimensional display of the feature vectors is implemented using the principle of pseudocolor encoding. A master-slave system consisting of an IBM PC/AT and a TMS32010 is designed so that speech sampling and parameter estimation proceed synchronously, which achieves real-time operation. Experimental results show that the extracted features discriminate well between speakers, that the display is readable and suited to intuitive analysis and overall observation, and that the system performs well in real time.
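
The feature pipeline summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the exact definitions, so the squared Euclidean distance used for the difference measures, the function names, and the assumption that the input frames have already been classified as voiced are all illustrative choices. The reflection coefficients are computed with Burg's recursion, the usual realization of maximum entropy spectral estimation; sign conventions for reflection coefficients vary between texts.

```python
import math


def burg_reflection(frame, order):
    """Reflection coefficients of one frame via Burg's recursion.

    The frame must contain more than order + 1 samples.
    """
    f = list(frame[1:])    # forward prediction errors
    b = list(frame[:-1])   # backward prediction errors, delayed one sample
    ks = []
    for _ in range(order):
        den = sum(x * x for x in f) + sum(y * y for y in b)
        num = -2.0 * sum(x * y for x, y in zip(f, b))
        k = num / den if den > 0.0 else 0.0
        ks.append(k)
        # Lattice update of both error sequences, then realign them.
        f_next = [x + k * y for x, y in zip(f, b)]
        b_next = [y + k * x for x, y in zip(f, b)]
        f, b = f_next[1:], b_next[:-1]
    return ks


def long_term_average(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def speaker_feature(voiced_frames, order):
    """Long-term average reflection vector over frames assumed to be voiced."""
    return long_term_average([burg_reflection(fr, order) for fr in voiced_frames])


def sq_dist(u, v):
    """Squared Euclidean distance (an assumed difference measure)."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def avg_within(features):
    """Mean pairwise difference among one speaker's feature vectors (needs >= 2)."""
    n = len(features)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    return sum(sq_dist(features[i], features[j]) for i, j in pairs) / len(pairs)


def avg_between(features_a, features_b):
    """Mean difference between the feature vectors of two speakers."""
    total = sum(sq_dist(u, v) for u in features_a for v in features_b)
    return total / (len(features_a) * len(features_b))
```

Under these assumptions, a between-to-within ratio such as `avg_between(a, b) / avg_within(a)` well above 1 would suggest that the averaged reflection vectors separate speakers `a` and `b`.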

Authors: Yu Zhenli (俞振利), Zhang Lihe (张礼和)

Affiliation: Department of Electronic Engineering, Hangzhou University

Source: Journal of Hangzhou University (Natural Science Edition) 《杭州大学学报（自然科学版）》, CSCD, 1992, No. 4, pp. 390-397 (8 pages)

Keywords: speech; speaker; feature extraction; display; real-time processing

Classification: TP391.42 [Automation and Computer Technology: Computer Application Technology]