Chinese Tone Recognition Based on SPWD Time-Frequency Ridge Feature Extraction (cited by: 3)
Abstract: Because speech signals are non-stationary, the smoothed pseudo Wigner-Ville distribution (SPWD) is used to represent the syllable-final (vowel) speech signal clearly on the time-frequency plane. The time-frequency ridges of different tones vary in different ways. Thresholding and thinning convert the SPWD time-frequency matrix into a binary matrix image, and the Hough transform extracts the ridge lines from it. Because the time-frequency ridge of the third tone is a curve, the line segments found by the Hough transform are fitted with a least-squares polynomial. Several points are selected at equal intervals along the ridge segment, and the point set together with its first-order difference serves as the time-frequency ridge feature, which a Gaussian mixture model (GMM) then classifies. Simulation results show that the method recognizes tones well, with an average recognition rate of 86.48%. The second tone shows the largest improvement in recognition rate, 5.18%, and across different signal-to-noise ratios (SNR) the recognition rate improves by up to 5.62%.
Source: Computer Applications and Software (《计算机应用与软件》), CSCD, Peking University Core Journal, 2014, No. 3, pp. 142-145 (4 pages)
Funding: National Natural Science Foundation of China (61075008)
Keywords: tone recognition; smoothed pseudo Wigner-Ville distribution; time-frequency ridge; Hough transform; least-squares polynomial fitting
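
The feature-construction step described in the abstract (least-squares polynomial fit of the ridge, equally spaced sampling, first-order difference) might look roughly like the sketch below. It is an illustration under stated assumptions, not the authors' implementation: the function name `ridge_features`, the polynomial degree, the number of sample points, and the choice to space samples equally along the time axis are all hypothetical, and the input is a synthetic contour rather than a ridge extracted from a real SPWD.

```python
import numpy as np


def ridge_features(times, freqs, degree=2, n_points=8):
    """Fit a least-squares polynomial to ridge points and sample it.

    times, freqs: coordinates of ridge points (for example, points taken
    from line segments returned by a Hough transform).
    degree, n_points: illustrative defaults, not values from the paper.
    """
    times = np.asarray(times, dtype=float)
    freqs = np.asarray(freqs, dtype=float)
    # Least-squares polynomial fit of frequency as a function of time.
    coeffs = np.polyfit(times, freqs, degree)
    # Assumption: "equal intervals" means equal spacing along the time axis.
    sample_t = np.linspace(times.min(), times.max(), n_points)
    sample_f = np.polyval(coeffs, sample_t)
    # First-order difference of the sampled ridge frequencies.
    delta_f = np.diff(sample_f)
    # Feature vector: sampled points followed by their first differences.
    return np.concatenate([sample_f, delta_f])


# Synthetic falling-then-rising contour, loosely shaped like a third tone.
t = np.linspace(0.0, 1.0, 50)
f = 200.0 + 120.0 * (t - 0.4) ** 2
features = ridge_features(t, f)
print(features.shape)  # (15,): 8 sampled points plus 7 differences
```

A vector of this kind, computed per syllable, is what a per-tone Gaussian mixture model would then score; the classifier itself is omitted here because the abstract gives no details of its configuration.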