Abstract
A spectrogram is an image that displays time-varying spectral magnitude. A correspondence between spectrograms is established based on the Gradient Orientation Histogram (GOH) in order to find their corresponding frequency structures, which provides a route to speaker normalization and further speech processing. Before feature parameters are extracted, GOH is used to describe the local features of points in the spectrogram, and a non-linear mapping along the frequency axis is then established between the spectrograms of two speakers. In essence, this is an optimal matching problem solved by dynamic programming, given a similarity measure between frequency bins. Experiments on the TIDIGITS corpus show that the method markedly reduces the system's error rate when the training and test sets are mismatched.
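The two steps named in the abstract (describing each frequency bin by a gradient orientation histogram, then matching the bins of two spectrograms by dynamic programming) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the descriptor layout, the similarity measure, or the path constraints. The function names `goh_descriptors` and `align_frequency_axes`, the 8 orientation bins, the squared Euclidean distance, and the DTW-style step pattern are all assumptions made for illustration.

```python
import math

def goh_descriptors(spec, n_orient=8):
    """One gradient orientation histogram per frequency bin.

    spec[f][t] is the spectral magnitude at frequency bin f, frame t.
    Pooling over the whole time axis and using 8 orientation bins are
    assumptions; the paper's exact descriptor is not given in the abstract.
    """
    n_freq, n_time = len(spec), len(spec[0])
    descs = []
    for f in range(n_freq):
        hist = [0.0] * n_orient
        for t in range(n_time):
            # Central differences, clamped at the image borders.
            df = spec[min(f + 1, n_freq - 1)][t] - spec[max(f - 1, 0)][t]
            dt = spec[f][min(t + 1, n_time - 1)] - spec[f][max(t - 1, 0)]
            mag = math.hypot(df, dt)
            if mag == 0.0:
                continue
            angle = math.atan2(df, dt) % (2 * math.pi)
            b = int(angle / (2 * math.pi) * n_orient) % n_orient
            hist[b] += mag  # magnitude-weighted vote
        total = sum(hist)
        descs.append([h / total for h in hist] if total > 0 else hist)
    return descs

def align_frequency_axes(desc_a, desc_b):
    """Monotonic frequency-bin alignment by dynamic programming.

    Returns a list of (i, j) pairs from (0, 0) to (n-1, m-1). The squared
    Euclidean cost and the three-way step pattern are assumptions.
    """
    n, m = len(desc_a), len(desc_b)

    def cost(i, j):
        return sum((x - y) ** 2 for x, y in zip(desc_a[i], desc_b[j]))

    inf = float("inf")
    acc = [[inf] * m for _ in range(n)]
    back = [[None] * m for _ in range(n)]
    acc[0][0] = cost(0, 0)
    for i in range(n):
        for j in range(m):
            if i == 0 and j == 0:
                continue
            best, prev = inf, None
            # Diagonal step is tried first, so it wins ties.
            for pi, pj in ((i - 1, j - 1), (i - 1, j), (i, j - 1)):
                if pi >= 0 and pj >= 0 and acc[pi][pj] < best:
                    best, prev = acc[pi][pj], (pi, pj)
            acc[i][j] = best + cost(i, j)
            back[i][j] = prev

    # Trace the optimal path back from the top-right corner.
    path, node = [], (n - 1, m - 1)
    while node is not None:
        path.append(node)
        node = back[node[0]][node[1]]
    path.reverse()
    return path
```

Calling `align_frequency_axes(goh_descriptors(spec_a), goh_descriptors(spec_b))` yields a warping path that pairs each frequency bin of one speaker with a bin of the other. In a speaker-normalization setting, such a path would be used to warp one spectrogram's frequency axis before feature extraction.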
Source
Computer Engineering and Applications (《计算机工程与应用》)
CSCD
Peking University Core Journal
2011, Issue 18, pp. 146-148 (3 pages)
Funding
National Basic Research Program of China (973 Program) (No. 2009CB326203)
Keywords
Gradient Orientation Histogram (GOH)
spectrogram correspondence
speaker normalization
dynamic programming