Abstract
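Spectrograms are widely used in speech analysis. Automatic phoneme segmentation is a basic stage of speech recognition: it divides a spoken sentence into segments according to phonemic features. This paper presents a method for automatic phoneme segmentation that uses two deformation functions representing density changes in the spectrogram, together with an adaptive threshold technique, to locate the edges of each phoneme segment. The method was implemented on a computer. On a set of test data taken from a spectrogram database, we segmented phonemes with the proposed method and compared the results with a segmentation made by a phonetician; the recognition rate was above 93%. The method is already in use in a speech analysis system as part of a speech recognition system.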
A spectrogram is a grey-scale image that represents the energy changes of a speech signal. Automatic segmentation is an initial phase in the acoustic-phonetic analysis of automatic speech recognition based on spectrograms. Speech segmentation can be defined as the process of dividing the spectrogram into a sequence of segments, each of which exhibits phonemic characteristics. This paper presents a method of automatic segmentation using image processing techniques. We describe two special functions that indicate the intensity changes of the spectrogram. Together with these two functions, we use adaptive threshold techniques to locate the edges of each segment. The threshold is calculated from an optimum relation equation defined using interpolating linear multiple regression. After the preliminary segmentation, a checking procedure is applied to verify the segmentation results. The algorithm was evaluated by comparing the automatic segmentation result with a segmentation carried out by a phonetic expert. This automatic segmentation facility is part of an automatic feature extraction program applied in a speech analysis system.
Source
《杭州大学学报(自然科学版)》
CSCD
1995, No. 1, pp. 42-46 (5 pages)
Journal of Hangzhou University (Natural Science Edition)
Keywords
spectrogram
phoneme
speech recognition
automatic segmentation
speech segmentation