Abstract
Automatic detection of hypernasality grades in cleft palate speech can provide an effective, objective, and non-invasive aid for the clinical assessment of velopharyngeal function. This work studies an automatic classification system for hypernasality grades in cleft palate speech. An auditory model is used as the front end to extract the internal auditory representation of the speech signal, and Soft-Limited Ratio (SLR) spectral features extracted with a synchronous detector serve as the acoustic feature parameters. A one-versus-one Support Vector Machine (1-v-1 SVM) then classifies the speech into four hypernasality grades: normal, mild, moderate, and severe. The experiments use 3,086 speech samples from 56 children and compare how recognition performance is affected by the type and number of basilar-membrane filters and by the choice between the synchronous detector and the lateral inhibition network. The results show that the Gammatone filter bank based on the Equivalent Rectangular Bandwidth (ERB) scale outperforms the wavelet-packet filter bank based on the Bark scale; that a filter bank with 54 channels gives an effective trade-off between computation time and recognition accuracy; and that SLR spectral features extracted with the synchronous detector are recognized more accurately than the LIN spectral features extracted with the Lateral Inhibition Network (LIN). The highest classification accuracy of the four-grade hypernasality detection system is 91.50%.
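As a rough illustration of the ERB-scale channel layout mentioned in the abstract, the sketch below spaces 54 center frequencies uniformly on the ERB-rate scale using the widely cited Glasberg and Moore approximation. The frequency range (50 Hz to 8000 Hz) and all function names are assumptions made for this example; the abstract does not state the range or the exact ERB formula the authors used.

```python
import math


def hz_to_erb_rate(f_hz):
    """Map frequency in Hz to the ERB-rate scale (Glasberg & Moore approximation)."""
    return 21.4 * math.log10(1.0 + 0.00437 * f_hz)


def erb_rate_to_hz(erb_rate):
    """Inverse of hz_to_erb_rate: map an ERB-rate value back to Hz."""
    return (10.0 ** (erb_rate / 21.4) - 1.0) / 0.00437


def erb_bandwidth(f_hz):
    """Equivalent rectangular bandwidth in Hz at center frequency f_hz."""
    return 24.7 * (4.37 * f_hz / 1000.0 + 1.0)


def erb_center_frequencies(n_channels=54, f_low=50.0, f_high=8000.0):
    """Center frequencies spaced uniformly on the ERB-rate scale.

    n_channels=54 follows the abstract; f_low and f_high are assumed values.
    """
    if n_channels < 2:
        raise ValueError("n_channels must be at least 2")
    e_low = hz_to_erb_rate(f_low)
    e_high = hz_to_erb_rate(f_high)
    step = (e_high - e_low) / (n_channels - 1)
    return [erb_rate_to_hz(e_low + i * step) for i in range(n_channels)]


centers = erb_center_frequencies()
for fc in (centers[0], centers[len(centers) // 2], centers[-1]):
    print(f"fc = {fc:8.1f} Hz, ERB = {erb_bandwidth(fc):7.1f} Hz")
```

Each center frequency and its bandwidth would parameterize one Gammatone channel. Because the spacing is uniform in ERB-rate rather than in Hz, channels are packed more densely at low frequencies, which loosely mirrors the frequency resolution of the basilar membrane.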
Authors
FU Fangling (付方玲)
HE Fei (何飞)
FU Jia (付佳)
YIN Heng (尹恒)
HUANG Hua (黄华)
HE Ling (何凌)
College of Electrical Engineering and Information Technology, Sichuan University, Chengdu 610065, China; West China Hospital of Stomatology, Sichuan University, Chengdu 610041, China
Source
Computer Engineering and Applications (《计算机工程与应用》)
CSCD
PKU Core (北大核心)
2019, No. 10, pp. 127-134 (8 pages)
Funding
National Natural Science Foundation of China, Young Scientists Fund (No. 61503264)
Keywords
cleft palate speech
hypernasality
auditory model
synchronous detector