Abstract
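Objective To evaluate two algorithmic models for recognizing hypernasality in patients with cleft palate: a cascaded channel (vocal tract) model built from the characteristic formant parameters of hypernasal cleft palate speech, and wavelet packet transform combined with linear prediction coefficients (LPC). Methods A total of 859 patients with cleft palate (421 male, 438 female; mean age 12.1 years) who attended the speech therapy clinic of West China Hospital of Stomatology, Sichuan University, between October 2015 and December 2018 were enrolled. Of these, 216 had normal speech, 220 mild hypernasality, 213 moderate hypernasality, and 210 severe hypernasality. A total of 62,707 speech samples, comprising phrases and short sentences, were collected with a Mandarin Chinese test tool. Formant parameters were extracted with the cascaded channel model and with wavelet packet transform combined with LPC, and a K-nearest neighbor classifier was used to determine whether hypernasality was present and, if so, its grade. The classification results of the two models were compared with the gold standard of manual speech evaluation, and accuracy was analyzed with the chi-square test. Results The accuracy in detecting the presence of hypernasality was 80.56% (692/859) for the cascaded channel model and 89.99% (773/859) for wavelet packet transform combined with LPC; the overall accuracy in grading hypernasality was 72.29% (621/859) and 88.13% (757/859), respectively. Both differences were statistically significant (P<0.05). For every hypernasality grade, wavelet packet transform combined with LPC was more accurate than the cascaded channel model, and the differences were statistically significant (P<0.05). Among the misclassification types, the most serious error for both methods was classifying normal speech as mild hypernasality, at 18.98% (41/216) for wavelet packet transform combined with LPC and 14.81% (32/216) for the cascaded channel model; all other error rates of the former were below 5%, better than those of the latter. Conclusions Compared with the cascaded channel model, wavelet packet transform combined with LPC is more accurate in determining both the presence and the grade of hypernasality in patients with cleft palate, and can assist speech therapists in evaluating the speech of these patients.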
Objective To investigate the efficacy of 2 algorithm models, the cascaded channel model and combined wavelet with linear prediction coefficient (LPC), in extracting the formant parameters of hypernasal cleft palate speech. Methods The voices of 859 patients (421 male and 438 female, with an average age of 12.1 years) were collected from the speech data of the Department of Cleft Lip and Palate Surgery of West China Hospital of Stomatology of Sichuan University. The patients were classified into 216 normal speech patients, 220 low-level hypernasal patients, 213 moderate-level hypernasal patients and 210 high-level hypernasal patients. A total of 62,707 speech samples were collected. The cascaded channel model and combined wavelet with LPC were each combined with the K-nearest neighbor classifier to distinguish the hypernasal level, and the results were compared with the gold standard, i.e. the speech evaluation result. The results were analyzed statistically with the chi-square test. Results Compared to the cascaded channel model, combined wavelet with LPC achieved significantly higher accuracy at all hypernasal levels (P<0.05). Among all misclassifications, the most common error of the 2 models was misjudging normal speech patients as low-level hypernasal patients (for combined wavelet with LPC: 41/216, 18.98%; for the cascaded channel model: 32/216, 14.81%). Conclusions Two algorithm models based on formant parameters were established for hypernasality recognition in cleft palate speech. Both realized automatic identification of the hypernasal level in Mandarin Chinese. The average classification accuracy of hypernasal level evaluation was higher with combined wavelet with LPC.
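The accuracy comparison reported in the abstract can be rechecked from the published counts. The following is a minimal sketch, assuming a Pearson chi-square test on a 2x2 table without continuity correction and treating the two sets of classifications as independent samples; the abstract does not state which variant the authors used, and the function name `chi_square_2x2` is hypothetical.

```python
import math


def chi_square_2x2(correct_a, correct_b, n):
    """Pearson chi-square (no continuity correction) comparing two
    proportions, each measured on n cases. Returns (statistic, p_value)."""
    table = [[correct_a, n - correct_a],
             [correct_b, n - correct_b]]
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][j] + table[1][j] for j in range(2)]
    total = sum(row_totals)

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected

    # For 1 degree of freedom the chi-square survival function
    # reduces to erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


N = 859  # patients, as reported in the abstract

# (cascaded channel model, wavelet packet + LPC) correct counts from the abstract
reported = {
    "presence of hypernasality": (692, 773),
    "hypernasality grade": (621, 757),
}

for task, (cascade, wavelet) in reported.items():
    chi2, p = chi_square_2x2(cascade, wavelet, N)
    print(f"{task}: {100 * cascade / N:.2f}% vs {100 * wavelet / N:.2f}%, "
          f"chi2 = {chi2:.2f}, P < 0.05: {p < 0.05}")
```

Because both models were applied to the same 859 patients, a paired test such as McNemar's might be the more natural choice, but it needs the per-patient agreement counts, which the abstract does not give.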
Authors
毛渤淳
马平川
郭春丽
何凌
梅宏翔
尹恒
Mao Bochun; Ma Pingchuan; Guo Chunli; He Ling; Mei Hongxiang; Yin Heng (State Key Laboratory of Oral Diseases & National Clinical Research Center for Oral Diseases, Sichuan University, Chengdu 610041, China; Department of Cleft Lip and Palate Surgery, West China Hospital of Stomatology, Sichuan University, Chengdu 610041, China; School of Electrical Engineering and Information, Sichuan University, Chengdu 610065, China; Department of Orthodontics, Peking University School and Hospital of Stomatology, Beijing 100081, China)
Source
《中华整形外科杂志》
CAS
CSCD
Peking University Core Journals
2020, No. 11, pp. 1246-1252 (7 pages)
Chinese Journal of Plastic Surgery
Funding
National Natural Science Foundation of China, Young Scientists Fund (61503264).
Keywords
腭咽闭合不全
语音
信号处理
计算机辅助
共振峰
声学
Velopharyngeal insufficiency
Voice
Signal processing, computer-assisted
Formant
Acoustics