Abstract
Existing lip motion and voice consistency judgment methods mostly analyze an entire sentence (or passage) without screening the analyzed content, which makes them computationally cumbersome and leaves their results vulnerable to weakly correlated segments such as silence. This work focuses on vowel (final) pronunciation units, which exhibit pronounced lip shape changes. By analyzing the audio-lip correlation of each vowel category obtained by clustering lip sequence features, more representative specific vowel units are selected as the analysis objects. Combined with position delay analysis, a consistency judgment method based on the analysis of specific vowel pronunciation events is proposed. The method first segments and identifies the specific vowel units. It then computes the audio-lip correlation of each of these vowel pronunciation events and analyzes the delay distribution at the positions where the specific vowels occur. Finally, it fuses the audio-lip correlation score of the specific vowel events with the position delay analysis score to make the consistency judgment. Experimental comparison with other methods shows that the proposed algorithm outperforms several whole-sentence analysis algorithms in recognition performance while also reducing the amount of computation.
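The three stages named in the abstract (per-event correlation, delay analysis, score fusion) can be illustrated with a minimal sketch. The abstract does not specify the correlation measure, the delay statistic, or the fusion rule, so everything below is an assumption made for illustration: Pearson correlation stands in for the audio-lip correlation, the delay score is the fraction of events whose best-matching lag is small, and fusion is a weighted sum. The function names and the parameters `max_lag`, `lag_tolerance`, and `weight` are hypothetical, and vowel segmentation is assumed to have been done already.

```python
import numpy as np

def event_correlation(audio_feat, lip_feat):
    """Absolute Pearson correlation of two equal-length 1-D feature tracks.
    (Stand-in measure; the paper's actual correlation measure is not given here.)"""
    a = np.asarray(audio_feat, dtype=float)
    v = np.asarray(lip_feat, dtype=float)
    if a.ndim != 1 or a.shape != v.shape or a.size < 2:
        raise ValueError("tracks must be 1-D, equal length, with >= 2 frames")
    if a.std() == 0 or v.std() == 0:
        return 0.0  # a constant track carries no correlation information
    return float(abs(np.corrcoef(a, v)[0, 1]))

def estimate_lag(audio_feat, lip_feat, max_lag):
    """Return (lag, correlation) for the lag in [-max_lag, max_lag] that
    maximizes the correlation. lag > 0 means the lip track trails the audio."""
    a = np.asarray(audio_feat, dtype=float)
    v = np.asarray(lip_feat, dtype=float)
    best = (0, -1.0)
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            a_s, v_s = a[:len(a) - lag], v[lag:]
        else:
            a_s, v_s = a[-lag:], v[:len(v) + lag]
        if a_s.size < 2:
            continue  # too few overlapping frames at this lag
        c = event_correlation(a_s, v_s)
        if c > best[1]:
            best = (lag, c)
    return best

def consistency_score(events, max_lag=5, lag_tolerance=3, weight=0.5):
    """Fuse the mean per-event correlation with a simple delay score.
    `events` is a list of (audio_track, lip_track) pairs, one per vowel event."""
    results = [estimate_lag(a, v, max_lag) for a, v in events]
    corr_score = float(np.mean([c for _, c in results]))
    # Hypothetical delay score: share of events with a plausibly small lag.
    delay_score = float(np.mean([abs(lag) <= lag_tolerance for lag, _ in results]))
    return weight * corr_score + (1.0 - weight) * delay_score
```

A decision would then compare `consistency_score(events)` against a threshold tuned on held-out data; restricting `events` to the selected vowel units, rather than scoring every frame of the sentence, is what keeps the amount of computation down in this sketch.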
Authors
朱铮宇
邱华愉
杨春玲
王泳
ZHU Zhengyu; QIU Huayu; YANG Chunling; WANG Yong (School of Electronic and Information Engineering, South China University of Technology, Guangzhou 510640, Guangdong, China; School of Electronics and Information, Guangdong Polytechnic Normal University, Guangzhou 510665, Guangdong, China)
Source
《华南理工大学学报(自然科学版)》
EI
CAS
CSCD
PKU Core
2020, No. 1, pp. 139-146 (8 pages)
Journal of South China University of Technology(Natural Science Edition)
Funding
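National Natural Science Foundation of China (61672173)
Young Innovative Talents Project of Guangdong Ordinary Universities (2018KQNCX140)
Characteristic Innovation Project of Guangdong Ordinary Universities (2015KTSCX083)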
Keywords
lip motion and voice consistency judgment method
vowel pronunciation events
lip motion and voice correlation
vowel segmentation