Abstract
Voiceprint recognition is a biometric technique that identifies a speaker from the characteristics of their voice. Compared with face, fingerprint, and iris recognition, voiceprint data are easier to acquire, are not constrained by time or location, and are cheaper to collect, and the public is less resistant to having their voice recorded. The technique is already used in security, criminal investigation, finance, and other fields. The key to a voiceprint recognition algorithm is how it describes the voiceprint features of a given speaker: good features should preserve as much of the speaker's vocal characteristics as possible while remaining robust to noise, speaking rate, volume, and spoken content. For text-independent voiceprint recognition with limited speech data, this study applies frequent sequence mining to sequences of Mel-scale Frequency Cepstral Coefficients (MFCC) extracted from the speech signal, takes the mined frequent sequences as the speaker's speech features, and then applies a PLDA discriminant method. The results show that the model recognizes speakers well when little speech data is available.
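The abstract does not say which mining algorithm, discretization scheme, or support threshold the authors used, so the following is only a minimal sketch of the general idea, not the paper's method. It assumes each MFCC frame has already been quantized to an integer codeword ID (for example with a clustering codebook), and it mines contiguous patterns level by level with Apriori-style pruning. The names `mine_frequent_sequences`, `to_feature_vector`, `min_support`, and `max_len` are hypothetical, and the PLDA scoring stage is not shown.

```python
from collections import Counter
from typing import Dict, List, Sequence, Tuple

Pattern = Tuple[int, ...]


def mine_frequent_sequences(
    utterances: Sequence[Sequence[int]],
    min_support: int,
    max_len: int = 4,
) -> Dict[Pattern, int]:
    """Return contiguous symbol patterns found in at least `min_support` utterances.

    Assumption: each utterance is a list of integer codeword IDs obtained by
    quantizing its MFCC frames; the quantization step is outside this sketch.
    """
    frequent: Dict[Pattern, int] = {}
    prev_level = None  # frequent patterns of length n - 1
    for n in range(1, max_len + 1):
        support: Counter = Counter()
        for utt in utterances:
            seen = set()
            for i in range(len(utt) - n + 1):
                cand = tuple(utt[i:i + n])
                # Apriori pruning: both (n - 1)-length sub-patterns must be frequent.
                if prev_level is not None and (
                    cand[:-1] not in prev_level or cand[1:] not in prev_level
                ):
                    continue
                seen.add(cand)
            support.update(seen)  # count each pattern once per utterance
        level = {p: c for p, c in support.items() if c >= min_support}
        if not level:
            break
        frequent.update(level)
        prev_level = level
    return frequent


def to_feature_vector(utt: Sequence[int], patterns: List[Pattern]) -> List[int]:
    """Count how often each mined pattern occurs in one utterance."""
    vec = []
    for p in patterns:
        n = len(p)
        vec.append(sum(1 for i in range(len(utt) - n + 1) if tuple(utt[i:i + n]) == p))
    return vec


# Toy codeword sequences standing in for three quantized utterances.
toy_utterances = [[3, 1, 4, 1, 5], [1, 4, 1, 5, 9], [2, 1, 4, 7, 1]]
toy_patterns = mine_frequent_sequences(toy_utterances, min_support=2)
```

A count vector from `to_feature_vector` over the mined patterns is one possible fixed-length representation that could be handed to a downstream scorer; whether the paper builds its features this way is not stated in the abstract.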
Authors
王健
申炜涛
耿皓松
张艳
Wang Jian; Shen Weitao; Geng Haosong; Zhang Yan (School of Computer Science and Engineering, North China Institute of Aerospace Engineering, Langfang 065000, China)
Source
《北华航天工业学院学报》
CAS
2022, No. 1, pp. 10-12 (3 pages)
Journal of North China Institute of Aerospace Engineering
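Funding
Youth Fund of North China Institute of Aerospace Engineering (KY-2020-21)
Science and Technology Research Project of Higher Education Institutions of Hebei Province (ZC2021006)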
Keywords
speaker identification
voiceprint recognition
sequence mining