2 articles found
Physiological-physical feature fusion for automatic voice spoofing detection
1
Authors: Junxiao XUE, Hao ZHOU. Frontiers of Computer Science (SCIE, EI, CSCD), 2023, Issue 2, pp. 157-166 (10 pages)
Biometric speech recognition systems are often subject to various spoofing attacks, the most common of which are speech synthesis and speech conversion attacks. These attacks can cause the system to incorrectly accept spoofed speech, which compromises its security. Researchers have made many efforts to address this problem, and existing studies have used the physical features of speech to identify spoofing attacks. However, recent studies have shown that speech contains a large number of physiological features related to the human face. For example, we can determine the speaker's gender, age, mouth shape, and other information from the voice. Inspired by this research, we propose a spoofing attack recognition method based on physiological-physical feature fusion. This method involves feature extraction, a densely connected convolutional neural network with squeeze and excitation blocks (SE-DenseNet), and feature fusion strategies. We first extract physiological features from the audio with a pretrained convolutional network. Then we use SE-DenseNet to extract physical features. Such a dense connection pattern has high parameter efficiency, and squeeze and excitation blocks can enhance the transmission of features. Finally, we integrate the two features into the classification network to identify spoofing attacks. Experimental results on the ASVspoof 2019 data set show that our model is effective for voice spoofing detection. In the logical access scenario, our model improves the tandem decision cost function and equal error rate scores by 5% and 7%, respectively, compared to existing methods.
Keywords: spoofing attacks; SE-DenseNet; physiological feature
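The abstract above mentions squeeze and excitation blocks without showing how they operate. The following is a minimal sketch of the generic squeeze-and-excitation idea (global average pooling, a small bottleneck, and a per-channel sigmoid gate), not the authors' SE-DenseNet. The function name `se_block`, the weight matrices `w1` and `w2`, and the omission of biases are all assumptions made for illustration.

```python
import math


def se_block(feature_maps, w1, w2):
    """Reweight the channels of `feature_maps` (hypothetical helper).

    feature_maps: list of C channels, each a flat list of H*W values.
    w1: bottleneck weights, shape (C_reduced, C).
    w2: expansion weights, shape (C, C_reduced).
    Biases are omitted to keep the sketch short.
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    squeezed = [sum(ch) / len(ch) for ch in feature_maps]
    # Excitation, step 1: bottleneck fully connected layer followed by ReLU.
    hidden = [max(0.0, sum(w * z for w, z in zip(row, squeezed))) for row in w1]
    # Excitation, step 2: expand back to C channels and apply a sigmoid gate.
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]
    # Scale: multiply every value in a channel by that channel's gate.
    scaled = [[g * v for v in ch] for g, ch in zip(gates, feature_maps)]
    return scaled, gates
```

In a trained network the weights would be learned, so informative channels would receive gates close to 1 and less useful ones gates close to 0.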
Fine-grained sequence-to-sequence lip reading based on self-attention and self-distillation
2
Authors: Junxiao XUE, Shibo HUANG, Huawei SONG, Lei SHI. Frontiers of Computer Science (SCIE, EI, CSCD), 2023, Issue 6, pp. 151-153 (3 pages)
1 Introduction
Lip reading involves converting an image sequence into the corresponding text sequence. Currently, lip reading has significant applications in many fields, such as assisted speech recognition and helping the speech impaired. Lip reading belongs to fine-grained video analysis and requires both the local information and the overall spatial information of the sequence. Most existing approaches capture local spatial information with a CNN and temporal information with an RNN. Considering these general methods, we propose a fine-grained method based on self-attention and self-distillation. The whole model mainly includes the CNN front-end, pixel-wise learning, temporal learning, and a decoder. Specifically, we apply the CNN front-end to capture shallow spatial features inside the image sequence, and employ the Resformer module, which includes self-attention, to learn the global spatial correlation between pixels, namely, pixel-wise learning.
Keywords: DISTILLATION; IMPAIRED; apply
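The excerpt above describes using self-attention to learn global correlations between pixels. As a rough illustration only, the sketch below implements plain scaled dot-product self-attention over a list of pixel feature vectors. It is not the paper's Resformer module: the function name `self_attention` is hypothetical, and the queries, keys, and values are assumed to be the inputs themselves, with no learned projections.

```python
import math


def self_attention(pixels):
    """Scaled dot-product self-attention over pixel feature vectors.

    pixels: list of N vectors, each a list of `dim` floats.
    Assumption: queries, keys, and values are the inputs themselves.
    """
    dim = len(pixels[0])
    outputs = []
    for query in pixels:
        # Similarity of this pixel to every pixel, scaled by sqrt(dim).
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in pixels
        ]
        # Softmax, shifted by the maximum score for numerical stability.
        peak = max(scores)
        exps = [math.exp(s - peak) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Each output is the attention-weighted average of all pixel vectors.
        outputs.append(
            [sum(w * v[d] for w, v in zip(weights, pixels)) for d in range(dim)]
        )
    return outputs
```

Because every pixel attends to every other pixel, each output vector can mix information from the whole frame, which is the "global spatial correlation" the excerpt refers to.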