Funding: This research was funded by the European Regional Development Fund in the Research Centre of Advanced Mechatronic Systems project, project number CZ.02.1.01/0.0/0.0/16_019/0000867, and by the Ministry of Education of the Czech Republic, Project No. SP2021/32.
Abstract: This pilot study focuses on the use of a hybrid LMS-ICA system for in-vehicle background noise reduction. Modern vehicles increasingly support voice commands, which are one of the pillars of autonomous and SMART vehicles. Robust speaker recognition for context-aware in-vehicle applications is limited to a certain extent by in-vehicle background noise. This article presents a new concept of a hybrid system implemented as a virtual instrument. The highly modular concept of the virtual car, used in combination with real recordings of various driving scenarios, enables effective testing of the investigated methods of in-vehicle background noise reduction. The study also presents a unique concept of an adaptive system that uses intelligent clusters of distributed next-generation 5G data networks, which allows individual vehicles to exchange interference information and/or optimal hybrid algorithm settings. On average, unfiltered voice commands were successfully recognized in 29.34% of all scenarios, while LMS reached up to 71.81%, and the LMS-ICA hybrid improved the performance further to 73.03%.
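The LMS stage named in the abstract above can be illustrated with a minimal sketch of a least-mean-squares adaptive noise canceller. This is not the authors' virtual-instrument implementation: the two-microphone setup, the function name `lms_noise_canceller`, the tap count, the step size `mu`, and the synthetic signals are all assumptions chosen for illustration, and the ICA stage is not shown.

```python
import math
import random


def lms_noise_canceller(primary, reference, num_taps=8, mu=0.02):
    """Subtract an adaptively filtered noise reference from the primary signal.

    primary   : samples containing speech plus noise (hypothetical cabin microphone)
    reference : samples containing correlated noise only (hypothetical second microphone)
    Returns the error signal e[n] = primary[n] - w . x[n], used as the speech estimate.
    """
    weights = [0.0] * num_taps
    history = [0.0] * num_taps  # most recent reference samples, newest first
    cleaned = []
    for d, x in zip(primary, reference):
        history = [x] + history[:-1]
        y = sum(w * h for w, h in zip(weights, history))  # current noise estimate
        e = d - y                                         # speech estimate
        # LMS update: move the weights along the instantaneous gradient estimate
        weights = [w + mu * e * h for w, h in zip(weights, history)]
        cleaned.append(e)
    return cleaned


def mean_square(signal):
    return sum(s * s for s in signal) / len(signal)


# Synthetic demonstration (assumed data, not the recordings used in the study).
random.seed(0)
n = 4000
speech = [0.5 * math.sin(2 * math.pi * 0.01 * i) for i in range(n)]
noise_ref = [random.uniform(-1.0, 1.0) for _ in range(n)]
# The noise reaching the primary microphone is a filtered copy of the reference.
noise_at_mic = [0.8 * noise_ref[i] + (0.3 * noise_ref[i - 1] if i > 0 else 0.0)
                for i in range(n)]
primary = [s + v for s, v in zip(speech, noise_at_mic)]

cleaned = lms_noise_canceller(primary, noise_ref)

# Compare the residual noise power over the last 1000 samples, after adaptation.
residual_before = mean_square([p - s for p, s in zip(primary[-1000:], speech[-1000:])])
residual_after = mean_square([c - s for c, s in zip(cleaned[-1000:], speech[-1000:])])
print(residual_before, residual_after)
```

The step size trades convergence speed against steady-state error; a hybrid system would pass the LMS output on to a separation stage such as ICA.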
Abstract: In this paper, a speech signal recovery algorithm is presented for a personalized automatic voice command recognition system in vehicle and restaurant environments. This novel algorithm is able to separate mixed speech from multiple speakers, detect the presence or absence of speakers by tracking the higher-magnitude portion of the speech power spectrum, and adaptively suppress noise. An automatic speech recognition (ASR) process that handles the multi-speaker task is designed and implemented. Evaluation tests were carried out using the NOIZEUS speech database, and the experimental results show that the proposed algorithm achieves impressive performance improvements.
Abstract: Multimedia overcomes the defects of traditional teaching methods, so foreign language teaching has developed rapidly with such technology, yet a bottleneck still restricts the development of intelligent learning software. To solve the problem, this paper discusses basic knowledge of speech recognition and studies a targeted corpus based on the English pronunciation habits of Chinese speakers. Taking into account the requirements of oral English learners whose native language is Chinese, this paper applies DTW model-based speech recognition technology with Viterbi decoding of speech, and then recognizes and scores pronunciation through posterior probability. Experimental verification shows that the English pronunciation recognition model in this paper is reasonable and credible, and that it can offer learners timely, accurate, and objective evaluation and feedback for correcting pronunciation errors, improving the efficiency of oral English learning.
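As a rough illustration of the DTW (dynamic time warping) alignment the abstract above refers to, the following sketch computes a DTW distance between two one-dimensional sequences. It is a textbook formulation under assumptions, not the paper's model: real systems compare frame-level feature vectors rather than scalars, and the function name `dtw_distance` and the absolute-difference local cost are illustrative choices. The posterior-probability scoring step is not shown.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping distance between two numeric sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(cost[i - 1][j],      # a advances, b repeats
                                     cost[i][j - 1],      # b advances, a repeats
                                     cost[i - 1][j - 1])  # both advance
    return cost[n][m]


# A slower rendition of the same contour aligns at zero cost.
print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))  # 0.0
print(dtw_distance([0, 0], [1, 1]))           # 2.0
```

Because the warping path may repeat elements, a learner's utterance spoken more slowly than the reference can still align closely with it.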
Abstract: This paper describes a method for reducing sudden noise using noise detection, noise classification, and noise power estimation. Sudden noise detection and classification were dealt with in our previous study. In this paper, GMM-based noise reduction is performed using the detection and classification results. Classification determines the kind of noise being dealt with, but its power remains unknown. This problem is solved by combining an estimate of the noise power with the noise reduction method. In our experiments, the proposed method achieved good performance in recognizing utterances overlapped by sudden noises.
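Abstract: Objective: To investigate the relationship between signal-to-noise ratio loss (SNR loss) in noise and electrocochleography (ECochG) in noise-exposed individuals, and its value as an auxiliary diagnostic for hidden hearing loss. Methods: Forty-one workers (41 ears) with a history of noise exposure underwent pure-tone audiometry, acoustic immittance testing, speech recognition in noise, and electrocochleography. Based on speech recognition ability in noise, they were divided into two groups, group A with SNR loss < 0 (19 ears) and group B with SNR loss ≥ 0 (22 ears), and the differences in electrocochleography between the two groups were analyzed. Results: The speech-in-noise recognition test showed a statistically significant difference in SNR loss between groups A and B (P<0.05). Electrocochleography showed that at the three stimulus intensities of 96, 90, and 80 dB nHL, the action potential (AP) amplitude of group A was larger than that of group B, a statistically significant difference (P<0.05); at the five stimulus intensities of 96, 90, 80, 70, and 60 dB nHL, the summating potential (SP) amplitude of group B was larger than that of group A, a highly significant difference (P<0.001); and at the four stimulus intensities of 96, 90, 80, and 70 dB nHL, the SP/AP amplitude ratio of group B was larger than that of group A, a statistically significant difference (P<0.05). Conclusion: The SP/AP amplitude ratio of the electrocochleogram differs significantly at different sound intensities between noise-exposed individuals with SNR loss < 0 and those with SNR loss ≥ 0.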
Funding: This work was supported by the General Project of Philosophy and Social Science Research in Colleges and Universities in Jiangsu Province (2022SJYB0712), the Research Development Fund for Young Teachers of Chengxian College of Southeast University (z0037), and the Special Project of Ideological and Political Education Reform and Research Course (yjgsz2206).
Abstract: This study aims to reduce the interference of ambient noise in mobile communication, improve the accuracy and authenticity of information transmitted by sound, and guarantee the accuracy of voice information delivered by mobile communication. First, the principles and techniques of speech enhancement are analyzed, and a fast lateral recursive least square method (FLRLS method) is adopted to process sound data. Then, a convolutional neural network (CNN)-based noise recognition algorithm (NR-CNN) and a speech enhancement model are proposed. Finally, experiments are designed to verify the performance of the proposed algorithm and model. The experimental results show that the noise classification accuracy of the NR-CNN algorithm is higher than 99.82%, and that the recall rate and F1 value are also higher than 99.92. The proposed sound enhancement model can effectively enhance the original sound under noise interference. After the CNN is incorporated, the average perceptual quality evaluation score over all noisy sounds improves by over 21% compared with the traditional noise reduction method. The proposed algorithm adapts to a variety of voice environments and can simultaneously enhance and denoise many different types of voice signals, with better results than traditional sound enhancement models. In addition, the sound distortion index of the proposed speech enhancement model is lower than that of the control group, indicating that adding the CNN makes sound signal distortion less likely across sound environments and gives superior robustness. In summary, the proposed CNN-based speech enhancement model shows significant sound enhancement effects, stable performance, and strong adaptability. This study provides a reference and basis for research applying neural networks in speech enhancement.
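Abstract: To further improve speech recognition in the noisy environment of railway passenger stations, this article proposes ConformerGAN, a Conformer-based speech denoising model. Its training procedure resembles that of a generative adversarial network: the generator uses a Conformer to extract and model speech features, and the discriminator uses a proxy evaluation function to assess perceptual speech quality. To strengthen the model's generalization and its ability to remove unseen noise, noise is superimposed by mixing in randomly cropped segments, and a noise dataset for railway passenger station scenes is constructed. Comparison with related speech denoising models shows that ConformerGAN raises the Perceptual Evaluation of Speech Quality (PESQ) score by 0.19, effectively improving speech recognition accuracy in noisy railway passenger stations and the voice interaction experience of railway passengers.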
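The random-segment noise superposition described in the abstract above could look roughly like the following sketch. The abstract only states that randomly cropped noise segments are mixed in; scaling the segment to a target signal-to-noise ratio, the function name `mix_random_noise_segment`, and the `snr_db` parameter are assumptions added here for illustration.

```python
import math
import random


def mix_random_noise_segment(speech, noise, snr_db, rng):
    """Add a randomly cropped noise segment to speech at an assumed target SNR (dB)."""
    if len(noise) < len(speech):
        raise ValueError("noise recording must be at least as long as the speech")
    # Crop a random window of the noise recording with the same length as the speech.
    start = rng.randrange(len(noise) - len(speech) + 1)
    segment = noise[start:start + len(speech)]
    speech_power = sum(s * s for s in speech) / len(speech)
    noise_power = sum(v * v for v in segment) / len(segment)
    if noise_power == 0.0:
        return list(speech)  # silent crop: nothing to add
    # Choose the gain so that speech_power / (gain^2 * noise_power) = 10^(snr_db / 10).
    gain = math.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))
    return [s + gain * v for s, v in zip(speech, segment)]


# Synthetic stand-ins for a clean utterance and a long station-noise recording.
rng = random.Random(1)
speech = [math.sin(2 * math.pi * 0.02 * i) for i in range(800)]
noise = [rng.uniform(-1.0, 1.0) for _ in range(5000)]
noisy = mix_random_noise_segment(speech, noise, snr_db=5.0, rng=rng)
```

Drawing a fresh crop for every training example exposes the model to many different noise excerpts from the same recording, which is one plausible reading of how the approach supports generalization to unseen noise.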