Funding: Supported by the National Natural Science Foundation of China (62073093), the Initiation Fund for Postdoctoral Research in Heilongjiang Province (LBH-Q19098), and the Natural Science Foundation of Heilongjiang Province (LH2020F017).
Abstract: To address the degradation of traditional localization methods for mixed near-field sources (NFSs) and far-field sources (FFSs) under impulsive noise, a robust localization method is proposed. After the effects of impulsive noise are suppressed by a weighted outlier filter, the directions of arrival (DOAs) of the FFSs are estimated by a multiple signal classification (MUSIC) spectral peak search. Based on the DOA information of the FFSs, the mixed sources are then separated. Finally, the location parameters of the NFSs are estimated by decomposing the steering vectors, which avoids a two-dimensional spectral peak search. The Cramér-Rao bounds (CRBs) for unbiased estimation of DOA and range under impulsive noise are derived. Simulation experiments verify that the proposed method outperforms existing localization methods in probability of successful estimation (PSE) and root mean square error (RMSE). It can be concluded that the proposed method is effective and reliable in environments with low generalized signal-to-noise ratio (GSNR), few snapshots, and strong impulsiveness.
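As a rough illustration of the signal model that mixed-source localization usually builds on, the sketch below forms the steering vector of a symmetric uniform linear array under the commonly used Fresnel (second-order) approximation and shows that it approaches the far-field steering vector as the range grows. The array geometry, the quarter-wavelength spacing, and all function names are assumptions made for illustration; the abstract does not specify them, and this is not the authors' algorithm.

```python
import cmath
import math

def near_field_steering(theta_deg, rng, num_half=3, d=0.25, wavelength=1.0):
    """Fresnel-approximation steering vector for sensors m = -num_half..num_half.

    The phase at sensor m is omega*m + phi*m**2, with
      omega = -2*pi*d*sin(theta)/wavelength
      phi   =  pi*d**2*cos(theta)**2/(wavelength*rng)
    Geometry is hypothetical; d and rng are in units of the wavelength.
    """
    theta = math.radians(theta_deg)
    omega = -2.0 * math.pi * d * math.sin(theta) / wavelength
    phi = math.pi * d**2 * math.cos(theta)**2 / (wavelength * rng)
    return [cmath.exp(1j * (omega * m + phi * m * m))
            for m in range(-num_half, num_half + 1)]

def far_field_steering(theta_deg, num_half=3, d=0.25, wavelength=1.0):
    """Far-field steering vector: the quadratic (range-dependent) term vanishes."""
    theta = math.radians(theta_deg)
    omega = -2.0 * math.pi * d * math.sin(theta) / wavelength
    return [cmath.exp(1j * omega * m) for m in range(-num_half, num_half + 1)]

def max_abs_diff(a, b):
    """Largest element-wise distance between two steering vectors."""
    return max(abs(x - y) for x, y in zip(a, b))

far = far_field_steering(20.0)
near_close = near_field_steering(20.0, rng=2.0)      # source two wavelengths away
near_distant = near_field_steering(20.0, rng=1e6)    # source effectively at infinity

print(max_abs_diff(far, near_close))    # clearly non-zero: range matters
print(max_abs_diff(far, near_distant))  # close to zero: far-field limit
```

Because only the quadratic term depends on range, the angle and range contributions to the steering vector can be handled separately, which is the general idea behind replacing a two-dimensional search with one-dimensional ones.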
Funding: This work is supported by the Nanjing Institute of Technology (NIT) fund for Research Startup Projects of Introduced Talents under Grant No. YKJ202019, the Natural Science Research Project of Higher Education Institutions in Jiangsu Province under Grant No. 21KJB510018, the National Natural Science Foundation of China (NSFC) under Grant No. 62001215, and the NIT fund for Doctoral Research Projects under Grant No. ZKJ2020003.
Abstract: Speech separation is an active research topic that plays an important role in numerous applications, such as speaker recognition, hearing prostheses, and autonomous robots. Many algorithms have been put forward to improve separation performance. However, speech separation in reverberant, noisy environments is still a challenging task. To address this, a novel microphone-array speech separation algorithm using a gated recurrent unit (GRU) network is proposed in this paper. The main aim of the proposed algorithm is to improve separation performance and reduce computational cost. The algorithm extracts the sub-band steered response power-phase transform (SRP-PHAT), weighted by a gammatone filter, as the speech separation feature because of its discriminative and robust spatial position information. Since the GRU network processes time series data with faster training and fewer trainable parameters, the GRU model is adopted to process the separation features of several sequential frames in the same sub-band to estimate the ideal ratio mask (IRM). The algorithm decomposes the mixture signals into time-frequency (TF) units using a gammatone filter bank in the frequency domain, and the target speech is reconstructed in the frequency domain by masking the mixture signal according to the estimated IRM. Because both the decomposition of the mixture signal and the reconstruction of the target signal are completed in the frequency domain, the total computational cost is reduced. Experimental results demonstrate that the proposed algorithm achieves omnidirectional speech separation in noisy and reverberant environments, provides good performance in terms of speech quality and intelligibility, and generalizes to reverberation.
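The masking step can be illustrated with a short sketch. It uses one common definition of the ideal ratio mask, IRM = (S / (S + N))^0.5 per time-frequency unit, where S and N are the target and interference energies; the abstract does not state which variant the paper uses, and the energy values and function names below are made up for illustration. In the paper the mask is estimated by the GRU, whereas here it is computed from known energies.

```python
import math

def ideal_ratio_mask(speech_energy, noise_energy, beta=0.5):
    """Per-unit IRM, (S / (S + N)) ** beta, for energy grids indexed [frame][band]."""
    mask = []
    for s_row, n_row in zip(speech_energy, noise_energy):
        row = []
        for s, n in zip(s_row, n_row):
            total = s + n
            # An empty TF unit carries no target energy, so mask it out.
            row.append((s / total) ** beta if total > 0 else 0.0)
        mask.append(row)
    return mask

def apply_mask(mixture, mask):
    """Scale each TF unit of the mixture magnitude by its mask value."""
    return [[x * m for x, m in zip(x_row, m_row)]
            for x_row, m_row in zip(mixture, mask)]

# Toy energies: 2 frames x 3 sub-bands (illustrative values only)
speech = [[4.0, 1.0, 0.0], [9.0, 0.0, 1.0]]
noise = [[0.0, 1.0, 2.0], [16.0, 0.0, 3.0]]
irm = ideal_ratio_mask(speech, noise)
```

Mask values near 1 keep speech-dominated units and values near 0 suppress noise-dominated ones, so reconstruction amounts to an element-wise product with the mixture.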
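Abstract: Objective: To design a small-size dual-microphone front-end system that addresses the degradation of voice-interaction performance of portable field medical equipment under battlefield noise. Methods: The system implements small-size dual-microphone beamforming based on the least-squares criterion, and thereby front-end speech enhancement. The hardware mainly consists of the two microphones, a signal preprocessing module, an embedded processor, an analog-to-digital converter (ADC), a digital-to-analog converter (DAC), and a power supply module. The two microphones are surface-mount micro-electro-mechanical system (MEMS) microphones; the signal preprocessing module, ADC, and DAC are integrated in the general-purpose audio codec WM8978; the embedded processor is an STM32F405-series processor; and the power supply module uses an LM1117 voltage regulator chip. The system software was compiled and tested with the Keil μVision4 development environment. To verify the performance of the system, a directivity experiment and a speech enhancement experiment were carried out. Results: The directivity experiment showed that, within the 0.5 to 2.0 kHz range, the directivity of the system was consistent across the tested frequency points. The speech enhancement experiment showed that, under three types of non-stationary noise (gunfire, patient monitor alarms, and collisions of medical utensils), the system effectively improved speech quality and recognition rate. Conclusion: The system achieves speech enhancement and can provide effective support for voice interaction with portable field medical equipment.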
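Abstract: Objective: To explore the value of multiple audiological tests in the audiological diagnosis of children in whom no auditory brainstem response (ABR) is elicited at maximum output. Methods: The clinical data of 69 children (138 ears) with no ABR at maximum intensity were retrospectively analyzed. Ages ranged from 42 days to 5 years (mean 1 year 6 months). All tympanograms were type A or showed a single positive peak, acoustic reflexes were absent in all ears, and imaging showed no inner ear malformation. All 69 children underwent ABR, cochlear microphonic (CM), distortion product otoacoustic emission (DPOAE), and auditory steady-state response (ASSR) testing. Results: Among the 69 children (138 ears), CM was recorded in 8 children (16 ears, 11.59%), and DPOAE was also recorded in 10 of these ears (7.25%). Their ASSR response thresholds at 0.5, 1, 2, and 4 kHz were 83.2±13.1, 82.9±13.0, 75.3±12.4, and 63.1±9.1 dB nHL, respectively; combined with other test results, these children were diagnosed with auditory neuropathy. In the remaining 61 children (122 ears), neither CM nor DPOAE was elicited. Their ASSR elicitation rates at 0.5, 1, 2, and 4 kHz were 82.3%, 81.9%, 76.9%, and 60.2%, respectively; ASSR was absent at all frequencies in 20 ears and present at one or more frequencies in 102 ears. Their ASSR response thresholds at 0.5, 1, 2, and 4 kHz were 93.2±6.1, 99.8±7.0, 105.4±5.4, and 108.2±9.8 dB nHL, respectively, and these children were diagnosed with profound sensorineural hearing loss. Conclusion: In children with no ABR at maximum output, the presence of CM and/or DPOAE, together with ASSR thresholds at each frequency lower than those of children with sensorineural hearing loss, supports a diagnosis of auditory neuropathy. The absence of both CM and DPOAE supports a diagnosis of profound sensorineural hearing loss, and ASSR testing helps to assess residual hearing in these children.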
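The ear-level percentages in the results can be reproduced from the reported counts. The short check below assumes that the denominator is the 138 ears of the full cohort, which is what the reported figures imply; the function and variable names are illustrative.

```python
def percent(part, whole, digits=2):
    """Share of `part` in `whole`, as a percentage rounded to `digits` places."""
    return round(100.0 * part / whole, digits)

total_ears = 138
cm_ears = 16       # ears in which CM was recorded
dpoae_ears = 10    # ears in which DPOAE was also recorded

print(percent(cm_ears, total_ears))     # 11.59
print(percent(dpoae_ears, total_ears))  # 7.25
print(total_ears - cm_ears)             # 122 ears in the remaining group
```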