Funding: This work is supported by the Nanjing Institute of Technology (NIT) Fund for Research Startup Projects of Introduced Talents under Grant No. YKJ202019; the Natural Science Research Project of Higher Education Institutions in Jiangsu Province under Grant No. 21KJB510018; the National Natural Science Foundation of China (NSFC) under Grant No. 62001215; and the NIT Fund for Doctoral Research Projects under Grant No. ZKJ2020003.
Abstract: Speech separation is an active research topic that plays an important role in numerous applications, such as speaker recognition, hearing prostheses, and autonomous robots. Many algorithms have been put forward to improve separation performance. However, speech separation in reverberant, noisy environments is still a challenging task. To address this, this paper proposes a novel microphone-array-based speech separation algorithm using a gated recurrent unit (GRU) network. The main aim of the proposed algorithm is to improve separation performance and reduce computational cost. The algorithm extracts the sub-band steered response power-phase transform (SRP-PHAT), weighted by a gammatone filter, as the speech separation feature because it carries discriminative and robust spatial position information. Since the GRU network has the advantage of processing time-series data with faster training and fewer trainable parameters, the GRU model is adopted to process the separation features of several sequential frames in the same sub-band to estimate the ideal ratio mask (IRM). The algorithm decomposes the mixture signals into time-frequency (TF) units using a gammatone filter bank in the frequency domain, and the target speech is reconstructed in the frequency domain by masking the mixture signal according to the estimated IRM. Because both the decomposition of the mixture signal and the reconstruction of the target signal are completed in the frequency domain, the total computational cost is reduced. Experimental results demonstrate that the proposed algorithm realizes omnidirectional speech separation in noisy and reverberant environments, provides good performance in terms of speech quality and intelligibility, and generalizes to reverberation.
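The masking step described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the commonly used square-root form of the IRM, a generic TF grid rather than gammatone sub-bands, and oracle target and interference powers in place of a GRU estimate. The function names `ideal_ratio_mask` and `apply_mask` are hypothetical.

```python
import numpy as np

def ideal_ratio_mask(target_power, interference_power, eps=1e-12):
    """Assumed IRM definition: sqrt(S / (S + N)) per TF unit, in [0, 1]."""
    return np.sqrt(target_power / (target_power + interference_power + eps))

def apply_mask(mixture_spectrum, mask):
    """Reconstruct the target in the frequency domain by scaling each TF unit."""
    return mask * mixture_spectrum

# Toy 2x2 grid of TF units (rows: sub-bands, columns: frames), made-up values.
target_power = np.array([[4.0, 0.0], [1.0, 9.0]])
interference_power = np.array([[0.0, 4.0], [3.0, 0.0]])
irm = ideal_ratio_mask(target_power, interference_power)

mixture = np.array([[2 + 0j, 2j], [2 + 0j, 3 + 0j]])
estimate = apply_mask(mixture, irm)
```

In the abstract's setting the mask would come from the GRU output for each sub-band rather than from known powers; the masking operation itself stays an element-wise product.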
Abstract: To estimate percentiles of a response distribution, the transformed response rule of Wetherill and the Robbins-Monro sequential design were proposed under the log-logistic model. Based on the response data, a necessary and sufficient condition for the existence of the maximum likelihood estimators was presented, followed by the formula for calculating them. After a simulation study, the proposed approach was applied to the 65# detonator. Numerical results showed that the percentile estimators from the proposed approach are robust to parametric models that lack information on the original response distribution.
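For reference, the percentile being estimated has a closed form under a log-logistic response distribution. The parameterization below (scale $\alpha > 0$, shape $\beta > 0$) is an assumption, since the abstract does not state which one the authors use.

```latex
% Log-logistic response probability at stimulus level x > 0
F(x) = \frac{1}{1 + (x/\alpha)^{-\beta}}

% Equivalent logit form: linear in \log x
\log\frac{F(x)}{1 - F(x)} = \beta\,(\log x - \log\alpha)

% 100p-th percentile, obtained by solving F(x_p) = p
x_p = \alpha \left( \frac{p}{1 - p} \right)^{1/\beta}, \qquad 0 < p < 1
```

Under this parameterization, plugging maximum likelihood estimates of $\alpha$ and $\beta$ into the last expression gives a percentile estimate; the median ($p = 0.5$) is simply $\alpha$.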
Funding: This work is supported by the National Natural Science Foundation of China (NSFC) under Grant No. 61571106, the Jiangsu Natural Science Foundation under Grant No. BK20170757, and the Natural Science Foundation of the Jiangsu Higher Education Institutions of China under Grant No. 17KJD510002.
Abstract: Microphone array-based sound source localization (SSL) is a challenging task in adverse acoustic scenarios. To address this, this paper presents a novel SSL algorithm based on a deep neural network (DNN) that uses the steered response power-phase transform (SRP-PHAT) spatial spectrum as its input feature. Since the SRP-PHAT spatial power spectrum contains spatial location information, it is adopted as the input feature for sound source localization. The DNN is used to extract location information from the SRP-PHAT spatial power spectrum because of its strength in extracting high-level features. The SRP-PHAT values at each steering position within a frame are arranged into a vector, which is treated as the DNN input. A DNN model that maps the SRP-PHAT spatial spectrum to the azimuth of the sound source is learned from the training signals. The azimuth of the sound source is then estimated from the testing signals with the trained DNN model. Experimental results demonstrate that the proposed algorithm significantly improves localization performance whether or not the training and testing conditions are the same, and that it is more robust to noise and reverberation.
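The input feature described above, one SRP-PHAT value per steering position arranged into a vector, can be sketched as follows. This is an illustrative sketch under simplifying assumptions, not the paper's feature extractor: candidate steering positions are given as hypothetical integer sample delays per microphone, the source is white noise, and there is no reverberation. The names `gcc_phat`, `srp_phat`, and `candidate_delays` are introduced here for illustration.

```python
import numpy as np
from itertools import combinations

def gcc_phat(x_ref, x_other, max_lag):
    """PHAT-weighted cross-correlation; index k corresponds to lag k - max_lag.
    A positive lag means x_other is delayed relative to x_ref."""
    n = len(x_ref) + len(x_other)              # zero-pad to avoid wrap-around
    cross = np.fft.rfft(x_other, n=n) * np.conj(np.fft.rfft(x_ref, n=n))
    cross /= np.abs(cross) + 1e-12             # PHAT: keep phase, drop magnitude
    cc = np.fft.irfft(cross, n=n)
    return np.concatenate((cc[-max_lag:], cc[:max_lag + 1]))

def srp_phat(signals, candidate_delays, max_lag):
    """Sum GCC-PHAT over all microphone pairs at the lag each candidate implies."""
    power = np.zeros(len(candidate_delays))
    for i, j in combinations(range(len(signals)), 2):
        cc = gcc_phat(signals[i], signals[j], max_lag)
        for k, delays in enumerate(candidate_delays):
            power[k] += cc[delays[j] - delays[i] + max_lag]
    return power

# Synthetic test: 3 microphones, 4 candidate steering positions (made-up delays).
rng = np.random.default_rng(0)
source = rng.standard_normal(2048)
candidate_delays = np.array([[0, 0, 0], [0, 1, 2], [0, 2, 4], [0, 3, 6]])
true_index = 2
signals = [np.roll(source, d) for d in candidate_delays[true_index]]

spectrum = srp_phat(signals, candidate_delays, max_lag=8)  # feature vector
estimated_index = int(np.argmax(spectrum))
```

Here the peak of `spectrum` is picked directly; in the approach the abstract describes, the whole vector would instead be passed to a trained DNN that outputs the azimuth.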
Funding: Supported by the National Natural Science Foundation of China (10802028), the Major State Basic Research Development Program of China (2010CB832705), and the National Science Fund for Distinguished Young Scholars (10725208).
Abstract: In this article, a technique is developed to efficiently obtain the output responses of parameterized structural dynamic problems. The technique is based on the concept of the reduced basis method and the principle of linear interpolation. The original problem is projected onto the reduced basis space by linear interpolation projection, which generates an associated interpolation matrix. To keep the interpolation matrix as far from singular as possible, it goes through a time-node selection process, which is developed using the angle between vector spaces. As part of this technique, error estimation is recommended for obtaining a computational error bound. To make the technique practical in engineering applications, offline-online computational procedures are used. Two numerical examples demonstrate the accuracy and efficiency of the presented method.
Funding: Sponsored by the Inner Mongolia Transportation Research Project (NJ-2014-X), the Shanxi Transportation Research Project (2015-1-22), and the National Natural Science Foundation of China (51208080).
Abstract: The analysis of the transient linear viscoelastic response of asphalt concrete (AC) is important for engineering applications. Traditionally, the transient response of AC is analyzed in the time domain by evaluating a complicated convolution integral. The frequency domain approach determines the transient response by simple multiplication instead of the convolution integral and does not require the time derivative of the input excitation, so it can greatly reduce the complexity of the analysis. This study investigated the frequency domain approach for calculating the transient response using the discrete Fourier transform. The accuracy and effectiveness of the approach were verified by comparing the analytical and calculated responses for the standard 3-parameter Maxwell model and by comparing the time and frequency domain solutions for AC. The effect of aliasing in the frequency domain approach can be effectively reduced by selecting a small sampling interval for the time domain excitation function. A sampling interval is acceptable as long as the amplitude of the Fourier-transformed excitation is close to 0 at frequencies above half of the sampling rate. The results show that the frequency domain approach provides a simple and accurate way to perform linear viscoelastic analysis of AC.
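The equivalence this abstract relies on, a time-domain convolution replaced by a multiplication of discrete Fourier transforms, can be checked numerically. The sketch below is a generic illustration, not the study's procedure: the exponential kernel `h`, its 0.2 s time constant, and the 0.1 s haversine pulse `x` are made-up stand-ins for a viscoelastic response function and a load excitation.

```python
import numpy as np

dt = 0.01                                   # sampling interval in seconds (assumed)
t = np.arange(0.0, 2.0, dt)
h = np.exp(-t / 0.2)                        # hypothetical single-exponential kernel
x = np.where(t < 0.1, np.sin(np.pi * t / 0.1) ** 2, 0.0)  # haversine pulse

# Time domain: discrete approximation of the convolution integral.
y_time = np.convolve(h, x)[: len(t)] * dt

# Frequency domain: multiply the transforms, then invert.
# Zero-padding to 2 * len(t) makes the circular convolution equal the linear one.
n = 2 * len(t)
y_freq = np.fft.irfft(np.fft.rfft(h, n) * np.fft.rfft(x, n), n)[: len(t)] * dt

max_diff = np.max(np.abs(y_time - y_freq))
```

With a kernel sampled in time, as here, the two results agree to floating-point precision. The aliasing concern raised in the abstract arises when the excitation is sampled too coarsely for its spectrum to have decayed by half the sampling rate, which is why a small `dt` is recommended.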