Funding: The National Natural Science Foundation of China (No. 52075095).
Abstract: To extract bearing fault features effectively and to prevent the impact components caused by bearing damage from being masked by discrete frequency components and background noise, a fault feature extraction method based on cepstrum pre-whitening (CPW) and a quantitative law of symplectic geometry mode decomposition (SGMD) is proposed. First, CPW is applied to the original signal to enhance the impact features of the bearing fault and remove the periodic frequency components from the complex vibration signal. The pre-whitened signal contains only background noise and the non-stationary shocks caused by damage. Second, a quantitative law is found: during SGMD, the number of effective eigenvalues of the Hamilton matrix is twice the number of frequency components in the signal. This law is verified by simulation and theoretical derivation. Finally, the trajectory matrix of the pre-whitened signal is constructed and SGMD is performed. According to the quantitative law, the corresponding feature vectors are selected to reconstruct the signal, and Hilbert envelope spectrum analysis is performed to extract the fault features. Simulation analysis and application examples demonstrate that the proposed method can clearly extract bearing fault features.
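The first and last steps of the pipeline described above (CPW, then the Hilbert envelope spectrum) can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes the commonly used form of CPW in which the spectrum is divided by its own magnitude, the function names `cepstrum_prewhiten` and `envelope_spectrum` are hypothetical, and the SGMD step is left out entirely.

```python
import numpy as np

def cepstrum_prewhiten(x, eps=1e-12):
    """Flatten the magnitude spectrum of x while keeping its phase (assumed CPW form)."""
    spectrum = np.fft.fft(x)
    whitened = spectrum / (np.abs(spectrum) + eps)  # eps guards against division by zero
    return np.real(np.fft.ifft(whitened))

def envelope_spectrum(x, fs):
    """Return (frequencies, amplitude) of the Hilbert envelope spectrum of x."""
    n = len(x)
    # Analytic signal via the FFT: keep DC (and Nyquist), double the positive frequencies.
    weights = np.zeros(n)
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        weights[1:n // 2] = 2.0
    else:
        weights[1:(n + 1) // 2] = 2.0
    analytic = np.fft.ifft(np.fft.fft(x) * weights)
    envelope = np.abs(analytic)
    envelope = envelope - envelope.mean()  # drop the DC term so modulation lines stand out
    amplitude = np.abs(np.fft.rfft(envelope)) / n
    return np.fft.rfftfreq(n, d=1.0 / fs), amplitude

# Illustrative check on an amplitude-modulated tone (values chosen for the demo only).
fs = 8000
t = np.arange(fs) / fs
signal = (1.0 + 0.5 * np.cos(2 * np.pi * 30 * t)) * np.cos(2 * np.pi * 1000 * t)
freqs, amp = envelope_spectrum(signal, fs)
print("dominant envelope line (Hz):", freqs[np.argmax(amp)])
```

In a bearing application the envelope spectrum would be computed on the signal reconstructed after SGMD rather than directly on the pre-whitened signal; the demo only shows that the envelope step recovers a known modulation frequency.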
Abstract: Smoothed cepstral peak prominence (CPPs) measures the distance from the prominent cepstral peak to the linear regression line directly beneath it. Variations in CPPs data acquisition and analysis complicate the choice of clinical cut-off values, and there are no agreed values for specific voice disorders such as hypokinetic dysarthria associated with Parkinson's disease (PD). This study examined CPPs in people with hypokinetic dysarthria associated with PD compared with healthy participants. Results demonstrated significant differences in the sustained vowel and connected speech tasks, with CPPs of connected speech more sensitive to dysphonia and to gender differences in PD participants. Male PD participants presented higher CPPs for sustained vowels and lower CPPs for connected speech than females. The results imply that a consistent clinical application protocol is necessary and that multiple acoustic measures are needed to ensure the accuracy of clinical decisions.
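The definition in the first sentence above (peak height minus the regression line beneath it) can be sketched as follows. This is an illustration under stated assumptions, not a clinical protocol: the 60 to 330 Hz search range, the 1 ms start of the regression fit, the Hann window, and the dB scaling are all assumed values, the function names are hypothetical, and the smoothing across time and quefrency that gives CPPs its "s" is omitted.

```python
import numpy as np

def frame_cepstrum_db(frame, fs):
    """Return (quefrency in seconds, cepstral magnitude in dB) for one frame."""
    n = len(frame)
    spectrum = np.fft.fft(frame * np.hanning(n))
    log_power = np.log(np.abs(spectrum) ** 2 + 1e-12)
    cepstrum = np.abs(np.fft.ifft(log_power))
    half = n // 2  # keep the non-redundant half of the symmetric cepstrum
    quefrency = np.arange(n) / fs
    return quefrency[:half], 20.0 * np.log10(cepstrum[:half] + 1e-12)

def cepstral_peak_prominence(quefrency, ceps_db, f0_min=60.0, f0_max=330.0):
    """Peak height above the regression line, in dB (search range is an assumption)."""
    in_range = (quefrency >= 1.0 / f0_max) & (quefrency <= 1.0 / f0_min)
    candidates = np.flatnonzero(in_range)
    peak = candidates[np.argmax(ceps_db[candidates])]
    # Regression line fitted from 1 ms upward (assumed) to skip the vocal-tract region.
    fit = quefrency >= 1e-3
    slope, intercept = np.polyfit(quefrency[fit], ceps_db[fit], 1)
    return ceps_db[peak] - (slope * quefrency[peak] + intercept)

# Usage on a placeholder frame (random noise stands in for a recorded voice frame).
rng = np.random.default_rng(0)
q, c = frame_cepstrum_db(rng.standard_normal(1024), 16000)
print("CPP of the placeholder frame (dB):", cepstral_peak_prominence(q, c))
```

Because the result depends on the window, the search range, and the regression range, two implementations can report different values for the same recording, which is one concrete reason why cut-off values are hard to compare across studies.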
Funding: This study was supported by the National Natural Science Foundation of China (No. 11774073) and the State Key Laboratory of Acoustics (No. SKLA201904).
Abstract: Passive target detection through ship-radiated noise is a key technology in current underwater operations and is of great research value in civil and military fields. In this study, the stable spectral line component of ship-radiated noise is used as the research object, and the classification of multisource targets is studied from the perspective of underwater channels. The channel impulse response function is used as the basis for classifying different targets. First, the underwater channel is estimated by the cepstrum. Then, the channel cepstral features carried by different spectral line components are extracted in turn. Finally, the spectral line components belonging to the same target are clustered by cepstral feature distance to classify the different targets. Simulation and experimental results verify the effectiveness of the proposed method.
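The reason a cepstrum can expose the channel is that convolution of source and channel becomes addition after the log spectrum is taken, so a multipath arrival shows up as a peak at its delay. The sketch below illustrates this on a toy two-path channel; it is not the paper's estimator, and the white-noise source, the delay of 120 samples, the gain of 0.6, and the function names are all assumptions made for the demo.

```python
import numpy as np

def real_cepstrum(x):
    """Inverse FFT of the log magnitude spectrum."""
    spectrum = np.fft.fft(x)
    return np.real(np.fft.ifft(np.log(np.abs(spectrum) + 1e-12)))

def estimate_echo_delay(x, min_lag=8):
    """Return the lag (in samples) of the largest cepstral peak above min_lag."""
    ceps = real_cepstrum(x)
    half = len(x) // 2  # the real cepstrum is symmetric, so search one half only
    return min_lag + int(np.argmax(ceps[min_lag:half]))

# Toy two-path channel: direct arrival plus one delayed, attenuated copy.
rng = np.random.default_rng(1)
source = rng.standard_normal(4096)
delay, gain = 120, 0.6
received = source.copy()
received[delay:] += gain * source[:-delay]

print("estimated delay (samples):", estimate_echo_delay(received))
```

A real underwater channel has several arrivals and a source that is far from white, so the cepstral feature would be a vector of peak positions and heights rather than a single lag.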
Funding: Supported by the National Natural Science Foundation of China (Nos. 61902158, 61673108), the Science and Technology Program of Nantong (JC2018129, MS12018082), and the Top-notch Academic Programs Project of Jiangsu Higher Education Institutions (PPZY2015B135).
Abstract: Speech intelligibility enhancement in noisy environments remains one of the major challenges for hearing-impaired listeners in everyday life. Recently, machine-learning-based approaches to speech enhancement have shown great promise for improving speech intelligibility. Two key issues in these approaches are the acoustic features extracted from noisy signals and the classifiers used for supervised learning. This paper focuses on features. Multi-resolution power-normalized cepstral coefficients (MRPNCC) are proposed as a new feature to enhance speech intelligibility for hearing-impaired listeners. The new feature is constructed by combining four cepstra at different time–frequency (T–F) resolutions in order to capture both local and contextual information. MRPNCC vectors and binary masking labels, calculated from signals passed through a gammatone filterbank, are used to train a support vector machine (SVM) classifier, which aims to identify the binary masking values of the T–F units in the enhancement stage. The enhanced speech is synthesized from the estimated masking values and the Wiener-filtered T–F units. Objective experimental results demonstrate that the proposed feature is superior to the other features compared in terms of HIT-FA, STOI, HASPI and PESQ, and that the proposed algorithm not only improves speech intelligibility but also slightly improves speech quality. Subjective tests validate the effectiveness of the proposed algorithm for hearing-impaired listeners.
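The binary masking labels and the HIT-FA score mentioned above can be illustrated with a short sketch. It assumes the usual definitions: a T–F unit is labelled 1 when its local signal-to-noise ratio exceeds a threshold, HIT is the fraction of target-dominant units labelled 1 by the classifier, and FA is the fraction of noise-dominant units wrongly labelled 1. The 0 dB threshold and the function names are assumptions, and the gammatone filterbank and SVM are not reproduced.

```python
import numpy as np

def ideal_binary_mask(target_energy, noise_energy, threshold_db=0.0):
    """Label a T-F unit 1 when its local SNR exceeds threshold_db (assumed 0 dB)."""
    eps = 1e-12
    target = np.asarray(target_energy, dtype=float)
    noise = np.asarray(noise_energy, dtype=float)
    snr_db = 10.0 * np.log10((target + eps) / (noise + eps))
    return snr_db > threshold_db

def hit_minus_fa(estimated_mask, ideal_mask):
    """HIT rate minus false-alarm rate of an estimated mask against the ideal mask."""
    est = np.asarray(estimated_mask, dtype=bool)
    ref = np.asarray(ideal_mask, dtype=bool)
    hit = np.mean(est[ref])   # target-dominant units correctly labelled 1
    fa = np.mean(est[~ref])   # noise-dominant units wrongly labelled 1
    return hit - fa

# Usage with made-up energies for four T-F units.
ideal = ideal_binary_mask([4.0, 9.0, 1.0, 0.5], [1.0, 1.0, 4.0, 2.0])
estimated = np.array([True, False, True, False])
print("HIT-FA:", hit_minus_fa(estimated, ideal))
```

The score is undefined when the ideal mask contains only ones or only zeros, so a practical evaluation would pool units over many utterances before computing it.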
Funding: National Natural Science Foundation of China (No. 62001100).
Abstract: The hidden danger to automatic speaker verification (ASV) systems is the variety of spoofed speech. These threats can be classified into two categories, namely logical access (LA) and physical access (PA). To improve the detection of spoofed speech, this paper focuses on features. First, following the idea of modifying constant-Q-based features, this work considered adding the variance or mean to the constant-Q-based cepstral domain to obtain good performance. Second, linear frequency cepstral coefficients (LFCCs) performed comparably with constant-Q-based features. Finally, we proposed linear frequency variance-based cepstral coefficients (LVCCs) and linear frequency mean-based cepstral coefficients (LMCCs) for the identification of speech spoofing. LVCCs and LMCCs are obtained by adding the frame variance or the frame mean to the log magnitude spectrum used for LFCC features. The proposed features were evaluated on the ASVspoof 2019 dataset. The experimental results show that, compared with known hand-crafted features, LVCCs and LMCCs are more effective in resisting spoofed speech attacks.
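As background for the LFCC baseline named above, a single-frame LFCC computation can be sketched as: power spectrum, linearly spaced triangular filterbank, logarithm, then a DCT. This is a generic sketch with assumed parameters (20 filters, 13 coefficients, Hamming window, unnormalized DCT-II) and hypothetical function names. The abstract does not give enough detail on how the frame variance or mean is combined with the log spectrum, so the LVCC and LMCC variants are not attempted here.

```python
import numpy as np

def linear_filterbank(n_filters, n_fft, fs):
    """Triangular filters with centres spaced linearly from 0 Hz to fs / 2."""
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / fs)
    edges = np.linspace(0.0, fs / 2.0, n_filters + 2)
    bank = np.zeros((n_filters, len(freqs)))
    for i in range(n_filters):
        low, mid, high = edges[i], edges[i + 1], edges[i + 2]
        rising = (freqs - low) / (mid - low)
        falling = (high - freqs) / (high - mid)
        bank[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank

def dct2_matrix(n_out, n_in):
    """Unnormalized DCT-II basis, one row per output coefficient."""
    k = np.arange(n_out)[:, None]
    m = np.arange(n_in)[None, :]
    return np.cos(np.pi * k * (2 * m + 1) / (2 * n_in))

def lfcc(frame, fs, n_filters=20, n_ceps=13):
    """Linear frequency cepstral coefficients of one frame (assumed settings)."""
    n_fft = len(frame)
    power = np.abs(np.fft.rfft(frame * np.hamming(n_fft))) ** 2
    log_energy = np.log(linear_filterbank(n_filters, n_fft, fs) @ power + 1e-10)
    return dct2_matrix(n_ceps, n_filters) @ log_energy

# Usage on a placeholder frame (random noise stands in for a speech frame).
rng = np.random.default_rng(0)
frame = rng.standard_normal(512)
print(lfcc(frame, 16000).shape)
```

One property worth noting: a constant added to every log filterbank energy, such as a gain change, alters only coefficient 0 of the DCT, so any scheme that injects per-frame statistics has to do more than add the same scalar to every band if it is meant to affect the higher coefficients.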
Abstract: A seismic edge detection algorithm unmasks blurred discontinuities in an image, and its efficiency depends on the precision of the processing scheme adopted. Data-driven modeling is a fast machine learning scheme and a formal, automated version of the empirical approach that has existed for a long time and can be used in many different contexts. Here, an algorithm that can identify masked connections and correlations in a set of observations is built and used. Geologic models of hydrocarbon reservoirs facilitate enhanced visualization, volumetric calculation, well planning and prediction of fluid migration paths. To obtain new insights and test the mappability of a geologic feature, spectral decomposition techniques such as the Discrete Fourier Transform (DFT) and cepstral decomposition techniques such as the Complex Cepstral Transform (CCT) can be employed. Cepstral decomposition is a new approach that extends the widely used process of spectral decomposition, which is rigorous when analyzing very subtle stratigraphic plays and fractured reservoirs. This paper presents the results of applying the DFT and CCT to a two-dimensional, 50 Hz, low-impedance channel sand model representing a typical geologic environment around a prospective hydrocarbon zone largely trapped in various types of channel structures. While the DFT represents the frequency and phase spectra of a signal, assumes stationarity and analyticity, and highlights the average properties of its dominant portion, the CCT represents the quefrency and saphe cepstra of a signal in the quefrency domain. The transform filters the field data recorded in the time domain and recovers lost sub-seismic geologic information in the quefrency domain by separating source and transmission path effects. Our algorithm is based on fast Fourier transform (FFT) techniques, and the code was written in Matlab. It was developed from first principles, outside the oil industry's interpretation platforms, using standard processing routines. The results of the algorithm were comparable when implemented on both commercial and general platforms. The cepstral properties of the channel model indicate that cepstral attributes can be used as a powerful tool in exploration problems to enhance the visualization of small-scale anomalies and obtain reliable estimates of wavelet and stratigraphic parameters. The practical relevance of this investigation is illustrated by sample spectral and cepstral attribute plots and by pseudo-sections of phase and saphe constructed from the model data. The cepstral attributes reveal more detail in terms of quefrency, as required for clearer imaging and better interpretation of subtle edges and discontinuities, sand-shale interbedding, and differences in lithology. These results benefit production because they serve as a basis for interpreting similar geologic situations in field data.
Abstract: This article presents a comparative investigation into the accuracy of gender identification across diverse geographical regions, employing a deep learning classification algorithm for speech signal analysis. In this study, speech samples are categorized for both training and testing based on their geographical origin. Category 1 comprises speech samples from speakers outside India, whereas Category 2 comprises live-recorded speech samples from Indian speakers. Testing speech samples are likewise classified into four distinct sets, taking into account both geographical origin and the language spoken. The results indicate a noticeable difference in gender identification accuracy among speakers from different geographical areas. When the system is trained on speech samples from Indian speakers, Indian speakers, who use 52 Hindi and 26 English phonemes in their speech, achieve a notably higher gender identification accuracy of 85.75% than speakers who predominantly use 26 English phonemes. The gender identification accuracy of the proposed model reaches 83.20% when the system is trained on speech samples from speakers outside India. Mel Frequency Cepstral Coefficients (MFCCs) serve as the features for the speech data. The deep learning classification algorithm used in this research is based on a Bidirectional Long Short-Term Memory (BiLSTM) architecture within a Recurrent Neural Network (RNN) model.