Aiming at the problem of open set voiceprint recognition, this paper proposes an adaptive threshold algorithm based on OTSU and deep learning. The bottleneck technology of open set voiceprint recognition lies in the c...Aiming at the problem of open set voiceprint recognition, this paper proposes an adaptive threshold algorithm based on OTSU and deep learning. The bottleneck technology of open set voiceprint recognition lies in the calculation of similarity values and thresholds of speakers inside and outside the set. This paper combines deep learning and machine learning methods, and uses a Deep Belief Network stacked with three layers of Restricted Boltzmann Machines to extract deep voice features from basic acoustic features. And by training the Gaussian Mixture Model, this paper calculates the similarity value of the feature, and further determines the threshold of the similarity value of the feature through OTSU. After experimental testing, the algorithm in this paper has a false rejection rate of 3.00% for specific speakers, a false acceptance rate of 0.35% for internal speakers, and a false acceptance rate of 0 for external speakers. This improves the accuracy of traditional methods in open set voiceprint recognition. This proves that the method is feasible and good recognition effect.展开更多
Purpose–The safe operation of the metro power transformer directly relates to the safety and efficiency of the entire metro system.Through voiceprint technology,the sounds emitted by the transformer can be monitored ...Purpose–The safe operation of the metro power transformer directly relates to the safety and efficiency of the entire metro system.Through voiceprint technology,the sounds emitted by the transformer can be monitored in real-time,thereby achieving real-time monitoring of the transformer’s operational status.However,the environment surrounding power transformers is filled with various interfering sounds that intertwine with both the normal operational voiceprints and faulty voiceprints of the transformer,severely impacting the accuracy and reliability of voiceprint identification.Therefore,effective preprocessing steps are required to identify and separate the sound signals of transformer operation,which is a prerequisite for subsequent analysis.Design/methodology/approach–This paper proposes an Adaptive Threshold Repeating Pattern Extraction Technique(REPET)algorithm to separate and denoise the transformer operation sound signals.By analyzing the Short-Time Fourier Transform(STFT)amplitude spectrum,the algorithm identifies and utilizes the repeating periodic structures within the signal to automatically adjust the threshold,effectively distinguishing and extracting stable background signals from transient foreground events.The REPET algorithm first calculates the autocorrelation matrix of the signal to determine the repeating period,then constructs a repeating segment model.Through comparison with the amplitude spectrum of the original signal,repeating patterns are extracted and a soft time-frequency mask is generated.Findings–After adaptive thresholding processing,the target signal is separated.Experiments conducted on mixed sounds to separate background sounds from foreground sounds using this algorithm and comparing the results with those obtained using the FastICA algorithm demonstrate that the Adaptive Threshold REPET method achieves good separation effects.Originality/value–A REPET method with adaptive threshold is proposed,which adopts the dynamic threshold adjustment mechanism,adaptively calculates the threshold for blind source separation and improves the adaptability and robustness of the algorithm to the statistical characteristics of the signal.It also lays the foundation for transformer fault detection based on acoustic fingerprinting.展开更多
文摘Aiming at the problem of open set voiceprint recognition, this paper proposes an adaptive threshold algorithm based on OTSU and deep learning. The bottleneck technology of open set voiceprint recognition lies in the calculation of similarity values and thresholds of speakers inside and outside the set. This paper combines deep learning and machine learning methods, and uses a Deep Belief Network stacked with three layers of Restricted Boltzmann Machines to extract deep voice features from basic acoustic features. And by training the Gaussian Mixture Model, this paper calculates the similarity value of the feature, and further determines the threshold of the similarity value of the feature through OTSU. After experimental testing, the algorithm in this paper has a false rejection rate of 3.00% for specific speakers, a false acceptance rate of 0.35% for internal speakers, and a false acceptance rate of 0 for external speakers. This improves the accuracy of traditional methods in open set voiceprint recognition. This proves that the method is feasible and good recognition effect.
基金the China Academy of Railway Sciences Corporation Limited(2023YJ257).
文摘Purpose–The safe operation of the metro power transformer directly relates to the safety and efficiency of the entire metro system.Through voiceprint technology,the sounds emitted by the transformer can be monitored in real-time,thereby achieving real-time monitoring of the transformer’s operational status.However,the environment surrounding power transformers is filled with various interfering sounds that intertwine with both the normal operational voiceprints and faulty voiceprints of the transformer,severely impacting the accuracy and reliability of voiceprint identification.Therefore,effective preprocessing steps are required to identify and separate the sound signals of transformer operation,which is a prerequisite for subsequent analysis.Design/methodology/approach–This paper proposes an Adaptive Threshold Repeating Pattern Extraction Technique(REPET)algorithm to separate and denoise the transformer operation sound signals.By analyzing the Short-Time Fourier Transform(STFT)amplitude spectrum,the algorithm identifies and utilizes the repeating periodic structures within the signal to automatically adjust the threshold,effectively distinguishing and extracting stable background signals from transient foreground events.The REPET algorithm first calculates the autocorrelation matrix of the signal to determine the repeating period,then constructs a repeating segment model.Through comparison with the amplitude spectrum of the original signal,repeating patterns are extracted and a soft time-frequency mask is generated.Findings–After adaptive thresholding processing,the target signal is separated.Experiments conducted on mixed sounds to separate background sounds from foreground sounds using this algorithm and comparing the results with those obtained using the FastICA algorithm demonstrate that the Adaptive Threshold REPET method achieves good separation effects.Originality/value–A REPET method with adaptive threshold is proposed,which adopts the dynamic threshold adjustment mechanism,adaptively calculates the threshold for blind source separation and improves the adaptability and robustness of the algorithm to the statistical characteristics of the signal.It also lays the foundation for transformer fault detection based on acoustic fingerprinting.