Abstract: This article presents an exhaustive comparative investigation into the accuracy of gender identification across diverse geographical regions, employing a deep learning classification algorithm for speech signal analysis. Speech samples are categorized for both training and testing based on their geographical origin. Category 1 comprises speech samples from speakers outside India, whereas Category 2 comprises live-recorded speech samples from Indian speakers. Testing speech samples are further classified into four distinct sets, taking into account both geographical origin and the language spoken. The results indicate a noticeable difference in gender identification accuracy among speakers from different geographical areas. When the system is trained on speech samples from Indian speakers, Indian speakers, who use 52 Hindi and 26 English phonemes in their speech, achieve a notably higher gender identification accuracy of 85.75% than speakers who predominantly use 26 English phonemes in their conversations. When the system is trained on speech samples from speakers outside India, the gender identification accuracy of the proposed model reaches 83.20%. Mel Frequency Cepstral Coefficients (MFCCs) serve as the features for the speech data. The deep learning classification algorithm used in this research is based on a Bidirectional Long Short-Term Memory (BiLSTM) architecture within a Recurrent Neural Network (RNN) model.
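The abstract names MFCCs as the features but gives no extraction details. As a minimal sketch of one ingredient, the mel scale that spaces the MFCC filterbank can be written as below. The 2595·log10(1 + f/700) convention, the helper names, and the filter count in the usage note are assumptions for illustration, not the authors' configuration.

```python
import math
from typing import List


def hz_to_mel(freq_hz: float) -> float:
    """Convert frequency in Hz to mel (common 2595*log10(1 + f/700) convention)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel: float) -> float:
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_center_frequencies(n_filters: int, f_min: float, f_max: float) -> List[float]:
    """Center frequencies (Hz) of n_filters filters equally spaced on the mel scale.

    The filter count and band edges are hypothetical parameters chosen by the caller.
    """
    mel_min, mel_max = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (mel_max - mel_min) / (n_filters + 1)
    return [mel_to_hz(mel_min + step * (i + 1)) for i in range(n_filters)]
```

For instance, `mel_center_frequencies(26, 0.0, 8000.0)` returns centers that are dense at low frequencies and sparse at high ones, which is the perceptual warping MFCCs rely on.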
Abstract: Emotion is a uniquely human trait that plays a vital role in distinguishing human civilization from others. Voice is one of the most important media for expressing emotion. We can identify many types of emotion by talking with people or listening to their voices; this is what we know as a voice signal. Just as people differ in the way they talk, they differ in the way they express emotions. By watching or hearing how a person speaks, we can often guess his or her personality and momentary emotions. People's emotions and feelings are expressed in different ways, and it is through this expression that people fully convey their thoughts. Happiness, sadness, and anger are the main forms of emotional expression. To express these emotions, people use body postures, facial expressions, and vocalizations. Although people use a variety of means to express emotions and feelings, the easiest and most complete is the voice signal. The subject of our study is whether the correct human emotion can be identified by examining the human voice signal. By analyzing the voice signal with wavelets, we have tried to show whether the mean frequency, maximum frequency, and L_p values follow a pattern according to the type of emotion. The technique is developed in MATLAB and compares the mean frequency, maximum frequency, and L_p norm to find relationships and detect emotion by analyzing different voices.
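The three quantities compared in the abstract can be illustrated with a short sketch. This is not the authors' MATLAB wavelet pipeline: it swaps in a plain discrete Fourier transform, and it assumes that "mean frequency" means the power-weighted average frequency and that "maximum frequency" means the frequency of the strongest spectral component. All function names are hypothetical.

```python
import cmath
import math
from typing import List, Sequence, Tuple


def power_spectrum(signal: Sequence[float], sample_rate: float) -> Tuple[List[float], List[float]]:
    """Naive one-sided DFT power spectrum; returns (frequencies in Hz, powers)."""
    n = len(signal)
    freqs, powers = [], []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(signal))
        freqs.append(k * sample_rate / n)
        powers.append(abs(coeff) ** 2)
    return freqs, powers


def mean_frequency(freqs: Sequence[float], powers: Sequence[float]) -> float:
    """Power-weighted average frequency (assumed meaning of 'mean frequency')."""
    total = sum(powers)
    return sum(f * p for f, p in zip(freqs, powers)) / total if total > 0 else 0.0


def peak_frequency(freqs: Sequence[float], powers: Sequence[float]) -> float:
    """Frequency of the strongest component (assumed meaning of 'maximum frequency')."""
    return freqs[max(range(len(powers)), key=powers.__getitem__)]


def lp_norm(signal: Sequence[float], p: float = 2.0) -> float:
    """L_p norm of the samples: (sum |x|^p)^(1/p)."""
    return sum(abs(x) ** p for x in signal) ** (1.0 / p)
```

Computing these three values for recordings labelled with different emotions and comparing them side by side mirrors the comparison the abstract describes, with the caveat that a wavelet analysis would also resolve how the features change over time.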
Funding: Supported by the Ministry of Science and Higher Education of Poland under Grant No. NN516 517539.
Abstract: Evaluating the quality of singing is a task that experts perform subjectively. This paper presents the results of an analysis of the vibrato parameter in singing. It is well known that the voices of professional singers exhibit vibrato of sufficient quality. The authors focus here on choral voices and assess the quality of their singing in terms of the vibrato parameter. The method presented was developed to evaluate vibrato during singing under conditions close to real ones. The study was carried out on recordings of members of an academic choir. The tests showed that not all singers exhibit the same quality of vibrato in terms of the deviation of vibrato confidence (STDCV).
Funding: The research was supported by the National Key R&D Program of China (Grant Nos. 2017YFD0701000, 2018YFD0200900), the China Agriculture Research System of MOF and MARA (Grant No. CARS-12), and the Chinese Academy of Agricultural Sciences Fundamental Research Funds (Grant No. SR201903).
Abstract: The application of Unmanned Aircraft Systems (UAS) for plant protection is becoming common in agricultural field management. To avoid the shortcomings of intrusive flowrate sensors, including poor measurement accuracy and poor anti-vibration ability, a non-intrusive flowrate measurement and monitoring system for plant-protection UAS was developed based on pump voice signal analysis. It is mainly composed of an STM32 processor, a microphone, and a signal-conditioning circuit. By collecting and analyzing the voice signal of the pump in the UAS, the monitoring system outputs real-time values of spraying flowrate and amount. An extraction model was developed to determine the operation status and primary frequency of the pump based on voice signal analysis. The real-time spray flowrate is determined from the extracted primary frequency and the fitted correlation formulas relating spraying flowrate to outlet area and pump primary frequency. The flowrate correlation equation for one pump from the 4-rotor UAS 3WQFTX-1011S was obtained; the maximum deviation rate of the fitted spray flowrate was only 2.8%. In the primary frequency extraction test, the error rate was less than 1%. In the 4-rotor UAS flight tests, the maximum deviation of the operation start/end point was only 0.7 s and the maximum deviation of the extracted total operating time was only 0.8 s; the deviation of the extracted spray flowrate was less than 2%, and the maximum deviation rate of the total spray amount was 3.2%. This research could serve as guidance for non-intrusive flowrate measurement and monitoring in plant-protection UAS.
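The two steps described in the abstract, extracting the pump's primary frequency and mapping it to flowrate through a fitted correlation, can be sketched as follows. The abstract does not give the form of the correlation formulas, so this sketch assumes a linear fit for a single fixed outlet area. The function names and the calibration numbers in the demo are hypothetical, not values from the paper.

```python
import cmath
import math
from typing import Sequence, Tuple


def primary_frequency(samples: Sequence[float], sample_rate: float) -> float:
    """Frequency (Hz) of the strongest non-DC component, found with a naive DFT."""
    n = len(samples)
    mean = sum(samples) / n
    centered = [x - mean for x in samples]  # remove the DC offset before the search
    best_k, best_mag = 1, -1.0
    for k in range(1, n // 2 + 1):
        mag = abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(centered)))
        if mag > best_mag:
            best_k, best_mag = k, mag
    return best_k * sample_rate / n


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit y = slope * x + intercept (assumed linear correlation)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def max_deviation_rate(measured: Sequence[float], fitted: Sequence[float]) -> float:
    """Largest relative deviation |fitted - measured| / measured."""
    return max(abs(f - m) / m for m, f in zip(measured, fitted))


if __name__ == "__main__":
    # Hypothetical calibration pairs: pump primary frequency (Hz) vs. flowrate (L/min).
    cal_freqs = [40.0, 50.0, 60.0, 70.0]
    cal_flows = [0.81, 0.99, 1.21, 1.39]
    slope, intercept = fit_line(cal_freqs, cal_flows)
    fitted = [slope * f + intercept for f in cal_freqs]
    print("max deviation rate:", max_deviation_rate(cal_flows, fitted))
```

On a microcontroller such as the STM32 mentioned in the abstract, an FFT routine would normally replace the naive DFT loop; the loop is kept here only to make the sketch dependency-free.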