Environmental sound classification (ESC) is the process of distinguishing the environmental sounds present in an audio stream. Factors such as framework differences, overlapping sound events, and the presence of multiple sound sources during recording make the ESC task considerably more complex. This research proposes a deep learning model to improve the recognition rate of environmental sounds and to reduce model training time under limited computational resources. The performance of transformer and convolutional neural network (CNN) models is investigated. Seven audio features (chromagram, Mel-spectrogram, tonnetz, Mel-frequency cepstral coefficients (MFCCs), delta MFCCs, delta-delta MFCCs, and spectral contrast) are extracted from the UrbanSound8K, ESC-50, and ESC-10 datasets. Three data augmentation methods, namely white noise, pitch tuning, and time stretching, are also employed to reduce the risk of overfitting caused by the limited number of audio clips. Across the experiments, the best performance was achieved by the proposed transformer model using all seven audio features on the augmented datasets. The highest accuracies attained on UrbanSound8K, ESC-50, and ESC-10 are 0.98, 0.94, and 0.97, respectively. The experimental results show that the proposed technique achieved the best performance on these ESC tasks.
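Of the three augmentation methods named in the abstract above, white noise is the simplest to illustrate. The following is a minimal sketch only, not the authors' implementation: the function name `add_white_noise`, the 20 dB signal-to-noise ratio, and the synthetic 440 Hz test tone are all assumptions made for illustration. Pitch tuning and time stretching normally rely on an audio library and are not shown.

```python
import math
import random


def add_white_noise(signal, snr_db, seed=None):
    """Return a copy of `signal` with Gaussian white noise added.

    The noise variance is chosen so that the ratio of signal power to
    noise power matches `snr_db` (in decibels). Hypothetical helper.
    """
    rng = random.Random(seed)
    signal_power = sum(x * x for x in signal) / len(signal)
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [x + rng.gauss(0.0, sigma) for x in signal]


# Assumed test input: one second of a 440 Hz sine tone sampled at 8 kHz.
sample_rate = 8000
clean = [math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(sample_rate)]

# Assumed augmentation strength: 20 dB SNR.
noisy = add_white_noise(clean, snr_db=20, seed=0)
```

Each augmented clip keeps the label of its source clip, so the training set grows without new recordings.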
Environmental sound classification (ESC) has gained increasing attention in recent years. This study evaluates the popular public dataset UrbanSound8K (Us8k) at different sampling rates using hand-crafted features. The Us8k dataset contains environmental sounds recorded at various sampling rates, and previous ESC work has resampled the dataset to a single uniform rate, with the chosen rate varying by study. Some studies converted the rest of the dataset to 44,100 Hz, because the majority of the Us8k files were already at that sampling rate. Others downsampled the dataset to 8000 Hz to reduce computational complexity, while others resampled it to 16,000 Hz, aiming for a balance between higher classification accuracy and lower computational complexity. In this research, we assessed the performance of ESC tasks at sampling rates of 8000 Hz, 16,000 Hz, and 44,100 Hz by extracting the hand-crafted features Mel-frequency cepstral coefficients (MFCC), gammatone cepstral coefficients (GTCC), and Mel spectrogram (MelSpec). The results indicated no significant difference in classification accuracy among the three tested sampling rates.
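The trade-off behind the three sampling rates can be made concrete with a little arithmetic. The sketch below assumes an illustrative 4-second clip (the clip length and the helper name `clip_stats` are assumptions, not taken from the abstract) and reports, for each rate, the number of samples to process and the Nyquist frequency, which is the highest frequency the resampled signal can represent.

```python
RATES_HZ = [8000, 16000, 44100]
CLIP_SECONDS = 4.0  # assumption: illustrative clip length


def clip_stats(sample_rate_hz, duration_s=CLIP_SECONDS):
    """Return (number of samples, Nyquist frequency in Hz) for one clip."""
    n_samples = int(round(sample_rate_hz * duration_s))
    nyquist_hz = sample_rate_hz / 2
    return n_samples, nyquist_hz


stats = {rate: clip_stats(rate) for rate in RATES_HZ}

for rate, (n_samples, nyquist_hz) in stats.items():
    # Sample count relative to the 44,100 Hz version of the same clip.
    relative = n_samples / stats[44100][0]
    print(f"{rate:>6} Hz: {n_samples:>7} samples, "
          f"Nyquist {nyquist_hz:>8.1f} Hz, {relative:.2f}x of 44.1 kHz")
```

At 8000 Hz a clip has less than a fifth of the samples of its 44,100 Hz version, at the cost of discarding all content above 4 kHz.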
Purpose: This work uses measurements of gamma radiation levels in ambient air, collected from gamma monitoring stations distributed across many sites, to classify the regions covering these sites according to the measured levels. Method: The classification proceeds in four steps: dividing the range of measurements into several intervals; interpolating over all regions covered by the gamma monitoring stations; representing the interpolated values on a map using geographic information systems (GIS) technology; and finally classifying all sites on this map into the determined intervals by applying data mining techniques to the interpolated values. Implementation and Importance: The method is applied to determine the background gamma radiation levels for many sites in Egypt. This background is necessary for environmental research because it supports risk assessment for any site in Egypt. Results: Most sites in Egypt fall within three intervals: the first from 2.02E−2 to 4.75E−2 μSv/h, covering 47.11% of Egypt's area; the second from 4.75E−2 to 8.85E−2 μSv/h, covering 40% of Egypt's area; and the third from 8.85E−2 to 1.42E−1 μSv/h, covering 6.82% of Egypt's area. Conclusions: According to the results, this method saves more effort, time, and cost than traditional methods.
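The interpolate-then-classify steps described above can be sketched as follows. The interval edges are the ones reported in the abstract. Everything else is an assumption for illustration: the abstract does not name its interpolation method, so inverse distance weighting (IDW) is swapped in here, and the station coordinates and readings are hypothetical.

```python
from bisect import bisect_right

# Interval edges in µSv/h, taken from the three intervals in the abstract.
EDGES = [2.02e-2, 4.75e-2, 8.85e-2, 1.42e-1]


def classify(dose_rate):
    """Return 1, 2 or 3 for the interval containing `dose_rate`, else None.

    A value exactly on an interior edge is assigned to the upper interval.
    """
    if dose_rate < EDGES[0] or dose_rate > EDGES[-1]:
        return None
    return min(bisect_right(EDGES, dose_rate), len(EDGES) - 1)


def idw(x, y, stations, power=2.0):
    """Inverse-distance-weighted estimate at (x, y).

    Assumption: IDW stands in for the unspecified interpolation method.
    `stations` is a list of (x, y, dose_rate) tuples.
    """
    numerator = denominator = 0.0
    for sx, sy, value in stations:
        dist_sq = (x - sx) ** 2 + (y - sy) ** 2
        if dist_sq == 0.0:
            return value  # the query point coincides with a station
        weight = 1.0 / dist_sq ** (power / 2)
        numerator += weight * value
        denominator += weight
    return numerator / denominator


# Hypothetical stations: (x, y, dose rate in µSv/h).
stations = [(0.0, 0.0, 0.03), (10.0, 0.0, 0.06), (0.0, 10.0, 0.10)]

estimate = idw(5.0, 0.0, stations)
interval = classify(estimate)
```

Evaluating `idw` on a regular grid and counting the cells per interval would give area shares of the kind reported in the results.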
Funding: Taif University Researchers Supporting Project number (TURSP-2020/36), Taif University, Taif, Saudi Arabia.