Abstract: To achieve accurate and fast audio event detection in mass-data audio processing tasks, a generic method for rapidly recognizing audio events based on 2D-Haar acoustic super feature vectors and AdaBoost is proposed. First, it combines a certain number of consecutive audio frames into an "acoustic feature image". Second, it uses the AdaBoost.MH or fast Random AdaBoost feature selection algorithm to select highly representative 2D-Haar pattern combinations and construct super feature vectors. Third, it analyzes the commonalities and differences between subcategories, then extracts the common features and reduces the differing features to obtain a generic audio event template, which supports accurate identification of multiple sub-classes and accurate detection and localization of a specific audio event in an audio stream. Experimental results show that 2D-Haar acoustic super feature vectors yield recognition accuracy 5% higher than MFCC, PLP, LPCC, and other traditional acoustic features, make training 7-20 times faster and recognition 5-10 times faster, and achieve an average precision of 93.38% and an average recall of 95.03% under the optimal parameter configuration found by grid search. Overall, the method provides an accurate and fast mass-data processing approach for audio event detection.
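The first two steps can be illustrated with a minimal sketch. It stacks consecutive per-frame feature vectors into an "acoustic feature image" and evaluates two-rectangle 2D-Haar patterns on it through a summed-area table. The abstract does not give the frame features, the image size, or the Haar pattern set actually used, so every function name, the 4x4 toy data, and the choice of two-rectangle patterns are assumptions made for illustration. The AdaBoost-based selection among such features is not shown.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def stack_frames(frames: Sequence[Sequence[float]], start: int, count: int) -> Matrix:
    """Stack `count` consecutive frame vectors into a 2D image (one row per frame)."""
    return [list(f) for f in frames[start:start + count]]


def integral_image(img: Matrix) -> Matrix:
    """Summed-area table with an extra leading row and column of zeros."""
    rows, cols = len(img), len(img[0])
    ii = [[0.0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        row_sum = 0.0
        for c in range(cols):
            row_sum += img[r][c]
            ii[r + 1][c + 1] = ii[r][c + 1] + row_sum
    return ii


def rect_sum(ii: Matrix, top: int, left: int, height: int, width: int) -> float:
    """Sum of the image over a rectangle, using four table lookups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def haar_two_rect_vertical(ii: Matrix, top: int, left: int, height: int, width: int) -> float:
    """Upper half minus lower half (a contrast along the frame/time axis)."""
    half = height // 2  # assumes an even height
    return rect_sum(ii, top, left, half, width) - rect_sum(ii, top + half, left, half, width)


def haar_two_rect_horizontal(ii: Matrix, top: int, left: int, height: int, width: int) -> float:
    """Left half minus right half (a contrast along the coefficient axis)."""
    half = width // 2  # assumes an even width
    return rect_sum(ii, top, left, height, half) - rect_sum(ii, top, left + half, height, half)


# Toy data (hypothetical): 4 frames, each with 4 feature coefficients.
frames = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
image = stack_frames(frames, start=0, count=4)
ii = integral_image(image)

print(rect_sum(ii, 0, 0, 4, 4))                  # 136.0
print(haar_two_rect_vertical(ii, 0, 0, 4, 4))    # -64.0
print(haar_two_rect_horizontal(ii, 0, 0, 4, 4))  # -16.0
```

Because each rectangle sum costs four lookups regardless of its size, many Haar patterns at different positions and scales can be evaluated cheaply on one image. That is one plausible source of the speed-up the abstract reports, though the abstract itself does not state the mechanism.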