To improve the accuracy and speed of audio event detection in mass-data audio processing tasks, a generic method for rapidly recognizing audio events based on 2D-Haar acoustic super feature vectors and AdaBoost is proposed. First, the method combines a certain number of consecutive audio frames into an "acoustic feature image". Second, it uses the AdaBoost.MH or fast Random AdaBoost feature selection algorithm to select highly representative 2D-Haar pattern combinations and construct super feature vectors. Third, it analyzes the commonalities and differences between subcategories, then extracts the common features and reduces the differing features to obtain a generic audio event template, which supports accurate identification of multiple sub-classes as well as accurate detection and localization of a specific audio event in the audio stream. Experimental results show that 2D-Haar acoustic super feature vectors yield a recognition accuracy 5% higher than MFCC, PLP, LPCC and other traditional acoustic features, make training 7-20 times faster and recognition 5-10 times faster, and achieve an average precision of 93.38% and an average recall of 95.03% under the optimal parameter configuration found by the grid method. The method therefore provides an accurate and fast mass-data processing approach for audio event detection.
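The first two steps of the abstract above (stacking frames into an "acoustic feature image" and evaluating 2D-Haar patterns on it) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the row-per-frame layout, and the two-rectangle pattern shapes are assumptions made for illustration, and the AdaBoost selection step is omitted. The integral image (summed-area table) is a standard way to evaluate rectangle sums in constant time.

```python
def integral_image(img):
    """Summed-area table with an extra leading row and column of zeros."""
    rows, cols = len(img), len(img[0])
    ii = [[0.0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        row_sum = 0.0
        for c in range(cols):
            row_sum += img[r][c]
            ii[r + 1][c + 1] = ii[r][c + 1] + row_sum
    return ii


def rect_sum(ii, top, left, height, width):
    """Sum of img[top:top+height][left:left+width] using four lookups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def haar_left_right(ii, top, left, height, width):
    """Two-rectangle pattern: left half minus right half (width must be even)."""
    half = width // 2
    return (rect_sum(ii, top, left, height, half)
            - rect_sum(ii, top, left + half, height, half))


def haar_top_bottom(ii, top, left, height, width):
    """Two-rectangle pattern: top half minus bottom half (height must be even)."""
    half = height // 2
    return (rect_sum(ii, top, left, half, width)
            - rect_sum(ii, top + half, left, half, width))


# Assumed layout: one row per audio frame, one column per per-frame coefficient.
# These toy values stand in for real acoustic features.
acoustic_image = [
    [1.0, 2.0, 3.0, 4.0],
    [5.0, 6.0, 7.0, 8.0],
    [9.0, 10.0, 11.0, 12.0],
    [13.0, 14.0, 15.0, 16.0],
]
ii = integral_image(acoustic_image)
# A super feature vector would concatenate many such pattern responses.
super_vector = [
    haar_left_right(ii, 0, 0, 4, 4),
    haar_top_bottom(ii, 0, 0, 4, 4),
]
```

In a full system, a boosting algorithm would choose which pattern positions and sizes enter the super feature vector; here two patterns are hard-coded purely to show the computation.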
Low-resolution face images are found in many practical applications. For example, faces captured from surveillance videos are typically small. Existing face recognition deep networks, trained on high-resolution images, perform poorly on low-resolution faces. In this work, an improved multi-branch network is proposed that combines ResNet with feature super-resolution modules. ResNet recognizes high-resolution facial images and extracts features from both high- and low-resolution images. For low-resolution facial images, feature super-resolution modules are inserted before the ResNet classifier to increase the feature resolution. The proposed method is effective and simple. Experimental results show that recognition accuracy for high-resolution face images is high and that recognition accuracy for low-resolution face images is improved.
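The branch routing described above can be sketched as follows. This is only a toy illustration under stated assumptions: `upsample_nearest` is a hypothetical, non-learned stand-in for the paper's feature super-resolution module (which the abstract does not specify), and `forward` and its `classifier` argument are illustrative names rather than the authors' API.

```python
def upsample_nearest(feature_map, scale=2):
    """Nearest-neighbour upsampling of a 2D feature map.

    Assumption: stands in for a learned feature super-resolution module.
    """
    out = []
    for row in feature_map:
        wide = [value for value in row for _ in range(scale)]
        out.extend(list(wide) for _ in range(scale))
    return out


def forward(feature_map, is_low_resolution, classifier, scale=2):
    """Route low-resolution features through the upsampling branch first.

    High-resolution features go to the classifier unchanged.
    """
    if is_low_resolution:
        feature_map = upsample_nearest(feature_map, scale)
    return classifier(feature_map)


def shape_of(feature_map):
    """Hypothetical 'classifier' that just reports the feature-map shape."""
    return (len(feature_map), len(feature_map[0]))


low_res_features = [[1, 2], [3, 4]]
high_res_features = [[0] * 4 for _ in range(4)]
low_shape = forward(low_res_features, True, shape_of)
high_shape = forward(high_res_features, False, shape_of)
```

The point of the design is that both branches hand the classifier features of the same resolution, so a single classifier can serve high- and low-resolution inputs.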