190 articles found
A Graph-Based Semi-Supervised Approach for Few-Shot Class-Incremental Modulation Classification
1
Authors: Zhou Xiaoyu, Qi Peihan, Liu Qi, Ding Yuanlei, Zheng Shilian, Li Zan 《China Communications》 SCIE CSCD, 2024, No. 11, pp. 88-103 (16 pages)
With the successive application of deep learning (DL) in classification tasks, DL-based modulation classification has become the preferred method owing to its state-of-the-art performance. Nevertheless, once a DL recognition model is pre-trained with fixed classes, it tends to predict incorrect results when identifying incremental classes. Moreover, incremental classes usually emerge without label information, or with only a few labeled samples. In this context, we propose a graph-based semi-supervised approach to address the few-shot class-incremental (FSCI) modulation classification problem. The proposed method has two stages. First, a warm-up model is trained to classify old classes and incremental classes, where the unlabeled samples of incremental classes are uniformly assigned the same label to alleviate the class imbalance problem. Then the warm-up model serves as a feature extractor for constructing a similarity graph that connects labeled and unlabeled samples, and a label propagation algorithm propagates label information from labeled nodes to unlabeled nodes in the graph to recognize the incremental classes. Simulation results show that the proposed method outperforms other fine-tuning and retraining methods.
Keywords: deep learning; few-shot; label propagation; modulation classification; semi-supervised learning
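The second stage of this entry (a similarity graph plus label propagation) can be illustrated with a minimal sketch. It is not the authors' implementation: the Gaussian kernel, the clamped iterative update, and the names `rbf_affinity` and `propagate_labels` are assumptions chosen for illustration, and plain feature vectors stand in for the output of the warm-up model.

```python
import math

def rbf_affinity(features, sigma=1.0):
    """Build a symmetric similarity graph with a Gaussian kernel (no self-loops)."""
    n = len(features)
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                d2 = sum((a - b) ** 2 for a, b in zip(features[i], features[j]))
                w[i][j] = math.exp(-d2 / (2 * sigma ** 2))
    return w

def propagate_labels(w, labels, n_classes, n_iter=100):
    """labels[i] is a class index for labeled nodes and None for unlabeled ones."""
    n = len(w)
    # Row-normalise the affinities into transition probabilities.
    p = [[w[i][j] / sum(w[i]) for j in range(n)] for i in range(n)]
    # Labeled nodes start one-hot, unlabeled nodes start uniform.
    f = [[1.0 / n_classes] * n_classes for _ in range(n)]
    for i, y in enumerate(labels):
        if y is not None:
            f[i] = [1.0 if c == y else 0.0 for c in range(n_classes)]
    for _ in range(n_iter):
        new = [[sum(p[i][k] * f[k][c] for k in range(n)) for c in range(n_classes)]
               for i in range(n)]
        # Clamp the labeled nodes back to their known labels after each step.
        for i, y in enumerate(labels):
            if y is not None:
                new[i] = [1.0 if c == y else 0.0 for c in range(n_classes)]
        f = new
    return [max(range(n_classes), key=lambda c: f[i][c]) for i in range(n)]

# Two well-separated clusters, one labeled sample in each (hypothetical data).
feats = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]]
known = [0, None, None, 1, None, None]
predicted = propagate_labels(rbf_affinity(feats), known, n_classes=2)
```

Each unlabeled node ends up with the label of the cluster it is most strongly connected to, which is the behavior the abstract relies on for recognizing incremental classes from a few labeled samples.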
A Few-Shot Learning-Based Automatic Modulation Classification Method for Internet of Things
2
Authors: Aer Sileng, Qi Chenhao 《China Communications》 SCIE CSCD, 2024, No. 8, pp. 18-29 (12 pages)
Because Internet of Things (IoT) devices have limited computational capability and operate in diverse environments, we consider few-shot learning-based automatic modulation classification (AMC) to improve its reliability. A data enhancement module (DEM) is built from a convolutional layer to supplement frequency-domain information and to provide a nonlinear mapping that benefits AMC. A multimodal network is designed with multiple residual blocks, each containing convolutional kernels of different sizes for diverse feature extraction. Moreover, a deep supervised loss function supervises all parts of the network, including the hidden layers and the DEM. Since different models may output different results, a cooperative classifier is designed to avoid the randomness of a single model and improve reliability. Simulation results show that this few-shot learning-based AMC method significantly improves AMC accuracy compared with existing methods.
Keywords: automatic modulation classification (AMC); deep learning (DL); few-shot learning; Internet of Things (IoT)
Fast Detection and Classification of Dangerous Urban Sounds Using Deep Learning
3
Authors: Zeinel Momynkulov, Zhandos Dosbayev, Azizah Suliman, Bayan Abduraimova, Nurzhigit Smailov, Maigul Zhekambayeva, Dusmat Zhamangarin 《Computers, Materials & Continua》 SCIE EI, 2023, No. 4, pp. 2191-2208 (18 pages)
Video analytics is an integral part of surveillance cameras. Compared to video analytics, audio analytics offers several benefits, including less expensive equipment and lower upkeep expenses. Additionally, the volume of the audio data stream is substantially lower than that of the video camera data stream, especially in real-time operating systems, which makes it less demanding of the data channel's bandwidth. For instance, automatic live video streaming from the site of an explosion or gunshot to the police console using audio analytics technologies would be exceedingly helpful for urban surveillance. Technologies for audio analytics may also be used to analyze video recordings and identify occurrences. This research proposes a deep learning model based on the combination of a convolutional neural network (CNN) and a recurrent neural network (RNN), known as the CNN-RNN approach. The proposed model focuses on automatically identifying impulsive sounds that indicate critical situations in audio sources. The algorithm's accuracy ranged from 81% to 95% when classifying noises from incidents, including gunshots, explosions, shattered glass, sirens, cries, and dog barking. The proposed approach can be applied to provide security for citizens in open and closed locations such as stadiums, underground areas, shopping malls, and other places.
Keywords: deep learning; urban sounds; CNN; RNN; classification; impulsive sounds
Deep Learning-based Environmental Sound Classification Using Feature Fusion and Data Enhancement
4
Authors: Rashid Jahangir, Muhammad Asif Nauman, Roobaea Alroobaea, Jasem Almotiri, Muhammad Mohsin Malik, Sabah M. Alzahrani 《Computers, Materials & Continua》 SCIE EI, 2023, No. 1, pp. 1069-1091 (23 pages)
Environmental sound classification (ESC) involves distinguishing an audio stream associated with numerous environmental sounds. Common aspects such as the framework difference, overlapping sound events, and the presence of various sound sources during recording make the ESC task considerably more complex. This research proposes a deep learning model to improve the recognition rate of environmental sounds and reduce model training time under limited computation resources. The performance of transformer and convolutional neural network (CNN) models is investigated. Seven audio features (chromagram, Mel-spectrogram, tonnetz, Mel-frequency cepstral coefficients (MFCCs), delta MFCCs, delta-delta MFCCs, and spectral contrast) are extracted from the UrbanSound8K, ESC-50, and ESC-10 databases. Moreover, three data enhancement methods, namely white noise, pitch tuning, and time stretch, are employed to reduce the risk of overfitting caused by the limited number of audio clips. The evaluation of various experiments demonstrates that the best performance was achieved by the proposed transformer model using seven audio features on the enhanced database. For UrbanSound8K, ESC-50, and ESC-10, the highest attained accuracies are 0.98, 0.94, and 0.97, respectively. The experimental results reveal that the proposed technique can achieve the best performance for ESC problems.
Keywords: environmental sound classification; convolutional neural network; deep learning; transformer; data augmentation
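Of the three data enhancement methods named in this entry, white-noise injection is the simplest to sketch. The example below is an illustration under stated assumptions, not the authors' pipeline: the function name `add_white_noise`, the signal-to-noise-ratio parameter `snr_db`, and the seeded generator are hypothetical, and pitch tuning and time stretch are not covered.

```python
import math
import random

def add_white_noise(signal, snr_db, seed=0):
    """Return a copy of `signal` with Gaussian white noise at the requested SNR (dB)."""
    rng = random.Random(seed)
    noise = [rng.gauss(0.0, 1.0) for _ in signal]
    signal_power = sum(x * x for x in signal) / len(signal)
    noise_power = sum(n * n for n in noise) / len(noise)
    # Scale the noise so that signal_power / scaled_noise_power matches the target SNR.
    target_noise_power = signal_power / (10 ** (snr_db / 10))
    scale = math.sqrt(target_noise_power / noise_power)
    return [x + scale * n for x, n in zip(signal, noise)]

# A one-second 440 Hz tone at an assumed 8 kHz sampling rate, augmented at 20 dB SNR.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 440 * t / sample_rate) for t in range(sample_rate)]
noisy = add_white_noise(tone, snr_db=20)
```

Because the noise is rescaled to an exact power, every augmented clip has the same SNR, so the augmentation strength is controlled by a single parameter.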
Leveraging on few-shot learning for tire pattern classification in forensics
5
Authors: Lijun Jiang, Syed Ariff Syed Hesham, Keng Pang Lim, Changyun Wen 《Journal of Automation and Intelligence》, 2023, No. 3, pp. 146-151 (6 pages)
This paper presents a novel approach for tire-pattern classification, aimed at forensic analysis of tire marks discovered at crime scenes. The proposed classification model accounts for the intricate and dynamic nature of tire prints found in real-world scenarios, including accident sites. To address this complexity, the classifier harnesses the meta-learning (learning-to-learn) capabilities of few-shot learning algorithms. The model is designed and optimized to classify both tire patterns exhibited on wheels and tire-indentation marks left on surfaces by friction. This is achieved by employing a semantic segmentation model to extract the tire pattern marks within the image. These marks are then used as a mask channel, combined with the original image, and fed into the classifier. Overall, the proposed model follows a three-step process: (i) the Bilateral Segmentation Network derives the semantic segmentation of the tire pattern within a given image; (ii) using the semantic image in conjunction with the original image, the model learns and clusters groups to generate vectors that define the relative position of the image in the test set; (iii) the model performs predictions based on these learned features. Empirical verification demonstrates that using the semantic model to extract the tire patterns before classification increases the overall classification accuracy by approximately 4%.
Keywords: meta-learning; few-shot classification; semantic segmentation
Comparative Analysis of Different Sampling Rates on Environmental Sound Classification Using the Urbansound8k Dataset
6
Author: Ibrahim Aljubayri 《Journal of Computer and Communications》, 2023, No. 6, pp. 19-27 (9 pages)
Environmental sound classification (ESC) has gained increasing attention in recent years. This study evaluates the popular public dataset Urbansound8k (Us8k) at different sampling rates using hand-crafted features. The Us8k dataset contains environmental sounds recorded at various sampling rates, and previous ESC works have resampled the dataset to a uniform rate, for differing reasons. Some converted the rest of the dataset to 44,100 Hz, as the majority of the Us8k files were already at that sampling rate. Others downsampled the dataset to 8000 Hz to reduce computational complexity, while still others resampled it to 16,000 Hz, aiming to balance higher classification accuracy against lower computational complexity. In this research, we assessed the performance of ESC tasks at sampling rates of 8000 Hz, 16,000 Hz, and 44,100 Hz by extracting the hand-crafted features Mel frequency cepstral coefficients (MFCC), gammatone cepstral coefficients (GTCC), and Mel spectrogram (MelSpec). The results indicated no significant difference in classification accuracy among the three tested sampling rates.
Keywords: deep learning; convolutional neural network; environmental sound classification
Multi-attention fusion and weighted class representation for few-shot classification
7
Authors: ZHAO Wencang, QIN Wenqian, LI Ming 《High Technology Letters》 EI CAS, 2022, No. 3, pp. 295-306 (12 pages)
Existing few-shot learning (FSL) approaches based on metric learning usually pay little attention to the differing contributions of features, and often ignore the importance of each sample when obtaining the class representation, which limits model performance. The similarity metric method also deserves attention. Therefore, a few-shot learning approach called MWNet, based on multi-attention fusion and weighted class representation (WCR), is proposed in this paper. First, a multi-attention fusion module is introduced into the model to highlight the valuable part of the feature and reduce the interference of irrelevant content. Then, when obtaining the class representation, a weight is assigned to each support-set sample, and the weighted class representation is used to better express the class. Moreover, a mutual similarity metric method is used to obtain a more accurate similarity relationship through the mutual similarity of each representation. Experiments show that the approach performs well in few-shot image classification and is competitive with related advanced techniques.
Keywords: few-shot learning (FSL); image classification; metric learning; multi-attention fusion
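The idea of a weighted class representation can be sketched as a weighted average of support-set embeddings. The abstract does not state how MWNet computes its weights, so the rule used here (a softmax over each sample's cosine similarity to the plain mean) is purely an assumption for illustration, as are the names `cosine` and `weighted_prototype`.

```python
import math

def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def weighted_prototype(support):
    """Return (prototype, weights) for one class from its support embeddings."""
    dim = len(support[0])
    mean = [sum(v[d] for v in support) / len(support) for d in range(dim)]
    # Assumed weighting rule: samples closer to the plain mean get larger weights.
    sims = [cosine(v, mean) for v in support]
    top = max(sims)
    exps = [math.exp(s - top) for s in sims]
    total = sum(exps)
    weights = [e / total for e in exps]
    prototype = [sum(w * v[d] for w, v in zip(weights, support)) for d in range(dim)]
    return prototype, weights

# Three hypothetical 2-D support embeddings; the third is an outlier.
support_set = [[1.0, 0.0], [1.0, 0.1], [0.0, 1.0]]
prototype, weights = weighted_prototype(support_set)
```

Compared with the unweighted mean used by prototype-style methods, the outlier contributes less to the class representation, which is the effect the abstract attributes to weighting support samples.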
SW-Net: A novel few-shot learning approach for disease subtype prediction
8
Authors: YUHAN JI, YONG LIANG, ZIYI YANG, NING AI 《BIOCELL》 SCIE, 2023, No. 3, pp. 569-579 (11 pages)
Few-shot learning is becoming increasingly popular in many fields, especially computer vision. This inspires us to introduce few-shot learning to the genomic field, which faces a typical few-shot problem because some tasks have only a limited number of high-dimensional samples. The goal of this study was to investigate the few-shot disease subtype prediction problem and identify patient subgroups through training on small data. Accurate disease subtype classification allows clinicians to efficiently deliver investigations and interventions in clinical practice. We propose SW-Net, which simulates the clinical process of extracting shared knowledge from a range of interrelated tasks and generalizing it to unseen data. Our model is built upon a simple baseline, which we modified for genomic data. Support-based initialization for the classifier and transductive fine-tuning were applied to improve prediction accuracy, and an entropy regularization term on the query set was added to reduce overfitting. Moreover, to address the high dimensionality and high noise of the data, we further added a feature selection module that adaptively selects important features and a sample weighting module that prioritizes high-confidence samples. Experiments on simulated data and The Cancer Genome Atlas meta-dataset show that our new baseline model achieves higher prediction accuracy than competing algorithms.
Keywords: few-shot learning; disease subtype classification; feature selection; deep learning; meta-learning
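The entropy regularization term on the query set mentioned in this entry can be illustrated as the mean Shannon entropy of the softmax predictions for unlabeled query samples. This is a minimal sketch; how the term is weighted and combined with the main loss in SW-Net is not given in the abstract, and the function names are hypothetical.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    top = max(logits)
    exps = [math.exp(z - top) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def mean_query_entropy(query_logits):
    """Average Shannon entropy (in nats) of the predictions on the query set."""
    total = 0.0
    for logits in query_logits:
        probs = softmax(logits)
        total += -sum(p * math.log(p) for p in probs if p > 0)
    return total / len(query_logits)

# Uncertain predictions have high entropy, confident ones have low entropy.
uncertain = mean_query_entropy([[0.0, 0.0]])
confident = mean_query_entropy([[10.0, 0.0]])
```

Adding such a term to the training objective penalizes uncertain predictions on the query set, which is the usual motivation for entropy minimization in transductive fine-tuning.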
Intelligent Sound-Based Early Fault Detection System for Vehicles
9
Authors: Fawad Nasim, Sohail Masood, Arfan Jaffar, Usman Ahmad, Muhammad Rashid 《Computer Systems Science & Engineering》 SCIE EI, 2023, No. 9, pp. 3175-3190 (16 pages)
An intelligent sound-based early fault detection system using machine learning is proposed for vehicles. The system is designed to detect faults at an early stage by analyzing the sound emitted by the car. Early detection and correction of defects can improve the efficiency and life of the engine and other mechanical parts. The system uses a microphone to capture the sound emitted by the vehicle and a machine-learning algorithm to analyze the sound and detect faults. A possible fault in the vehicle is determined from this processed sound. Binary classification is performed at the first stage to differentiate between faulty and healthy cars. We collected noisy and normal sound samples of the car engine under normal and various abnormal conditions from multiple workshops and had the data verified by experts. We used time-domain, frequency-domain, and time-frequency-domain features to detect the normal and abnormal conditions of the vehicle. The abnormal car data were then classified into fifteen typical vehicle problems. We experimented with various signal processing techniques and present the comparison results. Random forest achieved the highest results, 97% in detection and 92% in further problem classification, with time-frequency features.
Keywords: sound classification; signal processing; random forest; random tree; time-frequency domain; J48
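Loudspeaker Abnormal Sound Classification Based on a Convolutional Neural Network with an Attention Mechanism (cited by: 1)
10
Authors: 周静雷, 王晓明, 李丽敏 《西安工程大学学报》 CAS, 2024, No. 2, pp. 101-108 (8 pages)
Loudspeaker abnormal sounds are nonlinear, non-stationary, and susceptible to external noise, and feature redundancy leads to a low recognition rate. To address these problems, a loudspeaker abnormal sound classification method is proposed that combines variational mode decomposition (VMD) with a one-dimensional convolutional recurrent attention network (1DCNN-BiLSTM-Attention). First, abnormal sound signals of different types are collected, VMD is used to decompose them and extract abnormal sound features, and labeled initial data are constructed. Second, the feature data are fed into the 1DCNN-BiLSTM network for initial feature extraction; an attention mechanism adaptively optimizes the network's learning weights for abnormal sound features to improve its feature discrimination, and Dropout is tuned to suppress overfitting during training, forming the 1DCNN-BiLSTM-Attention classification network. Finally, the proposed method is applied to loudspeaker abnormal sound classification. Experimental results show that the method effectively extracts the key features of loudspeaker abnormal sounds, with an average classification accuracy of 99.17%, which is 13.14%, 0.56%, and 12.34% higher than VGG16, RF, and DCNN, respectively.
Keywords: abnormal sound classification; variational mode decomposition; convolutional neural network; attention mechanism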
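Multibeam Seabed Sediment Classification Based on Multi-Dimensional Acoustic Feature Selection
11
Authors: 宋佰万, 付明生, 崔晓东, 牛冲 《海洋通报》 CAS CSCD PKU Core, 2024, No. 2, pp. 198-209 (12 pages)
Accurate information on the distribution of surficial seabed sediments plays an important role in building basic marine geographic databases. Multibeam sonar is currently one of the effective means of large-scale seabed sediment classification, and acoustic features derived from multibeam bathymetry and backscatter intensity data are widely used in sediment classification modeling. However, as the feature dimension grows, irrelevant and redundant features in the feature space seriously degrade classification accuracy. To quantitatively evaluate how well acoustic features characterize sediment classes, and to eliminate the interference of ineffective features with classification results, this paper proposes a seabed sediment classification method based on multi-dimensional acoustic feature selection. First, the multi-dimensional features are ranked and selected with reference to the physical properties of actual sediment samples, and redundant and irrelevant features are excluded. Second, support vector machine, random forest, and deep belief network models are each used to build supervised seabed sediment classifiers. Experiments with multibeam survey data and field sampling information from the southern Irish Sea show that the proposed method achieves an overall classification accuracy of up to 86.20% and a Kappa coefficient of up to 0.834, a clear improvement over principal component analysis and entropy-index feature selection, highlighting the method's potential for seabed sediment detection and mapping.
Keywords: seabed sediment classification; multibeam bathymetric system; feature selection; backscatter intensity; seabed topography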
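The two scores reported in this entry, overall accuracy and the Kappa coefficient, are both computed from a confusion matrix. The sketch below shows the standard Cohen's kappa calculation; the matrix values are made up for illustration and are unrelated to the paper's data.

```python
def overall_accuracy_and_kappa(confusion):
    """confusion[i][j] counts samples of true class i predicted as class j."""
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    # Observed agreement: fraction of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(n_classes)) / total
    # Expected agreement by chance, from the row and column marginals.
    expected = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion)
        for i in range(n_classes)
    ) / (total * total)
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa

# Hypothetical two-class confusion matrix.
example_matrix = [[45, 5], [10, 40]]
accuracy, kappa = overall_accuracy_and_kappa(example_matrix)
```

Kappa discounts the agreement expected by chance, so it is lower than the raw accuracy whenever chance agreement is non-zero, which is why both figures are commonly reported together.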
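A Deep Learning Method for Environmental Sound Classification Combining MGCC Features and Multi-Scale Channel Attention
12
Authors: 杨俊杰, 丁家辉, 杨柳, 冯丽, 杨超 《应用声学》 CSCD PKU Core, 2024, No. 3, pp. 513-524 (12 pages)
Environmental sound classification plays a key role in fields such as home security monitoring and human-machine voice interaction. However, the diversity and mixing of sound sources pose major challenges to the design of classification methods. To improve classification accuracy and save computing resources, this paper proposes a deep learning classification model based on a multi-scale channel attention mechanism. The model consists of four parts: a feature extraction module, a multi-scale convolution module, an efficient channel attention module, and an output layer. First, weighted Mel-Gammatone frequency cepstral coefficients (MGCC) are introduced to capture the magnitude and phase structure of the environmental sound spectrum. Second, multi-scale convolution kernels are fused with an efficient channel attention mechanism to select key local details and channel features of the audio. Finally, a softmax function in the fully connected layer maps the features to class probabilities. The model was tested on the iFLYTEK dataset (6 classes), the Urbansound8k dataset (10 classes), and their combination (iFLYTEK+Urbansound8k), achieving classification accuracies of 94%, 76.52%, and 79.24%, respectively. Ablation experiments further show that the multi-scale convolution module and the channel attention module contribute accuracy gains of close to 3.77% and 1.89%, respectively. The experiments also compare seven existing deep learning classification methods in detail, among which the proposed algorithm ranks second in classification accuracy; in addition, compared with algorithms of the same class such as ResNet18 and GoogLeNet, it further reduces the number of model parameters and the computational complexity.
Keywords: environmental sound classification; Mel-Gammatone frequency cepstrum; multi-scale kernel convolution; efficient channel attention; convolutional neural network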
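An Environmental Sound Classification Model Fusing Swin Transformer and CNN
13
Authors: 朱振飞, 葛动元, 姚锡凡, 苏瑞轩 《科学技术与工程》 PKU Core, 2024, No. 28, pp. 12259-12267 (9 pages)
Environmental sound classification has become an important task in computer audition. It can complement computer vision, help devices better understand their environment and user needs, and has broad application prospects. In recent years, Transformer models with self-attention have been adopted for environmental sound classification; however, existing models require large amounts of memory and rely on pre-trained vision models, so they do not extract audio features well. To solve these problems and improve classification accuracy, a new dual-branch environmental sound classification model, Swin Conformer, is proposed. It combines a convolutional neural network with the Swin Transformer, which uses windowed self-attention, fuses the features of the two branches interactively, and introduces a token semantic module. The results show that the Swin Conformer model achieves classification accuracies of 98.1% and 96.8% on the public ESC-50 and UrbanSound8K datasets, respectively. This is higher than existing models and demonstrates the feasibility and advantage of the model for environmental sound classification.
Keywords: environmental sound classification; data augmentation; Transformer; self-attention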
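Environmental Sound Classification Based on a Parallel Neural Network
14
Authors: 覃镜涛, 高瑜翔 《传感器与微系统》 CSCD PKU Core, 2024, No. 7, pp. 106-109, 113 (5 pages)
To address the low accuracy of traditional single-input models in environmental sound classification, a parallel feature-fusion neural network based on time-domain and frequency-domain features is proposed. In this network, the raw audio is first processed with data augmentation. The processed raw audio and the Mel spectrogram features are then fed into a raw-waveform network and a Mel-spectrogram network, respectively, and the resulting time-domain and spectral features are fused. Finally, the fused features are passed to a SoftMax classifier. Experiments on the UrbanSound8K dataset yield a final classification accuracy of 96.03%, outperforming the other models compared.
Keywords: parallel neural network; feature fusion; environmental sound classification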
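A DNN Compression Method for Environmental Sound Classification on Microcontrollers
15
Authors: 孟娜, 方维维, 路红英 《计算机与现代化》, 2024, No. 1, pp. 80-86 (7 pages)
Environmental sound classification (ESC) is one of the most important topics in non-speech audio classification. In recent years, deep neural network (DNN) methods have made considerable progress in ESC. However, DNNs are computation- and storage-intensive and cannot be deployed directly on Internet of Things devices based on microcontroller units (MCUs). To address this problem, this paper proposes a DNN compression method for highly resource-constrained devices. Because the DNN model has too many parameters to deploy directly, pruning is used to compress it substantially, and a knowledge distillation method based on the feature information of the model's intermediate layers is designed to recover the accuracy lost to pruning. Tests on an STM32F746ZG device with the public UrbanSound8K and ESC-50 datasets show that the method achieves a compression rate of up to 97% while maintaining good inference accuracy and speed.
Keywords: environmental sound classification; edge computing; microcontroller; pruning; knowledge distillation; quantization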
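Pruning, the compression step named in this entry, can be illustrated with unstructured magnitude pruning, which zeroes the smallest-magnitude weights. The abstract does not say which pruning criterion or granularity the authors use, so this is a generic sketch with a hypothetical function name and made-up weights, and it omits the knowledge distillation step.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the fraction `sparsity` of weights with the smallest magnitude."""
    n_drop = round(len(weights) * sparsity)
    if n_drop == 0:
        return list(weights)
    # Indices sorted from smallest to largest absolute value.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    dropped = set(order[:n_drop])
    return [0.0 if i in dropped else w for i, w in enumerate(weights)]

# Hypothetical flattened layer weights, pruned to 50% sparsity.
layer = [0.5, -0.01, 0.2, 0.03, -0.9, 0.002, 0.1, -0.04, 0.07, 0.6]
pruned = magnitude_prune(layer, sparsity=0.5)
```

On a microcontroller, the zeroed weights only save memory if the model is then stored in a sparse or otherwise compacted format, which is one reason pruning is typically paired with further steps such as the distillation described in the abstract.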
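A Real-Time Sound Source Direction-Finding System Based on Zynq and the Bee Evolutionary Genetic Algorithm
16
Authors: 陆智辉, 兰昀弢, 郑郁正, 刘凯, 唐国璇 《电子设计工程》, 2024, No. 1, pp. 164-169, 174 (7 pages)
Sound source direction-finding systems that use the multiple signal classification (MUSIC) algorithm are accurate but have poor real-time performance. To address this, a real-time sound source direction-finding system with cooperating software and hardware was designed on the Zynq platform using the bee evolutionary genetic algorithm (BEGA). The system uses a uniform circular array of MEMS microphones and Zynq to acquire and transmit sound source data, and introduces BEGA to speed up the MUSIC algorithm's spectral peak search. Experiments show that the hardware platform drops no frames and has low latency, while the data processing unit uses BEGA to greatly shorten the spectral peak search time and still delivers accurate direction finding. The system therefore meets real-time requirements while retaining the high accuracy of the MUSIC algorithm.
Keywords: sound source direction finding; MUSIC; Zynq; bee evolution; MEMS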
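An Environmental Sound Event Detection Method Based on Robust Texture Features
17
Authors: 吴婷, 刘琼, 郭慧茹 《电子器件》 CAS, 2024, No. 2, pp. 530-535 (6 pages)
To address the detection of environmental sound events of various classes, a detection method based on robust texture features is proposed. First, the raw sound samples are converted into gammatone-like spectrograms. Texture features are then extracted from the gammatone-like spectrograms with the shearlet transform and encoded with the centralized binary pattern (CBP) algorithm. To handle the excessively high feature dimension, an RF-PCA dimensionality reduction method is proposed that applies the random forest algorithm first and principal component analysis (PCA) second. Finally, a support vector machine (SVM) classifies the sounds of different environments. Simulation experiments on the public ESC-10 dataset show that the features extracted by the proposed method achieve a classification accuracy of 93.00%.
Keywords: environmental sound classification; gammatone-like spectrogram; shearlet transform; CBP algorithm; RF-PCA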
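Research on a Neural-Network-Based Environmental Sound Classification Method
18
Author: 徐圣林 《电声技术》, 2024, No. 10, pp. 54-56 (3 pages)
This paper studies an environmental sound classification method that combines a convolutional neural network (CNN) with a gated recurrent unit (GRU). First, the basic structure of the CNN-GRU model is analyzed. Second, the mathematical principles by which the model classifies environmental sounds are discussed. Finally, the proposed method is tested on the ESC-50 dataset on the MATLAB platform. Experimental results show that the CNN-GRU model reaches an accuracy of 0.92, a precision of 0.91, a recall of 0.89, and an F1 score of 0.90, verifying the effectiveness and robustness of the model for environmental sound classification.
Keywords: convolutional neural network (CNN); gated recurrent unit (GRU); sound classification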
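The four scores reported in this entry (accuracy, precision, recall, and F1) can be computed from predicted and true labels as sketched below. The abstract does not say how the per-class scores are averaged over the classes of ESC-50, so macro averaging is an assumption here, and the labels are made up.

```python
def macro_metrics(y_true, y_pred):
    """Return (accuracy, macro precision, macro recall, macro F1)."""
    pairs = list(zip(y_true, y_pred))
    classes = sorted(set(y_true) | set(y_pred))
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precisions, recalls, f1s = [], [], []
    for c in classes:
        tp = sum(t == c and p == c for t, p in pairs)
        fp = sum(t != c and p == c for t, p in pairs)
        fn = sum(t == c and p != c for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(classes)
    return accuracy, sum(precisions) / n, sum(recalls) / n, sum(f1s) / n

# Hypothetical labels for a two-class problem.
acc, prec, rec, f1 = macro_metrics([0, 0, 1, 1], [0, 1, 1, 1])
```

Macro averaging gives every class the same weight regardless of how many samples it has, which suits a balanced dataset; a weighted or micro average would be an equally valid reading of the reported numbers.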
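A Heart Sound Classification Method Using a Recurrent Neural Network Optimized by a Genetic Algorithm and Rectified Adaptive Moment Estimation
19
Authors: 吴全玉, 刘美君, 范家琪, 潘玲佼, 陶为戈 《南京理工大学学报》 CAS CSCD PKU Core, 2024, No. 2, pp. 202-208, 226 (8 pages)
Traditional recurrent neural networks (RNNs) suffer from exploding gradients, vanishing gradients, and short-term memory when recognizing and classifying heart sound signals. To address this, the paper proposes a heart sound classification model that requires no heart sound segmentation and optimizes an RNN with a genetic algorithm (GA) combined with rectified adaptive moment estimation (RAdam). The advantage of the model is that the GA and the RAdam optimizer are integrated into the RNN in series to improve it. First, the selection, mutation, and inheritance operations of the GA are used to optimize the number of input-layer nodes of the RNN and obtain the initial solution of the best individual for the heart sound feature vector. Second, the model's initial weights and thresholds are set from the weight and bias matrices of the best individual, giving an optimal initial-weight solution, and the parameters are shared across the whole model. Finally, the RNN model is optimized with the improved adaptive-learning-rate optimization algorithm. The results show that, with heart sound feature vectors extracted by the classic Mel-frequency cepstral coefficient method, the classification accuracy reaches 90.29%, which is 17.79% higher than that of the unoptimized RNN model.
Keywords: genetic algorithm; adaptive moment estimation; recurrent neural network; heart sound classification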
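Audio Classification of Water Supply Pipeline Leakage Based on the MobileNetV3 Convolutional Neural Network
20
Authors: 陈双叶, 徐雷桁, 黄成意, 张智武, 张林, 韩默 《北京工业大学学报》 CAS CSCD PKU Core, 2024, No. 7, pp. 797-804 (8 pages)
To accurately identify leakage sounds in urban water supply networks, a MobileNetV3-based method for classifying water supply pipeline leakage audio is proposed. First, offline data augmentation is applied to the audio files in the ROPP dataset, the leakage signals are converted into log-Mel spectrograms, and spectral subtraction is used for denoising. Then an attention mechanism module and the MobileNetV3 network are trained to recognize and extract image features. Finally, a Softmax function classifies the leakage audio. Experimental results show that the method achieves a precision of 99.40% and a recall of 99.20% for the leakage class.
Keywords: sound event classification; water pipe leak detection; MobileNetV3; data augmentation; spectral subtraction; squeeze-and-excitation network module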