An Audio Recognition Method Based on SOM and Spiking Neural Network
Abstract: In recent years, artificial neural networks have driven steady improvements in audio classification. However, conventional artificial neural networks suffer from high computational power consumption and have difficulty processing time-domain signals. Spiking neural networks (SNNs) are event-driven, which gives them low power consumption, interpretability, and strong time-domain processing ability, making them well suited to audio signal processing. This paper proposes an audio recognition method based on SOM spatio-temporal feature sparse coding and supervised SNN classification. MFCC is first used for time-frequency domain conversion, and SOM then sparsely encodes the resulting time-series audio signal. Unlike other supervised learning methods based on error backpropagation, the method trains the weights with an STDP learning rule with integration and adopts an excitation-inhibition dual-supervised training scheme, which enables the SNN to effectively extract and analyze both the spatial and the temporal features of audio signals. The proposed method achieves a classification accuracy of 96.47% on the TIDIGITS spoken-digit audio dataset.
Authors: LONG Erhong (隆二红); WANG Gang (王刚); MO Lingfei (莫凌飞) (School of Instrument Science and Engineering, Southeast University, Nanjing, Jiangsu 210096, China)
Source: Chinese Journal of Sensors and Actuators (《传感技术学报》), 2024, No. 11, pp. 1885-1892 (8 pages). Indexed in CAS, CSCD, and the Peking University Core Journals list.
Funding: 2021 Jiangsu Province University "Qinglan Project" Outstanding Young Backbone Teachers Program.
Keywords: spiking neural network; audio recognition; SOM spatio-temporal feature sparse coding; excitation-inhibition dual-supervised training; low power consumption
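
The abstract describes using a SOM to turn a sequence of MFCC frames into a sparse spike code. The record does not give the authors' implementation, so the following is only a minimal sketch of one way such an encoding could work, under several assumptions: a 1-D SOM with 16 units, linearly decaying learning rate and neighborhood width, synthetic 13-dimensional frames standing in for real MFCC features, and one spike per frame emitted by the best-matching unit. All function names and hyperparameters here are hypothetical.

```python
import math
import random


def best_matching_unit(weights, x):
    """Index of the SOM unit whose weight vector is closest to frame x."""
    return min(
        range(len(weights)),
        key=lambda j: sum((wk - xk) ** 2 for wk, xk in zip(weights[j], x)),
    )


def train_som(frames, n_units=16, epochs=20, lr0=0.5, seed=0):
    """Train a 1-D SOM on feature frames (each frame is a list of floats)."""
    rng = random.Random(seed)
    dim = len(frames[0])
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(n_units)]
    sigma0 = n_units / 4.0  # assumed initial neighborhood width
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                    # decaying learning rate
        sigma = max(sigma0 * (1.0 - frac), 0.5)    # shrinking neighborhood
        for x in frames:
            bmu = best_matching_unit(weights, x)
            for j, w in enumerate(weights):
                # Gaussian neighborhood on the 1-D unit index
                h = math.exp(-((j - bmu) ** 2) / (2.0 * sigma ** 2))
                for k in range(dim):
                    w[k] += lr * h * (x[k] - w[k])
    return weights


def encode(weights, frames):
    """Sparse code: one (time_step, unit_index) spike per frame."""
    return [(t, best_matching_unit(weights, x)) for t, x in enumerate(frames)]


# Synthetic stand-in for MFCC frames: 10 frames near 0.0, then 10 near 3.0.
_rng = random.Random(1)
frames = [
    [_rng.gauss(mu, 0.1) for _ in range(13)]
    for mu in [0.0] * 10 + [3.0] * 10
]
weights = train_som(frames)
spikes = encode(weights, frames)
print(spikes)
```

Each frame activates exactly one of the 16 units, so the resulting spike train is sparse in space and keeps the frame order in time. A downstream SNN classifier, such as the STDP-trained network the abstract mentions, would consume these `(time_step, unit_index)` events; that stage is not sketched here.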