
Voice activity detection of VDR audio based on constant-Q transform and deep neural network (cited by: 2)
Abstract: Using real-world audio data recorded by voyage data recorders (VDR), this paper proposes a voice activity detection (VAD) method based on the constant-Q transform (CQT) magnitude spectrum and a deep neural network (DNN). To obtain a variable frequency resolution suited to different frequency bands, the CQT is used for spectral analysis of the VDR audio signal, and the DNN automatically learns complex feature representations from the CQT magnitude spectrum, giving end-to-end VAD on VDR audio data. Experiments on real VDR audio data verify the effectiveness of the method and show that it achieves high accuracy and robustness.
Authors: DU Han; ZHANG Wei-wei; ZHANG Qiao-ling; YAN Ling-yu (School of Information Science and Technology, Dalian Maritime University, Dalian 116026, China; School of Informatics and Electronics, Zhejiang Sci-Tech University, Hangzhou 310018, China)
Source: Journal of Dalian Maritime University (《大连海事大学学报》), indexed in CAS, CSCD, and the Peking University Core Journals list; 2022, No. 2, pp. 128-135 (8 pages)
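Funding: National Natural Science Foundation of China (61806178, 61972068); China Postdoctoral Science Foundation (2020M680932); Zhejiang Provincial Natural Science Foundation (LY21F010015); Fundamental Research Funds for the Central Universities (3132021226).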
Keywords: voyage data recorder (VDR); voice activity detection (VAD); constant-Q transform (CQT); deep neural network (DNN)
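
The abstract motivates the CQT by its frequency-dependent resolution: center frequencies are geometrically spaced and the analysis window shortens as frequency rises, so the ratio of center frequency to bandwidth (Q) stays constant. The sketch below is a minimal, naive illustration of that idea using the textbook direct-evaluation form of the transform. It is not the authors' implementation. The function name `cqt_magnitude`, the Hamming window, and all parameter values (8 kHz sampling rate, 100 Hz lowest bin, 12 bins per octave, 36 bins) are assumptions chosen for illustration, not values taken from the paper.

```python
import cmath
import math


def cqt_magnitude(frame, fs, fmin, n_bins, bins_per_octave=12):
    """Naive constant-Q magnitude spectrum of one frame (direct evaluation)."""
    # Constant quality factor: center frequency divided by bin bandwidth.
    q = 1.0 / (2.0 ** (1.0 / bins_per_octave) - 1.0)
    mags = []
    for k in range(n_bins):
        f_k = fmin * 2.0 ** (k / bins_per_octave)  # geometrically spaced centers
        n_k = math.ceil(q * fs / f_k)              # window shortens as f_k grows
        if n_k > len(frame):
            raise ValueError("frame too short for the lowest-frequency bin")
        acc = 0j
        for n in range(n_k):
            # Hamming window (an assumed choice, not specified by the source).
            w = 0.54 - 0.46 * math.cos(2.0 * math.pi * n / (n_k - 1))
            acc += w * frame[n] * cmath.exp(-2j * math.pi * q * n / n_k)
        mags.append(abs(acc) / n_k)
    return mags


# Illustrative check on a synthetic 200 Hz tone (all values are assumptions).
fs = 8000
tone = [math.sin(2.0 * math.pi * 200.0 * n / fs) for n in range(2000)]
spectrum = cqt_magnitude(tone, fs=fs, fmin=100.0, n_bins=36)
peak_bin = max(range(len(spectrum)), key=spectrum.__getitem__)
print(peak_bin, round(100.0 * 2.0 ** (peak_bin / 12), 1))
```

In a pipeline of the kind the abstract describes, magnitude vectors like `spectrum`, computed frame by frame, would serve as the input features to the classifier; the network architecture and training setup are not given in this record and are therefore left out of the sketch.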