Abstract
Using real audio data recorded by a shipborne voyage data recorder (VDR), this paper proposes a voice activity detection (VAD) method based on the constant-Q transform (CQT) magnitude spectrum and a deep neural network (DNN). To obtain a variable frequency resolution suited to different frequency bands, the CQT is used for spectral analysis of the VDR audio signal, and the DNN automatically learns complex feature representations from the CQT magnitude spectrum, enabling end-to-end VAD on VDR audio data. The method was validated on real VDR audio data, and the experimental results show that it achieves high accuracy and robustness.
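The pipeline described in the abstract (CQT magnitude spectrum fed frame by frame into a DNN) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract gives no sampling rate, CQT parameters, or network architecture, so `fmin`, `n_bins`, `bins_per_octave`, `hop`, the layer sizes, and both function names are hypothetical. The network weights are random and untrained, so the output values only show the data flow and are not meaningful detections.

```python
import numpy as np


def cqt_magnitude(signal, sr, fmin=100.0, n_bins=48, bins_per_octave=12, hop=512):
    """Naive constant-Q magnitude spectrogram, shape (n_bins, n_frames).

    All parameter values are illustrative assumptions, not taken from the paper.
    """
    q = 1.0 / (2.0 ** (1.0 / bins_per_octave) - 1.0)  # constant ratio f_k / bandwidth
    freqs = fmin * 2.0 ** (np.arange(n_bins) / bins_per_octave)  # geometric spacing
    lengths = np.ceil(q * sr / freqs).astype(int)  # longer windows at low frequencies
    max_len = int(lengths.max())
    # Zero-pad so every analysis window fits around each frame centre.
    padded = np.pad(signal, (max_len // 2, max_len // 2 + 1))
    n_frames = 1 + len(signal) // hop
    mag = np.zeros((n_bins, n_frames))
    for k, (f_k, n_k) in enumerate(zip(freqs, lengths)):
        n = np.arange(n_k)
        # Hann-windowed complex exponential at the bin's centre frequency.
        kernel = np.hanning(n_k) * np.exp(-2j * np.pi * f_k * n / sr) / n_k
        for t in range(n_frames):
            start = t * hop + max_len // 2 - n_k // 2
            mag[k, t] = np.abs(np.dot(padded[start:start + n_k], kernel))
    return mag, freqs


def dnn_speech_probability(features, weights):
    """Forward pass of a small MLP: ReLU hidden layers, sigmoid output per frame."""
    h = features
    for w, b in weights[:-1]:
        h = np.maximum(0.0, h @ w + b)
    w, b = weights[-1]
    logits = (h @ w + b)[:, 0]
    return 1.0 / (1.0 + np.exp(-logits))


# Demo on a synthetic 440 Hz tone (a stand-in for real VDR audio).
sr = 8000
tone = np.sin(2 * np.pi * 440.0 * np.arange(sr) / sr)
mag, freqs = cqt_magnitude(tone, sr)

# Log-compressed CQT magnitudes, one feature vector per frame.
features = np.log1p(mag).T

# Hypothetical 48-32-1 network with random (untrained) weights.
rng = np.random.default_rng(0)
sizes = [mag.shape[0], 32, 1]
weights = [(0.1 * rng.standard_normal((i, o)), np.zeros(o))
           for i, o in zip(sizes[:-1], sizes[1:])]
probs = dnn_speech_probability(features, weights)
print(probs.shape)  # one speech probability per frame
```

In a trained system the weights would be fitted on labelled speech and non-speech frames, and the per-frame probabilities would be thresholded to locate speech endpoints. The variable window length per bin is what gives the CQT finer frequency resolution at low frequencies and finer time resolution at high frequencies.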
Authors
杜晗 (DU Han)
张维维 (ZHANG Wei-wei)
张巧灵 (ZHANG Qiao-ling)
闫凌宇 (YAN Ling-yu)
Affiliations: School of Information Science and Technology, Dalian Maritime University, Dalian 116026, China; School of Informatics and Electronics, Zhejiang Sci-Tech University, Hangzhou 310018, China
Source
《大连海事大学学报》 (Journal of Dalian Maritime University)
CAS
CSCD
Peking University Core Journals (北大核心)
2022, No. 2, pp. 128-135 (8 pages)
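Funding
National Natural Science Foundation of China (61806178, 61972068)
China Postdoctoral Science Foundation (2020M680932)
Zhejiang Provincial Natural Science Foundation of China (LY21F010015)
Fundamental Research Funds for the Central Universities (3132021226)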