
A Dynamic-Static Dual-Input Deep Neural Network Algorithm for Diagnosing COVID-19 by Cough (cited by: 2)
Abstract: Coronavirus disease 2019 (COVID-19) has had a serious impact worldwide, and researchers have done extensive work on epidemic prevention and control. Diagnosing COVID-19 from cough sounds, by using the cough to identify the affected site, is contactless, low-cost, and based on data that is easy to obtain, yet such research remains relatively scarce in China. Mel frequency cepstral coefficient (MFCC) features represent only the static characteristics of a sound, whereas first-order differential MFCC features also reflect its dynamic characteristics. To better prevent and treat COVID-19, this paper proposes a cough-based COVID-19 diagnosis algorithm built on a dual-input neural network that takes both static and dynamic features. Using the Coswara dataset, the cough audio is clipped, MFCC and first-order differential MFCC features are extracted, and a dynamic-static dual-input neural network model is trained. The model uses a statistics pooling layer, so it accepts MFCC features of different lengths. Experimental results show that, compared with existing models, the proposed algorithm markedly improves recognition accuracy, recall, specificity, and F1-score.
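The two ideas named in the abstract, first-order differential (delta) features and statistics pooling over variable-length input, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the regression-style delta formula, the window `width`, the 13-coefficient feature size, the function names, and pooling the features directly (rather than pooling the output of the network's earlier layers) are all assumptions, and random arrays stand in for real MFCCs.

```python
import numpy as np


def first_order_delta(features: np.ndarray, width: int = 2) -> np.ndarray:
    """Regression-style first-order delta along the time axis.

    features has shape (n_coeffs, n_frames). The formula and the default
    width are assumptions; the abstract does not state which delta
    definition the paper uses.
    """
    n_frames = features.shape[1]
    # Repeat the edge frames so every frame has `width` neighbours per side.
    padded = np.pad(features, ((0, 0), (width, width)), mode="edge")
    denom = 2.0 * sum(n * n for n in range(1, width + 1))
    delta = np.zeros(features.shape, dtype=float)
    for n in range(1, width + 1):
        ahead = padded[:, width + n : width + n + n_frames]
        behind = padded[:, width - n : width - n + n_frames]
        delta += n * (ahead - behind)
    return delta / denom


def statistics_pooling(features: np.ndarray) -> np.ndarray:
    """Collapse the time axis into per-coefficient mean and standard deviation.

    The output length is 2 * n_coeffs regardless of n_frames, which is why
    clips of different durations can share one downstream classifier.
    """
    return np.concatenate([features.mean(axis=1), features.std(axis=1)])


# Stand-ins for MFCC matrices of two cough clips with different durations.
rng = np.random.default_rng(0)
short_clip = rng.normal(size=(13, 40))
long_clip = rng.normal(size=(13, 95))

for clip in (short_clip, long_clip):
    static_vec = statistics_pooling(clip)                      # static branch
    dynamic_vec = statistics_pooling(first_order_delta(clip))  # dynamic branch
    print(static_vec.shape, dynamic_vec.shape)  # (26,) (26,) for both clips
```

In a dual-input design of the kind the abstract describes, the static and dynamic branches would each be processed by their own layers before being merged; the sketch only shows why the pooled representation has a fixed size.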
Authors: ZHANG Yong-mei (张永梅); SUN Jie (孙捷) (School of Information Science and Technology, North China University of Technology, Beijing 100144, China)
Source: Acta Electronica Sinica (《电子学报》; indexed in EI, CAS, CSCD, and the Peking University Core Journals list), 2023, No. 1, pp. 202-212 (11 pages)
Funding: National Key Research and Development Program of China (No. 2020YFC0811004).
Keywords: deep learning; cough; COVID-19; Mel frequency cepstral coefficients; audio technology; convolutional neural network (CNN)
