摘要
新型冠状病毒肺炎(COVID-19)已经在世界范围内造成了严重影响,在防控疫情方面学者们进行了大量研究.利用咳嗽声判断病变部位来诊断新冠肺炎具有非接触、成本低、易获取等优点,但是此类研究在国内较为匮乏.梅尔倒谱系数(Mel Frequency Cepstral Coefficients,MFCC)特征仅能够表示声音的静态特征,而一阶差分MFCC特征还能反应声音的动态特征.为了更好地防治新冠肺炎,本文提出了基于动静态特征双输入神经网络的咳嗽声诊断新冠肺炎算法,通过咳嗽声诊断新冠肺炎.在Coswara数据集基础上,对咳嗽声的音频进行裁剪,提取MFCC和一阶差分MFCC特征训练了一个动静态特征双输入神经网络模型.本文模型采用统计池化层,可以输入不同长度的MFCC特征.实验结果表明,与现有模型相比较,本文算法明显提升了识别准确率、召回率、特异性和F1值.
The COVID-19(corona virus disease 2019) has caused serious impacts worldwide. Many scholars have done a lot of research on the prevention and control of the epidemic. The diagnosis of COVID-19 by cough is non-contact,low-cost, and easy-access, however, such research is still relatively scarce in China. Mel frequency cepstral coefficients(MFCC) feature can only represent the static sound feature, while the first-order differential MFCC feature can also reflect the dynamic feature of sound. In order to better prevent and treat COVID-19, the paper proposes a dynamic-static dual input deep neural network algorithm for diagnosing COVID-19 by cough. Based on Coswara dataset, cough audio is clipped, MFCC and first-order differential MFCC features are extracted, and a dynamic and static feature dual-input neural network model is trained. The model adopts a statistic pooling layer so that different length of MFCC features can be input. The experiment results show the proposed algorithm can significantly improve the recognition accuracy, recall rate, specificity,and F1-score compared with the existing models.
作者
张永梅
孙捷
ZHANG Yong-mei;SUN Jie(School of Information Science and Technology,North China University of Technology,Beijing 100144,China)
出处
《电子学报》
EI
CAS
CSCD
北大核心
2023年第1期202-212,共11页
Acta Electronica Sinica
基金
国家重点研发计划(No.2020YFC0811004)。