Abstract
The number of people with depression is rising worldwide, while the diagnosis and treatment of depression are constrained by a shortage of doctors. To address this problem, a feature fusion model is proposed that combines a convolutional neural network (CNN) with a bidirectional long short-term memory (BLSTM) network equipped with an attention mechanism. The work examines both feature selection and network architecture. A comparison of several classical speech features shows that Mel-frequency cepstral coefficients (MFCC) give the best depression classification results. The MFCCs are then fed separately into the CNN and the attention-based BLSTM network to perform depression classification. In experiments on the DAIC-WOZ dataset, the proposed method reaches a classification accuracy of 78.06% and an F1 score of 74.68% for speech-based depression classification.
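For readers less familiar with the two reported metrics, the following is a minimal sketch of how accuracy and F1 are computed from binary confusion counts. The function name `accuracy_and_f1` and the counts are hypothetical and are not taken from the paper; the abstract also does not state whether its F1 score is for the depressed class or averaged over classes, so the positive-class definition is assumed here.

```python
def accuracy_and_f1(tp, fp, fn, tn):
    """Return (accuracy, precision, recall, f1) for a binary classifier.

    tp/fp/fn/tn are confusion-matrix counts, with "depressed" assumed
    to be the positive class.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total
    # Guard against empty denominators when a class is never predicted or present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return accuracy, precision, recall, f1


# Hypothetical counts for illustration only; not results from the paper.
acc, prec, rec, f1 = accuracy_and_f1(tp=30, fp=10, fn=10, tn=50)
print(f"accuracy={acc:.4f} precision={prec:.4f} recall={rec:.4f} f1={f1:.4f}")
```

Because F1 is the harmonic mean of precision and recall, it can sit below accuracy when the positive class is the harder one to detect, which is in line with the gap between the two figures reported above.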
Authors
WU Qing (吴情)
HU Weiping (胡维平)
CHEN Dandan (陈丹丹)
XIAO Ting (肖婷)
College of Electronic Engineering, Guangxi Normal University, Guilin 541000, China
Source
Journal of Applied Acoustics (《应用声学》)
CSCD
Peking University Core Journals (北大核心)
2022, No. 5, pp. 837-842 (6 pages)
Funding
National Natural Science Foundation of China (NSFC 61861005).
Keywords
Depression recognition
Speech analysis
Classification