Abstract
Methods for diagnosing Parkinson's disease (PD) by mining speech data have proven effective in recent years. However, owing to factors such as the disease severity of the recorded subjects and the recording equipment and environment, samples of different classes overlap (alias) in the sample space of the acquired datasets. Samples in the aliased region are difficult to identify, which seriously degrades classification accuracy. To address this problem, this article proposes a partition bagging ensemble learning algorithm. It designs a class-centroid distance ratio to measure how strongly each sample is aliased and uses this measure to divide the training set into multiple subsets. The subset partition is then adjusted by transfer training on misclassified samples. Finally, optimized sub-classifier weights are used to fuse the test results of the individual sub-classifiers by weighted combination. Experimental results show that the proposed method markedly improves classification accuracy on two public datasets, with a maximum increase in mean accuracy of 25.44%. The method not only improves classification accuracy on PD speech datasets but also raises the sample utilization rate, offering a new approach to speech-based diagnosis of PD.
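The partitioning and fusion steps named in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the exact formulas, so everything here is an assumption made for illustration: the score `distance_ratio` (distance to a sample's own class centroid divided by distance to the other class centroid), the `thresholds` used to cut the training set into subsets, the two-class setting, and the fixed weights passed to `weighted_vote`. The transfer training of misclassified samples and the weight optimization are not shown.

```python
import math

def centroid(points):
    """Mean vector of a non-empty list of equal-length feature vectors."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]

def euclidean(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

def distance_ratio(x, own_centroid, other_centroid):
    """Hypothetical aliasing score (an assumption, not the paper's formula):
    small values mean the sample sits well inside its own class,
    values near or above 1 mean it lies in the overlap region."""
    return euclidean(x, own_centroid) / (euclidean(x, other_centroid) + 1e-12)

def partition_by_aliasing(X, y, thresholds=(0.5, 1.0)):
    """Split sample indices into len(thresholds) + 1 subsets by aliasing score.
    Assumes exactly two classes; the thresholds are illustrative."""
    cents = {c: centroid([x for x, lab in zip(X, y) if lab == c]) for c in set(y)}
    subsets = [[] for _ in range(len(thresholds) + 1)]
    for i, (x, lab) in enumerate(zip(X, y)):
        other = next(c for c in cents if c != lab)
        r = distance_ratio(x, cents[lab], cents[other])
        k = sum(r >= t for t in thresholds)  # index of the subset for this score
        subsets[k].append(i)
    return subsets

def weighted_vote(predictions, weights):
    """Fuse binary {0, 1} predictions of the sub-classifiers by weighted majority."""
    score = sum(w * p for w, p in zip(weights, predictions))
    return int(score >= 0.5 * sum(weights))

# Toy one-feature data: two samples per class are clear, one per class is borderline.
X = [[0, 0], [1, 0], [4, 0], [10, 0], [9, 0], [6, 0]]
y = [0, 0, 0, 1, 1, 1]
subsets = partition_by_aliasing(X, y)
print(subsets)  # [[0, 1, 3, 4], [2, 5], []]
print(weighted_vote([1, 0, 1], [0.5, 0.3, 0.2]))  # 1
```

In this toy run the samples at indices 2 and 5 land in the middle subset because they lie comparatively close to the opposite class centroid. In a fuller implementation each subset would train its own sub-classifier, and the weights would be tuned rather than fixed by hand.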
Authors
李勇明
张成
王品
谢廷杰
曾孝平
张艳玲
承欧梅
颜芳
LI Yongming; ZHANG Cheng; WANG Pin; XIE Tingjie; ZENG Xiaoping; ZHANG Yanling; CHENG Oumei; YAN Fang (School of Microelectronics and Communication Engineering, Chongqing University, Chongqing 400044, P.R. China; Collaborative Innovation Center for Brain Science, Chongqing University, Chongqing 400044, P.R. China; Department of Neurology, Southwest Hospital, Third Military Medical University, Chongqing 400038, P.R. China; Department of Neurology, The First Affiliated Hospital, Chongqing Medical University, Chongqing 400016, P.R. China)
Source
《生物医学工程学杂志》
EI
CAS
CSCD
PKU Core
2019, Issue 4, pp. 548-556 (9 pages)
Journal of Biomedical Engineering
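Funding
National Natural Science Foundation of China (61771080, 61571069)
Southwest Hospital Joint Incubation Project (SWH2016LHYS-11)
Chongqing Science and Technology Innovation Special Project for Social Undertakings and Livelihood Security (cstc2016shmszx40002)
Chongqing Basic and Frontier Research Project (cstc2016jcyjA0043, cstc2016jcyjA0064, cstc2016jcyjA0134)
Open Project Fund of the National Laboratory of Pattern Recognition (201800011)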
Keywords
Parkinson's disease
classification
speech data
partition bagging boosting mechanism
ensemble learning