Abstract
Objective To propose KPCA-KNN, a heart disease classification method that combines kernel principal component analysis (KPCA) with the k-nearest neighbor (KNN) algorithm, so that a patient's condition can be assessed more accurately. Methods A Q-Q plot was used to check whether the kernel-transformed data followed a multivariate normal distribution. The kernel parameter was selected with the Friedman test, a non-parametric statistical test; it was further found that, within a given classification method, classification accuracy was robust to the choice of kernel parameter. Results The data were high-dimensional and nonlinear. To avoid the curse of dimensionality and overfitting, KPCA was used to reduce the dimensionality of the data and remove the influence of nonlinear factors, and KNN was then used to determine whether a patient had heart disease. The method was tested on the SPECTF data from the UCI database. KPCA performed well in both dimensionality reduction and classification, with classification accuracy 15% higher than that of the original CLIP3 algorithm. Conclusion Compared with principal component analysis (PCA), KPCA classified nonlinear data better. For nonlinear classification problems such as heart disease data, KPCA-KNN offers an additional effective approach.
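The pipeline described in the abstract (kernel PCA for nonlinear dimensionality reduction, followed by k-nearest-neighbor classification) can be sketched in dependency-free Python as below. This is only an illustrative sketch under stated assumptions, not the authors' implementation. The two-ring toy data, the RBF kernel, the value `gamma=0.5`, the two retained components, `k=3`, and all function names are hypothetical choices made for the example. The abstract's Friedman-test selection of the kernel parameter and its Q-Q plot check are not reproduced.

```python
import math
import random
from collections import Counter


def rbf(a, b, gamma):
    """Gaussian (RBF) kernel k(a, b) = exp(-gamma * ||a - b||^2)."""
    return math.exp(-gamma * sum((x - y) ** 2 for x, y in zip(a, b)))


def fit_kpca(X, gamma, n_components, n_iter=200):
    """Fit kernel PCA on X; return a function mapping a point to its components."""
    n = len(X)
    K = [[rbf(xi, xj, gamma) for xj in X] for xi in X]
    row_mean = [sum(row) / n for row in K]
    total_mean = sum(row_mean) / n
    # Double-centre the kernel matrix (equivalent to centring in feature space).
    Kc = [[K[i][j] - row_mean[i] - row_mean[j] + total_mean for j in range(n)]
          for i in range(n)]
    rng = random.Random(0)
    components = []  # (eigenvalue, unit eigenvector) pairs, largest first
    for _ in range(n_components):
        v = [rng.gauss(0, 1) for _ in range(n)]
        for _ in range(n_iter):
            # Project out eigenvectors already found, then apply Kc (power iteration).
            for _, u in components:
                dot = sum(a * b for a, b in zip(v, u))
                v = [a - dot * b for a, b in zip(v, u)]
            w = [sum(Kc[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(a * a for a in w))
            v = [a / norm for a in w]
        # Rayleigh quotient gives the eigenvalue estimate for the unit vector v.
        lam = sum(v[i] * sum(Kc[i][j] * v[j] for j in range(n)) for i in range(n))
        components.append((lam, v))

    def transform(x):
        k = [rbf(x, xj, gamma) for xj in X]
        k_mean = sum(k) / n
        # Centre the new point's kernel row consistently with the training data.
        kc = [k[j] - k_mean - row_mean[j] + total_mean for j in range(n)]
        return [sum(a * b for a, b in zip(v, kc)) / math.sqrt(lam)
                for lam, v in components]

    return transform


def knn_predict(train_Z, train_y, z, k):
    """Majority vote among the k training points closest to z (Euclidean)."""
    nearest = sorted(
        range(len(train_Z)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_Z[i], z)),
    )[:k]
    return Counter(train_y[i] for i in nearest).most_common(1)[0][0]


def make_rings(n_per_class, rng):
    """Toy nonlinear data: class 0 on a ring of radius 1, class 1 on radius 3."""
    X, y = [], []
    for label, radius in ((0, 1.0), (1, 3.0)):
        for _ in range(n_per_class):
            theta = rng.uniform(0, 2 * math.pi)
            r = radius + rng.gauss(0, 0.1)
            X.append([r * math.cos(theta), r * math.sin(theta)])
            y.append(label)
    return X, y


rng = random.Random(42)
X_train, y_train = make_rings(30, rng)
X_test, y_test = make_rings(20, rng)

transform = fit_kpca(X_train, gamma=0.5, n_components=2)  # gamma is an assumed value
Z_train = [transform(x) for x in X_train]
Z_test = [transform(x) for x in X_test]

pred = [knn_predict(Z_train, y_train, z, k=3) for z in Z_test]
accuracy = sum(p == t for p, t in zip(pred, y_test)) / len(y_test)
print(f"test accuracy: {accuracy:.2f}")
```

In practice, the same two steps are available in standard libraries (for example, scikit-learn provides `KernelPCA` and `KNeighborsClassifier`), which would be the more usual choice for data such as the UCI heart data.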
Authors
胡扬
魏毅强
HU Yang; WEI Yiqiang (College of Mathematics, Taiyuan University of Technology, Taiyuan 030024, Shanxi, China)
Source
《中西医结合心脑血管病杂志》
2021, No. 11, pp. 1848-1852 (5 pages)
Chinese Journal of Integrative Medicine on Cardio-Cerebrovascular Disease
Funding
National Natural Science Foundation of China (No. 201901D111123).
Keywords
heart disease
kernel principal component analysis
k-nearest neighbor algorithm
normal distribution
Friedman rank analysis of variance
nonlinear dimensionality reduction