Abstract
Naive Bayes is a widely used classification method, but its assumption of conditional independence among attributes rarely holds in practice, which degrades classification performance. To address this problem, the paper proposes a naive Bayes classification algorithm based on an improved PCA. The algorithm uses the Pearson and Kendall coefficients to measure the correlation between attributes, selects a new attribute set through principal component analysis (PCA) so that the conditional independence assumption is satisfied as far as possible, and then applies naive Bayes classification to the new data set. Experimental results show that the method effectively improves classification accuracy.
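The first step named in the abstract, measuring the correlation between attributes with the Pearson and Kendall coefficients, can be illustrated with a minimal sketch. The abstract does not say how the two coefficients are combined or how they feed into the improved PCA, so the code below covers only the pairwise coefficients. The function names, the toy data, and the choice of the tau-a variant of Kendall's coefficient (tied pairs contribute zero) are assumptions made for illustration, not the authors' implementation.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson linear correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def kendall_tau_a(x, y):
    """Kendall tau-a: (concordant - discordant) / total number of pairs.

    Assumption: tied pairs simply contribute zero (no tie correction).
    """
    score = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        prod = (x1 - x2) * (y1 - y2)
        score += (prod > 0) - (prod < 0)  # +1 concordant, -1 discordant
    n = len(x)
    return score / (n * (n - 1) / 2)


def correlation_matrix(columns, coef):
    """Pairwise coefficient matrix for a list of attribute columns."""
    return [[coef(a, b) for b in columns] for a in columns]


# Hypothetical toy data: three attributes observed on five samples.
attributes = [
    [1, 2, 3, 4, 5],
    [2, 4, 6, 8, 10],
    [5, 3, 4, 1, 2],
]

pearson_matrix = correlation_matrix(attributes, pearson)
kendall_matrix = correlation_matrix(attributes, kendall_tau_a)

for row_p, row_k in zip(pearson_matrix, kendall_matrix):
    print([round(v, 3) for v in row_p], [round(v, 3) for v in row_k])
```

Attribute pairs whose coefficients are large in absolute value are the ones that most clearly violate the conditional independence assumption, which is the motivation the abstract gives for transforming the attribute set before classification.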
Authors
李思奇
吕王勇
邓柙
陈雯
Li Siqi; Lyu Wangyong; Deng Xia; Chen Wen (School of Mathematical Science, Sichuan Normal University, Chengdu 610068, China; Visual Computing and Virtual Reality Key Laboratory of Sichuan Province, Sichuan Normal University, Chengdu 610068, China)
Source
《统计与决策》
CSSCI
Peking University Core Journal
2022, No. 1, pp. 34-37 (4 pages)
Statistics & Decision
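Funding
National Natural Science Foundation of China, Young Scientists Fund (11601357)
Project of the Visual Computing and Virtual Reality Key Laboratory of Sichuan Province (SCVCVR2018.08VS)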
Keywords
naive Bayes
correlation coefficient
principal component analysis (PCA)