Abstract
To address the loss of classification performance caused by the idealized conditional-independence assumption of the Naive Bayes (NB) algorithm, an improved Particle Swarm Optimization-Naive Bayes (PSO-NB) algorithm is proposed. During text preprocessing, a weight factor and intra-class and inter-class dispersion factors are introduced for attribute reduction. Based on the weighted NB model, the word-frequency ratio of each conditional attribute is used as its initial weight, and the PSO algorithm iteratively searches for the globally optimal feature weight vector. This vector then supplies the weight of each feature word in the weighted model, from which the classifier is generated. A performance analysis of the PSO-NB algorithm on a classical dataset shows that the improved algorithm effectively reduces redundant attributes, lowers computational complexity, and achieves high accuracy and recall.
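The weighting step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not define the word-frequency ratio, the fitness function, or the PSO parameters, so this sketch assumes the ratio is a term's count divided by the total term count, uses training accuracy as fitness, and picks conventional PSO constants. The attribute-reduction stage is omitted, and all function names and the toy corpus are hypothetical.

```python
import math
import random

def train_nb(docs, labels):
    """Estimate log priors and Laplace-smoothed log likelihoods."""
    vocab = sorted({t for d in docs for t in d})
    classes = sorted(set(labels))
    log_prior, log_like = {}, {}
    for c in classes:
        class_docs = [d for d, y in zip(docs, labels) if y == c]
        log_prior[c] = math.log(len(class_docs) / len(docs))
        total = sum(sum(d.values()) for d in class_docs)
        log_like[c] = {
            t: math.log((sum(d.get(t, 0) for d in class_docs) + 1)
                        / (total + len(vocab)))
            for t in vocab
        }
    return vocab, classes, log_prior, log_like

def predict(doc, weights, vocab, classes, log_prior, log_like):
    """Weighted NB: argmax_c log P(c) + sum_i w_i * n_i * log P(t_i | c)."""
    def score(c):
        return log_prior[c] + sum(
            w * doc.get(t, 0) * log_like[c][t] for t, w in zip(vocab, weights))
    return max(classes, key=score)

def accuracy(weights, docs, labels, model):
    """Fitness (assumed): fraction of training documents classified correctly."""
    hits = sum(predict(d, weights, *model) == y for d, y in zip(docs, labels))
    return hits / len(docs)

def pso_weights(docs, labels, model, init, n_particles=8, iters=20, seed=0):
    """Search for a weight vector with a basic global-best PSO."""
    rng = random.Random(seed)
    dim = len(init)
    # Particle 0 starts at the initial weights; the others are perturbed copies.
    pos = [list(init)] + [
        [max(0.0, w + rng.uniform(-0.5, 0.5)) for w in init]
        for _ in range(n_particles - 1)
    ]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [accuracy(p, docs, labels, model) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    inertia, c1, c2 = 0.7, 1.5, 1.5  # assumed constants, not from the paper
    for _ in range(iters):
        for i in range(n_particles):
            for j in range(dim):
                vel[i][j] = (inertia * vel[i][j]
                             + c1 * rng.random() * (pbest[i][j] - pos[i][j])
                             + c2 * rng.random() * (gbest[j] - pos[i][j]))
                pos[i][j] = max(0.0, pos[i][j] + vel[i][j])  # keep weights >= 0
            fit = accuracy(pos[i], docs, labels, model)
            if fit > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], fit
            if fit > gbest_fit:
                gbest, gbest_fit = pos[i][:], fit
    return gbest, gbest_fit

# Hypothetical toy corpus: each document is a term-count dictionary.
docs = [{"goal": 2, "match": 1}, {"match": 2, "team": 1},
        {"vote": 2, "party": 1}, {"party": 2, "team": 1}]
labels = ["sport", "sport", "politics", "politics"]

model = train_nb(docs, labels)
vocab = model[0]
total = sum(sum(d.values()) for d in docs)
# Initial weights: each term's share of all term occurrences (assumed definition).
init = [sum(d.get(t, 0) for d in docs) / total for t in vocab]
best, best_fit = pso_weights(docs, labels, model, init)
```

Because one particle starts at the initial weights and the global best is only replaced by a strictly better candidate, the returned fitness is never lower than that of the word-frequency initialization. A realistic evaluation would score fitness on held-out data rather than on the training set.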
Authors
邱宁佳
李娜
胡小娟
王鹏
孙爽滋
QIU Ningjia; LI Na; HU Xiaojuan; WANG Peng; SUN Shuangzi (College of Computer Science and Technology, Changchun University of Science and Technology, Changchun 130022, China)
Source
《计算机工程》
CAS
CSCD
Peking University Core Journal
2018, Issue 11, pp. 27-32, 39 (7 pages in total)
Computer Engineering
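Funding
Key Science and Technology Research Project of the Jilin Province Science and Technology Development Plan (20150204036GX)
Jilin Province Provincial Industrial Innovation Special Fund (2017C051)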
Keywords
Naive Bayes (NB)
mutual information
attribute reduction
Particle Swarm Optimization (PSO) algorithm
weight optimization