Abstract
Naive Bayes classification is a simple and efficient method. However, when the attribute-independence assumption does not hold, it may misclassify a test sample, and when a test sample is equally probable under every class, it cannot assign a class at all. Both problems reduce classification accuracy. This paper proposes an improved naive Bayes algorithm based on the contribution rate of attribute values: the class of a test sample is determined from the total contribution rate of its attribute values in each class. Experiments on mushroom data show that the algorithm effectively improves classification accuracy.
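The tie problem described in the abstract can be illustrated with a minimal sketch. The abstract does not define the contribution rate, so the definition used here is an assumption made only for illustration: the contribution of attribute value v to class c is taken as count(v, c) / count(v), and the total contribution is the sum over the sample's attribute values. The function names and the six-row toy data are hypothetical and are not the paper's mushroom data or necessarily its method.

```python
from collections import Counter, defaultdict

def train(samples, labels):
    """Count class frequencies and (attribute index, value) -> class counts."""
    class_counts = Counter(labels)
    value_class_counts = defaultdict(Counter)
    for row, label in zip(samples, labels):
        for i, value in enumerate(row):
            value_class_counts[(i, value)][label] += 1
    return class_counts, value_class_counts

def naive_bayes_scores(row, class_counts, value_class_counts):
    """Standard naive Bayes score P(c) * prod_i P(v_i | c), without smoothing."""
    total = sum(class_counts.values())
    scores = {}
    for c, n_c in class_counts.items():
        score = n_c / total
        for i, value in enumerate(row):
            score *= value_class_counts[(i, value)][c] / n_c
        scores[c] = score
    return scores

def contribution_scores(row, class_counts, value_class_counts):
    """ASSUMED definition: sum over attributes of count(v_i, c) / count(v_i)."""
    scores = {c: 0.0 for c in class_counts}
    for i, value in enumerate(row):
        counts = value_class_counts[(i, value)]
        seen = sum(counts.values())
        if seen == 0:
            continue  # value never observed in training: contributes nothing
        for c in class_counts:
            scores[c] += counts[c] / seen
    return scores

def classify(row, class_counts, value_class_counts):
    """Pick the class with the largest total contribution."""
    scores = contribution_scores(row, class_counts, value_class_counts)
    return max(scores, key=scores.get)

# Toy data (made up): cap color, cap surface, odor.
samples = [
    ("red", "smooth", "musty"),
    ("red", "scaly", "musty"),
    ("red", "smooth", "none"),
    ("brown", "fibrous", "none"),
    ("brown", "smooth", "musty"),
    ("brown", "fibrous", "none"),
]
labels = ["poisonous", "poisonous", "poisonous", "edible", "edible", "edible"]

class_counts, value_class_counts = train(samples, labels)
query = ("red", "fibrous", "musty")

nb = naive_bayes_scores(query, class_counts, value_class_counts)
contrib = contribution_scores(query, class_counts, value_class_counts)
label = classify(query, class_counts, value_class_counts)
```

In this toy case both naive Bayes scores collapse to zero, because "red" never occurs with the edible class and "fibrous" never occurs with the poisonous class, so the product rule cannot separate the classes. The summed contributions still differ, which is the kind of situation the abstract says the improved algorithm addresses.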
Source
Journal of Zhangzhou Teachers College (Natural Science) (《漳州师范学院学报(自然科学版)》)
2010, No. 4, pp. 42-44 (3 pages)
Funding
National Natural Science Foundation of China (10971186)
Key Project of the Fujian Provincial Department of Education (JA10202)
Keywords
classification
naive Bayesian principle
contribution rate of attribute value