Abstract
The attribute-independence assumption of naive Bayes classification often does not hold in practice. To relax it, an association rule forest representation and a modified Bayes classifier built on it, called ABC, are proposed. ABC mines the association rules that satisfy the required conditions and constructs an association rule forest from them. The joint probability of all attribute values in an instance is then obtained by multiplying the probabilities of all root nodes in the forest by the confidences of all applicable rules. The classifier was tested on UDI datasets. The results show that ABC is clearly more accurate than naive Bayes classification, with an average improvement of 5%; for datasets with strong dependencies among attributes, classification accuracy improved by 37%.
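The joint-probability computation described in the abstract can be illustrated with a small sketch. This is not the paper's ABC implementation, only a toy version under stated assumptions: rules are limited to single-antecedent rules between attribute values within one class, a node may only take a parent from a lower-numbered attribute (which guarantees the structure is a forest), and the names `train`, `class_score`, `classify`, `min_conf`, and `min_support` as well as the Laplace smoothing of root probabilities are hypothetical choices made for illustration.

```python
from collections import Counter


def train(rows, labels):
    """Count, per class, single attribute values and ordered value pairs."""
    n_attrs = len(rows[0])
    # Number of distinct values per attribute, used for Laplace smoothing.
    domains = [len({r[i] for r in rows}) for i in range(n_attrs)]
    stats = {}
    for c in set(labels):
        data = [r for r, y in zip(rows, labels) if y == c]
        single = Counter((i, r[i]) for r in data for i in range(n_attrs))
        pair = Counter((i, r[i], j, r[j]) for r in data
                       for i in range(n_attrs) for j in range(i + 1, n_attrs))
        stats[c] = (len(data), single, pair)
    return {"total": len(rows), "domains": domains, "stats": stats}


def class_score(model, x, c, min_conf=0.8, min_support=2):
    """Return P(c) times the forest-factored estimate of P(x | c)."""
    n_c, single, pair = model["stats"][c]
    score = n_c / model["total"]
    for j, v in enumerate(x):
        # Most confident rule (attr i = x[i]) -> (attr j = v) with i < j.
        best = 0.0
        for i in range(j):
            support = pair[(i, x[i], j, v)]
            if support > 0 and support >= min_support:
                best = max(best, support / single[(i, x[i])])
        if best >= min_conf:
            # Child node: multiply by the confidence of the applicable rule.
            score *= best
        else:
            # Root node: multiply by its smoothed class-conditional probability.
            score *= (single[(j, v)] + 1) / (n_c + model["domains"][j])
    return score


def classify(model, x, **thresholds):
    """Pick the class with the largest score."""
    return max(model["stats"], key=lambda c: class_score(model, x, c, **thresholds))


# Toy data: attributes 0 and 1 are perfectly dependent, attribute 2 is not.
rows = [
    ("a", "a", "u"), ("a", "a", "u"), ("a", "a", "v"), ("b", "b", "u"),
    ("b", "b", "v"), ("b", "b", "v"), ("b", "b", "u"), ("a", "a", "v"),
]
labels = ["pos"] * 4 + ["neg"] * 4
model = train(rows, labels)
```

In this sketch, setting `min_conf` above 1 disables every rule, so each attribute value becomes a root and the score reduces to the ordinary naive Bayes product. With rules enabled, the redundant second attribute contributes its rule confidence instead of being counted as independent evidence a second time.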
Source
Journal of Xi'an Jiaotong University (《西安交通大学学报》)
Indexed in: EI, CAS, CSCD, Peking University Core (北大核心)
2009, No. 2, pp. 48-52 (5 pages)
Keywords
naive Bayes classification
association rule
joint probability