Abstract
This paper proposes an improved decision tree classification algorithm based on the naive Bayes and ID3 algorithms. It introduces an objective attribute importance parameter, adopts a conditional independence assumption weaker than that of naive Bayes, and uses weighted independent information entropy as the criterion for selecting the splitting attribute. Theoretical analysis and experimental results show that the improved algorithm overcomes, to a certain extent, the multi-value bias of the ID3 algorithm while achieving high execution efficiency and classification accuracy.
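The multi-value bias mentioned in the abstract can be illustrated with standard ID3 information gain. The following is a minimal sketch, not the paper's method: it does not implement the weighted independent information entropy, which the abstract does not define. The toy data and the names `windy`, `record_id`, `entropy`, and `information_gain` are hypothetical and chosen only for illustration.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(values, labels):
    """Standard ID3 information gain from splitting `labels` on `values`."""
    total = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    # Expected entropy remaining after the split, weighted by subset size.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical toy data, invented for illustration only.
labels = ["yes", "yes", "no", "no", "yes", "no", "yes", "no"]
windy = ["t", "f", "t", "f", "t", "f", "f", "t"]  # attribute with two values
record_id = list(range(len(labels)))              # one distinct value per row

gain_windy = information_gain(windy, labels)
gain_id = information_gain(record_id, labels)
print(f"gain(windy)     = {gain_windy:.3f}")
print(f"gain(record_id) = {gain_id:.3f}")
```

Because every subset produced by `record_id` contains a single record, its remaining entropy is zero and its gain equals the full entropy of the labels, so plain ID3 would prefer it even though such an attribute cannot generalize to unseen records. This is the tendency the improved selection criterion is intended to reduce.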
Source
Computer Engineering (《计算机工程》)
CAS
CSCD
2012, No. 14, pp. 41-43, 47 (4 pages)
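Funding
Natural Science Research Program of the Education Department of Henan Province (No. 2008B520047)
Basic and Frontier Technology Research Program of the Science and Technology Department of Henan Province (No. 112300410307)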
Keywords
naive Bayesian algorithm
ID3 algorithm
information gain
objective attribute importance
conditional independence assumption
weighted independent information entropy