
A Pre-pruning ID3 Algorithm Based on Mutual Information between Attributes
Cited by: 1
Abstract: ID3 is a widely used and effective heuristic algorithm for decision tree induction. This paper addresses the shortcomings of ID3 and proposes an improved version. When selecting a test attribute, the improved algorithm requires not only that the candidate attribute have high mutual information with the class, but also that its mutual information with the attributes already used at ancestor nodes be as low as possible. This avoids selecting redundant attributes and achieves a genuine reduction in information entropy. While the tree is being built, a classification threshold is set and the tree is pre-pruned, so that data subsets do not become so small that further splitting loses statistical significance. Experimental results show that the algorithm constructs better decision trees than ID3.
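The abstract does not give the exact selection formula or the form of the threshold, so the following Python sketch is only one possible reading of the idea, not the authors' method. It assumes a score equal to the candidate's mutual information with the class minus its mean mutual information with the attributes used at ancestor nodes, and it assumes the pre-pruning threshold is a minimum subset size. The names `score`, `build_tree`, `predict`, and `min_samples` are hypothetical. Redundancy is measured on the full training set, because within a branch every ancestor attribute is constant and its mutual information with any candidate would be zero.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def mutual_information(xs, ys):
    """I(X;Y) = H(X) + H(Y) - H(X,Y) for two aligned lists."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))


def column(rows, name):
    return [r[name] for r in rows]


def score(rows, all_rows, attr, target, used):
    """Assumed criterion: relevance to the class on the current subset,
    minus mean redundancy with ancestor attributes on the full data."""
    relevance = mutual_information(column(rows, attr), column(rows, target))
    if not used:
        return relevance
    cand = column(all_rows, attr)
    redundancy = sum(
        mutual_information(cand, column(all_rows, u)) for u in used
    ) / len(used)
    return relevance - redundancy


def build_tree(rows, all_rows, attrs, target, used=(), min_samples=2):
    labels = column(rows, target)
    majority = Counter(labels).most_common(1)[0][0]
    # Pre-pruning (assumed form): stop when the node is pure, no attributes
    # remain, or the subset is smaller than the min_samples threshold.
    if len(set(labels)) == 1 or not attrs or len(rows) < min_samples:
        return majority
    best = max(attrs, key=lambda a: score(rows, all_rows, a, target, used))
    rest = [a for a in attrs if a != best]
    children = {}
    for value in sorted(set(column(rows, best))):
        subset = [r for r in rows if r[best] == value]
        children[value] = build_tree(
            subset, all_rows, rest, target, used + (best,), min_samples
        )
    return {"attr": best, "children": children, "default": majority}


def predict(tree, row):
    """Walk the tree; fall back to the node's majority class for unseen values."""
    while isinstance(tree, dict):
        tree = tree["children"].get(row[tree["attr"]], tree["default"])
    return tree
```

With a training set in which one attribute is an exact copy of another, this score penalizes the copy once the original has been used on the path, which is the redundancy-avoidance behavior the abstract describes. Raising `min_samples` makes the tree stop splitting earlier and return majority-class leaves.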
Source: Journal of Guizhou University (Natural Sciences), 2008, No. 5, pp. 494-497 (4 pages)
Keywords: ID3; mutual information; pre-pruning


