
Combined optimization decision tree algorithm suitable for large-scale databases

Cited by: 5
Abstract: A combined optimization decision tree algorithm suitable for large-scale, high-dimensional databases is presented. Compared with similar traditional algorithms, it introduces improvements in three areas: discretization, dimensionality reduction, and attribute selection. It optimizes the main steps of decision-tree construction that do not scale to large, high-dimensional databases, effectively resolving the conflict between efficiency and predictive precision. Simulation experiments show that the proposed method raises the classification precision of the decision tree while greatly reducing the computational cost.
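The abstract describes a three-stage preprocessing scheme (discretization, dimensionality reduction, attribute selection) applied before decision-tree induction. The paper's own optimized algorithm is not reproduced here; the following is only a minimal sketch of that general pipeline shape, assuming standard scikit-learn components (KBinsDiscretizer, PCA, SelectKBest, DecisionTreeClassifier) and a synthetic dataset standing in for a large, high-dimensional database.

```python
# Minimal sketch (NOT the paper's algorithm): a generic
# discretize -> reduce dimension -> select attributes -> decision tree
# pipeline built from standard scikit-learn parts, for illustration only.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import KBinsDiscretizer
from sklearn.decomposition import PCA
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for a large, high-dimensional database.
X, y = make_classification(n_samples=20000, n_features=100,
                           n_informative=15, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

pipeline = Pipeline([
    # 1. Discretize continuous attributes into a small number of bins.
    ("discretize", KBinsDiscretizer(n_bins=8, encode="ordinal",
                                    strategy="uniform")),
    # 2. Reduce dimensionality before tree induction.
    ("reduce", PCA(n_components=30)),
    # 3. Keep only the most informative attributes (mutual information).
    ("select", SelectKBest(mutual_info_classif, k=10)),
    # 4. Induce the decision tree on the reduced attribute set.
    ("tree", DecisionTreeClassifier(criterion="entropy", random_state=0)),
])

pipeline.fit(X_train, y_train)
print("test accuracy:", pipeline.score(X_test, y_test))
```

In the paper these three stages are replaced by the authors' own optimized discretization, dimension-reduction, and attribute-selection procedures, which is where the reported gains in efficiency and accuracy come from; the sketch above only shows how the stages fit together.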
Source: 《系统工程与电子技术》 (Systems Engineering and Electronics), EI / CSCD / Peking University core journal, 2009, No. 3: 583-587 (5 pages)
Funding: National Natural Science Foundation of China (70573076, 70671074); Scientific Research Foundation of Tianjin University of Science and Technology (20080303)
Keywords: discretization; dimensionality reduction; attribute selection; decision tree; data mining
Related literature

References (13)

  • 1 文专, 王正欧. An efficient ranking-based attribute selection method using RBF neural networks [J]. 计算机应用, 2003, 23(8): 34-36. Cited by: 8
  • 2 Buntine W, Niblett T. A further comparison of splitting rules for decision-tree induction [J]. Machine Learning, 1992, 8(1): 75-85.
  • 3 Kononenko I, Hong S J. Attribute selection for modeling [J]. Future Generation Computer Systems, 1997, 13(2-3): 181-195.
  • 4 Shih Y S. Families of splitting criteria for classification trees [J]. Statistics and Computing, 1999, 9(4): 309-315.
  • 5 Kurgan L A, Cios K J. CAIM discretization algorithm [J]. IEEE Transactions on Knowledge and Data Engineering, 2004, 16(2): 145-153.
  • 6 Wang Haixun, Yu P S. SSDT: a scalable subspace-splitting classifier for biased data [C]. Proceedings of the IEEE International Conference on Data Mining (ICDM), 2001: 542-549.
  • 7 Quinlan J R. Induction of decision trees [J]. Machine Learning, 1986, 1(1): 81-106.
  • 8 Quinlan J R. C4.5: Programs for Machine Learning [M]. San Mateo: Morgan Kaufmann, 1993.
  • 9 钱国良, 舒文豪, 陈彬, 权光日. Research on heuristic algorithms for feature subset selection based on information entropy [J]. 软件学报, 1998, 9(12): 911-916. Cited by: 8
  • 10 Elomaa T, Rousu J. General and efficient multisplitting of numerical attributes [J]. Machine Learning, 1999, 36(3): 201-224.

Secondary references (10)

  • 1 陈彬, 洪家荣, 王亚东. The optimal feature subset selection problem [J]. 计算机学报, 1997, 20(2): 133-138. Cited by: 96
  • 2 Engelbrecht A P. A new pruning heuristic based on variance analysis of sensitivity information [J]. IEEE Transactions on Neural Networks, 2001, 12(6): 1386-1399.
  • 3 Kwak N, Choi C H. Input feature selection for classification problems [J]. IEEE Transactions on Neural Networks, 2002, 13(1): 143-159.
  • 4 Fu Xiuju, Wang Lipo. Rule extraction based on data dimensionality reduction using RBF neural networks [C]. Proceedings of ICONIP 2001, 8th International Conference on Neural Information Processing, Shanghai, China, 2001, vol. 1: 149-153.
  • 5 Er Meng Joo. Face recognition with radial basis function (RBF) neural networks [J]. IEEE Transactions on Neural Networks, 2002, 13(3): 697-709.
  • 6 Wu X D. Technical report, 1992.
  • 7 Han J, Kamber M. 数据挖掘概念与技术 (Data Mining: Concepts and Techniques) [M]. 北京: 机械工业出版社, 2001: 185.
  • 8 洪家荣. Extension matrix theory for learning from examples [J]. 计算机学报, 1991, 14(6): 401-410. Cited by: 31
  • 9 李仁璞, 王正欧. A structure-adaptive neural network method for feature selection [J]. 计算机研究与发展, 2002, 39(12): 1613-1617. Cited by: 11
  • 10 王兴起, 孔繁胜. Research on noise-tolerant feature subset selection algorithms [J]. 计算机研究与发展, 2002, 39(12): 1637-1644. Cited by: 4

Co-citing literature: 14

Co-cited literature: 40

Citing literature: 5

Secondary citing literature: 12
