
Discretization of continuous attributes using information divergence (信息偏差在连续属性离散化中的应用)
Abstract: Discretizing continuous attributes is an important preprocessing step for subsequent machine learning and data mining. This paper analyzes the family of information-theoretic discretization algorithms and, on that basis, proposes a new method for discretizing continuous attributes. The algorithm uses information divergence to measure the importance of each candidate cut point, and it checks the inconsistency rate during discretization so that the consistency of the decision table does not change. Finally, the discretized data are classified with C4.5 and a support vector machine (SVM) to compare the algorithm against other discretization algorithms; the results show that the proposed algorithm is effective.
Source: Computer Engineering and Applications (《计算机工程与应用》), CSCD, Peking University Core Journal, 2010, No. 20, pp. 103-105 (3 pages)
Funding: National Natural Science Foundation of China, No. 60372071
Keywords: discretization of continuous attributes; decision table; information divergence; inconsistency rate
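
The two quantities named in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's formulas, so everything below is an assumption made for illustration: "information divergence" is taken to be the symmetric Kullback-Leibler divergence between the class distributions on the two sides of a candidate cut point, and the inconsistency rate is taken to be the fraction of objects that disagree with the majority decision among objects sharing the same condition-attribute values. The function names (`inconsistency_rate`, `cut_point_scores`, and the helpers) are hypothetical and not taken from the paper.

```python
from collections import Counter, defaultdict
from math import log2


def inconsistency_rate(rows, labels):
    """Fraction of objects that disagree with the majority decision among
    objects sharing the same condition-attribute values (assumed definition)."""
    groups = defaultdict(Counter)
    for row, label in zip(rows, labels):
        groups[tuple(row)][label] += 1
    inconsistent = sum(sum(c.values()) - max(c.values()) for c in groups.values())
    return inconsistent / len(labels)


def class_distribution(labels, classes, eps=1e-9):
    """Class distribution with a small additive smoothing term so that the
    divergence below stays finite when a class is absent on one side."""
    counts = Counter(labels)
    total = len(labels) + eps * len(classes)
    return [(counts[c] + eps) / total for c in classes]


def symmetric_kl(p, q):
    """Symmetric Kullback-Leibler divergence: KL(p||q) + KL(q||p)."""
    return sum((pi - qi) * log2(pi / qi) for pi, qi in zip(p, q))


def cut_point_scores(values, labels):
    """Score each candidate cut point (midpoint between adjacent distinct
    values) by the divergence between the class distributions on each side.
    This scoring rule is an illustrative assumption, not the paper's formula."""
    classes = sorted(set(labels))
    pairs = list(zip(values, labels))
    distinct = sorted(set(values))
    scores = {}
    for lo, hi in zip(distinct, distinct[1:]):
        cut = (lo + hi) / 2
        left = [y for x, y in pairs if x < cut]
        right = [y for x, y in pairs if x > cut]
        scores[cut] = symmetric_kl(class_distribution(left, classes),
                                   class_distribution(right, classes))
    return scores


# Toy attribute: the classes separate cleanly between 2.0 and 3.0.
values = [1.0, 2.0, 3.0, 4.0]
labels = ["a", "a", "b", "b"]
scores = cut_point_scores(values, labels)
best_cut = max(scores, key=scores.get)

# Discretize with the best cut and check that no inconsistency was introduced.
discretized = [(int(v > best_cut),) for v in values]
rate = inconsistency_rate(discretized, labels)
```

Under these assumptions, a discretization procedure in the spirit of the abstract would rank cut points by score and accept or discard them while verifying that the inconsistency rate of the discretized decision table stays equal to that of the original table.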