
Survey on Continuous Feature Discretisation Algorithms (连续属性离散化算法研究综述)

Cited by: 10
Abstract: In data mining and machine learning, many algorithms operate on discrete values, so continuous attributes often have to be discretised first. Organised around the distinction between supervised and unsupervised discretisation, this survey summarises the basic ideas of typical discretisation algorithms and compares them in terms of time complexity and their effect on subsequent classification. It closes with an outlook on the main research directions in continuous attribute discretisation.
Source: Computer Applications and Software (《计算机应用与软件》), CSCD, Peking University Core Journal (北大核心), 2014, No. 8, pp. 6-8, 140 (4 pages in total)
Funding: National Natural Science Foundation of China (61070061)
Keywords: supervised discretisation algorithm; unsupervised discretisation algorithm; classification algorithm


