Abstract
Many algorithms in data mining and machine learning operate on discrete values, so continuous attributes often need to be discretised first. Organised around the distinction between supervised and unsupervised discretisation, this paper reviews the basic ideas of typical discretisation algorithms and compares them in terms of time complexity and their effect on subsequent classification. It closes with an outlook on several main research directions in continuous attribute discretisation.
Source
Computer Applications and Software (《计算机应用与软件》)
CSCD
Peking University Core Journal (北大核心)
2014, No. 8, pp. 6-8, 140 (4 pages in total)
Funding
National Natural Science Foundation of China (61070061)
Keywords
Supervised feature discretisation
Unsupervised feature discretisation
Classification algorithm