Abstract
The area of knowledge discovery and data mining is growing rapidly, and a large number of methods are employed to mine knowledge. Many of these methods rely on discrete data. However, most data sets used in real applications have attributes with continuous values. To make data mining techniques usable on such data sets, discretization is performed as a preprocessing step. In this paper, we discuss rough set based discretization. We run experiments to compare the quality of local discretization and global discretization based on rough sets. Our experiments show that both global and local discretization are dataset sensitive.
Source
《计算机工程与应用》
CSCD
Peking University Core Journals (北大核心)
2004, No. 26, pp. 68-69, 159 (3 pages in total)
Computer Engineering and Applications
Keywords
rough set
cuts
discretization
data mining