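Abstract
Discretization of continuous attributes is an important topic in rough set theory research. By improving a local discretization method, a global discretization algorithm is proposed. Using rough set theory, a consistency measure (the discernibility function) is first defined, and the discretization algorithm based on the "minimum description length criterion" is modified to achieve global discretization, remedying the original method's defect of introducing inconsistency. With data consistency preserved, the redundancy of the cut points in the discretization is then analyzed and the cut points are reduced. Experiments using a rough-set-based classification tool on several typical data sets produced the expected satisfactory results, verifying the effectiveness of the algorithm.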
Discretization of continuous attributes is an important issue in rough set theory research.
By modifying a local method based on MDLPC (the minimum description length principle criterion) with the help of
rough set theory, a global discretization algorithm is proposed. In the first stage, the algorithm modifies the
criterion for selecting the best cut point in the MDLPC method and makes the method global by introducing an
inconsistency check based on rough set theory, which preserves the fidelity of the original data. In the second
stage, redundant cut points are removed without changing the consistency level, which leads to a smaller learned
model. The algorithm was tested on several data sets, and the satisfactory results demonstrate its effectiveness.
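The two rough-set ingredients named in the abstract, an inconsistency check on the discretized data and the removal of redundant cut points, can be illustrated with a minimal Python sketch. The abstract gives no implementation details, so everything here is an assumption: the function names `discretize`, `inconsistency`, and `reduce_cuts` are hypothetical, the inconsistency measure is a common count of objects outside the majority class of each indiscernibility class (not necessarily the paper's discernibility function), and the greedy removal order is only one possible choice. The MDLPC cut-point selection itself is not shown.

```python
from bisect import bisect_right
from collections import defaultdict


def discretize(rows, cuts):
    """Map each numeric row to a tuple of interval indices, one per attribute."""
    return [tuple(bisect_right(cuts[j], v) for j, v in enumerate(row))
            for row in rows]


def inconsistency(rows, labels, cuts):
    """Count objects outside the majority class of their indiscernibility class.

    Assumed measure: objects with the same discretized description form one
    class; every object not in that class's majority label counts as one.
    """
    groups = defaultdict(lambda: defaultdict(int))
    for key, label in zip(discretize(rows, cuts), labels):
        groups[key][label] += 1
    return sum(sum(c.values()) - max(c.values()) for c in groups.values())


def reduce_cuts(rows, labels, cuts):
    """Greedily drop cut points whose removal does not raise the inconsistency."""
    cuts = [sorted(c) for c in cuts]
    baseline = inconsistency(rows, labels, cuts)
    for j in range(len(cuts)):
        for cut in list(cuts[j]):  # snapshot, since cuts is rebuilt below
            trial = [c if k != j else [x for x in c if x != cut]
                     for k, c in enumerate(cuts)]
            if inconsistency(rows, labels, trial) <= baseline:
                cuts = trial  # the cut point was redundant
    return cuts
```

For a toy table with two attributes, where the decision depends only on whether the first attribute exceeds 2.5, calling `reduce_cuts` with initial cuts `[[1.5, 2.5, 3.5], [15.0]]` keeps only the cut at 2.5 and leaves the table consistent.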
Source
《电机与控制学报》
EI
CSCD
PKU Core (Peking University core journal list)
2004, No. 3, pp. 268-270, 288 (4 pages in total)
Electric Machines and Control