It is being widely studied how to extract knowledge from a decision table based on rough set theory. The novel problem is how to discretize a decision table having continuous attribute. In order to obtain more reasona...It is being widely studied how to extract knowledge from a decision table based on rough set theory. The novel problem is how to discretize a decision table having continuous attribute. In order to obtain more reasonable discretization results, a discretization algorithm is proposed, which arranges half-global discretization based on the correlational coefficient of each continuous attribute while considering the uniqueness of rough set theory. When choosing heuristic information, stability is combined with rough entropy. In terms of stability, the possibility of classifying objects belonging to certain sub-interval of a given attribute into neighbor sub-intervals is minimized. By doing this, rational discrete intervals can be determined. Rough entropy is employed to decide the optimal cut-points while guaranteeing the consistency of the decision table after discretization. Thought of this algorithm is elaborated through Iris data and then some experiments by comparing outcomes of four discritized datasets are also given, which are calculated by the proposed algorithm and four other typical algorithras for discritization respectively. After that, classification rules are deduced and summarized through rough set based classifiers. Results show that the proposed discretization algorithm is able to generate optimal classification accuracy while minimizing the number of discrete intervals. It displays superiority especially when dealing with a decision table having a large attribute number.展开更多
文摘It is being widely studied how to extract knowledge from a decision table based on rough set theory. The novel problem is how to discretize a decision table having continuous attribute. In order to obtain more reasonable discretization results, a discretization algorithm is proposed, which arranges half-global discretization based on the correlational coefficient of each continuous attribute while considering the uniqueness of rough set theory. When choosing heuristic information, stability is combined with rough entropy. In terms of stability, the possibility of classifying objects belonging to certain sub-interval of a given attribute into neighbor sub-intervals is minimized. By doing this, rational discrete intervals can be determined. Rough entropy is employed to decide the optimal cut-points while guaranteeing the consistency of the decision table after discretization. Thought of this algorithm is elaborated through Iris data and then some experiments by comparing outcomes of four discritized datasets are also given, which are calculated by the proposed algorithm and four other typical algorithras for discritization respectively. After that, classification rules are deduced and summarized through rough set based classifiers. Results show that the proposed discretization algorithm is able to generate optimal classification accuracy while minimizing the number of discrete intervals. It displays superiority especially when dealing with a decision table having a large attribute number.