Abstract
A new attribute reduction algorithm is proposed. First, taking global equivalence classes as the smallest unit of computation, the concept of the rough equivalence class (REC) is introduced, its properties are studied in depth, and computing the core and reducts over RECs is proved equivalent to computing them in the original decision system. Next, the intrinsic relationship between the three types of RECs and the positive region is analyzed, and an incremental, equivalence-preserving method for computing the positive region under bilateral deletion of 1-RECs and -1-RECs is designed. On this basis, a bidirectional pruning strategy and an incremental attribute partitioning algorithm with multiple hashing are designed, yielding an efficient and complete attribute reduction algorithm. Finally, the algorithm is validated from several perspectives on three kinds of data: 20 UCI decision sets, massive data sets, and ultra-high-dimensional data sets. The results show that in the vast majority of cases the proposed algorithm outperforms existing algorithms in completeness and efficiency, and that it is particularly well suited to massive and ultra-high-dimensional data sets.
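For readers unfamiliar with the quantities the abstract refers to, the following is a minimal sketch of the standard rough-set positive region computed with a single hash pass. It is not the REC-based algorithm of the paper; the function names `partition` and `positive_region` and the toy decision table are assumptions made for illustration only.

```python
from collections import defaultdict


def partition(rows, attrs):
    """Group row indices into equivalence classes on `attrs` (one hash pass)."""
    blocks = defaultdict(list)
    for i, row in enumerate(rows):
        # The tuple of selected attribute values is the hash key.
        blocks[tuple(row[a] for a in attrs)].append(i)
    return list(blocks.values())


def positive_region(rows, decisions, attrs):
    """Indices of rows whose equivalence class on `attrs` has one decision value."""
    pos = []
    for block in partition(rows, attrs):
        if len({decisions[i] for i in block}) == 1:
            pos.extend(block)
    return sorted(pos)


# Hypothetical toy decision table: two condition attributes, one decision.
rows = [(0, 0), (0, 0), (0, 1), (1, 1), (1, 1)]
decisions = ["y", "n", "y", "n", "n"]

full = positive_region(rows, decisions, [0, 1])
only_a0 = positive_region(rows, decisions, [0])
only_a1 = positive_region(rows, decisions, [1])
```

In this toy table, dropping either attribute shrinks the positive region, so under the usual positive-region criterion both attributes would belong to the core. A reduction algorithm searches for a minimal attribute subset that keeps the positive region unchanged.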
Source
Control and Decision (《控制与决策》)
EI
CSCD
PKU Core (北大核心)
2016, No. 11, pp. 1921-1935 (15 pages)
Funding
National Natural Science Foundation of China (71401045)
Ministry of Education Humanities and Social Sciences Foundation (12YJCZH129)
Keywords
rough set attribute reduction
rough equivalence class
hash
bilateral pruning