Abstract
With the development of networks, large amounts of high-dimensional data are generated, but many statistical methods are difficult to apply directly to such data. Obtaining denoised, simplified low-dimensional data that retains the key information is therefore an urgent problem. Rough set theory provides a method for dimensionality reduction known as attribute reduction, whose goal is to obtain a minimal subset of attributes while keeping a chosen classification property of the original data unchanged. To date, many attribute reduction algorithms have been proposed for different classification properties, such as lower approximation preservation and distribution preservation. In an ordered decision system, the traditional lower-approximation reduction algorithm is based on discernibility matrices and has high computational complexity. To address this problem, this paper uses the dependency degree to design a backward greedy heuristic algorithm that computes a lower-approximation-preserving reduct. Experiments are conducted on six UCI data sets. The results show that the proposed algorithm obtains a correct lower-approximation reduct and outperforms the traditional discernibility matrix algorithm in time efficiency.
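To illustrate the general idea of a dependency-degree-driven backward greedy reduction, the following Python sketch may help. It is not the algorithm from the paper: for simplicity it uses the classical indiscernibility-based dependency degree rather than the dominance-based lower approximations of an ordered decision system, and all function and variable names are illustrative.

```python
# A minimal sketch of a backward greedy attribute-reduction loop driven by a
# dependency degree. This stand-in uses the classical indiscernibility-based
# dependency degree; the paper itself works with dominance-based lower
# approximations in an ordered decision system.

def dependency_degree(data, labels, attrs):
    """gamma(attrs): fraction of objects whose equivalence class with respect
    to attrs is label-consistent, i.e. the size of the positive region."""
    if not attrs:
        return 0.0
    groups = {}
    for row, y in zip(data, labels):
        key = tuple(row[a] for a in attrs)
        groups.setdefault(key, set()).add(y)
    consistent = sum(
        1 for row, y in zip(data, labels)
        if len(groups[tuple(row[a] for a in attrs)]) == 1
    )
    return consistent / len(data)

def backward_greedy_reduct(data, labels, all_attrs):
    """Start from the full attribute set and drop attributes whose removal
    does not decrease the dependency degree."""
    target = dependency_degree(data, labels, all_attrs)
    reduct = list(all_attrs)
    for a in list(all_attrs):          # try removing each attribute in turn
        trial = [b for b in reduct if b != a]
        if dependency_degree(data, labels, trial) >= target:
            reduct = trial             # removal is harmless, keep it out
    return reduct

# Tiny usage example with made-up data (4 objects, 3 attributes).
if __name__ == "__main__":
    data = [(1, 0, 2), (1, 1, 2), (0, 1, 1), (0, 0, 1)]
    labels = [1, 1, 0, 0]
    print(backward_greedy_reduct(data, labels, [0, 1, 2]))  # e.g. [2]
```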
Source
《计算机科学与应用》
2021, No. 1, pp. 113-120 (8 pages)
Computer Science and Application