Abstract
An excess of redundant attributes is a major factor degrading both the running efficiency and the accuracy of classification algorithms. To improve both, a fast attribute reduction algorithm based on an improved attribute importance measure for neighborhood rough sets is proposed. First, an improved KNN attribute importance measure is introduced. Second, the neighborhood rough set equipped with the improved measure ranks the condition attributes of the original data by importance, and the ranking is used to reduce the attributes, yielding a reduced feature subset. Finally, the reduced feature subset is fed into a classification model for prediction. Simulation experiments show that, compared with the neighborhood rough set attribute reduction algorithm before the improvement, the proposed method reduces running time while maintaining high prediction accuracy.
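The rank-then-reduce pipeline described in the abstract can be illustrated with a minimal sketch. The abstract does not define the improved KNN attribute importance, so the sketch below substitutes the standard neighborhood rough set dependency degree (the fraction of samples whose delta-neighborhood contains only samples of the same class) as the importance score. The function names, the radius `delta`, the cut-off `k`, and the toy data are all assumptions made for illustration and are not taken from the paper.

```python
from math import sqrt


def neighborhood_dependency(X, y, attrs, delta):
    """Fraction of samples whose delta-neighborhood, measured with Euclidean
    distance over the attributes in `attrs`, contains a single class only.
    This is the standard dependency degree, not the paper's improved measure."""
    n = len(X)
    consistent = 0
    for i in range(n):
        pure = True
        for j in range(n):
            dist = sqrt(sum((X[i][a] - X[j][a]) ** 2 for a in attrs))
            if dist <= delta and y[j] != y[i]:
                pure = False
                break
        if pure:
            consistent += 1
    return consistent / n


def rank_attributes(X, y, delta):
    """Score each attribute on its own and return indices sorted by score."""
    n_attrs = len(X[0])
    scores = [neighborhood_dependency(X, y, [a], delta) for a in range(n_attrs)]
    order = sorted(range(n_attrs), key=lambda a: scores[a], reverse=True)
    return order, scores


def reduce_attributes(X, order, k):
    """Keep the k highest-ranked attributes (hypothetical stopping rule)."""
    keep = sorted(order[:k])
    return [[row[a] for a in keep] for row in X], keep


# Toy data (made up): attribute 0 separates the classes, attribute 1 does not.
X = [[0.0, 0.5], [0.1, 0.1], [0.2, 0.9], [0.8, 0.4], [0.9, 0.2], [1.0, 0.8]]
y = [0, 0, 0, 1, 1, 1]

order, scores = rank_attributes(X, y, delta=0.15)
reduced, keep = reduce_attributes(X, order, k=1)
```

The reduced matrix `reduced` would then be passed to any classifier of choice. Scoring each attribute independently keeps the cost at one pass over all sample pairs per attribute, which is the kind of saving a ranking-based reduction aims for, at the price of ignoring interactions between attributes.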
Authors
周长顺
徐久成
瞿康林
申凯丽
章磊
ZHOU Changshun; XU Jiucheng; QU Kanglin; SHEN Kaili; ZHANG Lei (College of Computer and Information Engineering, Henan Normal University, Xinxiang 453007, China; Engineering Lab of Intelligence Business & Internet of Things, Xinxiang 453007, China)
Source
《西北大学学报(自然科学版)》
CAS
CSCD
PKU Core (Peking University Core Journals)
2022, No. 5, pp. 745-752 (8 pages)
Journal of Northwest University(Natural Science Edition)
Funding
National Natural Science Foundation of China (61976082, 62076089, 62002103).
Keywords
data processing
attribute importance
attribute reduction
uncertainty measurement
neighborhood rough set