Abstract
In practical applications, data are often incomplete and change over time. To address feature selection in dynamic incomplete data, an incremental feature selection method based on the tolerance rough set model and information entropy theory is proposed. First, update patterns for the conditional partition and the decision classification of the universe are established for the case where feature values in an incomplete information system are dynamically updated. Next, an incremental computing mechanism is analyzed for the incomplete tolerance information entropy, which serves as the evaluation criterion of feature importance. This mechanism is then integrated into the iterative calculation of feature importance during the heuristic search for an optimal feature subset, yielding an incremental feature selection algorithm for incomplete data whose feature values are dynamically updated. Finally, experiments on standard UCI datasets verify the effectiveness and efficiency of the proposed incremental algorithm in terms of classification accuracy, decision performance, and computational efficiency.
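As a rough illustration of the ingredients named in the abstract, the sketch below builds tolerance classes over data with missing values, scores feature subsets with a tolerance-based conditional entropy, and runs a greedy heuristic search. It is a from-scratch baseline, not the incremental algorithm of the paper: the abstract does not state the entropy formula, so the form used here is an assumption, and the `MISSING` marker and all function names are hypothetical.

```python
import math

MISSING = "*"  # assumed marker for a missing feature value


def tolerant(x, y, features):
    """Objects are tolerant if, on every chosen feature, values match or one is missing."""
    return all(x[a] == y[a] or x[a] == MISSING or y[a] == MISSING for a in features)


def tolerance_classes(data, features):
    """For each object i, the set of object indices tolerant with i on `features`."""
    n = len(data)
    return [{j for j in range(n) if tolerant(data[i], data[j], features)}
            for i in range(n)]


def conditional_entropy(data, labels, features):
    """Assumed form: H(D|B) = -(1/n) * sum_i log2(|T_B(x_i) ∩ [x_i]_D| / |T_B(x_i)|)."""
    n = len(data)
    total = 0.0
    for i, t in enumerate(tolerance_classes(data, features)):
        same = sum(1 for j in t if labels[j] == labels[i])  # always >= 1 (i itself)
        total -= math.log2(same / len(t))
    return total / n


def greedy_select(data, labels, eps=1e-12):
    """Add the feature that lowers the entropy most until it matches the full feature set."""
    all_feats = list(range(len(data[0])))
    target = conditional_entropy(data, labels, all_feats)
    selected = []
    while conditional_entropy(data, labels, selected) > target + eps:
        remaining = [a for a in all_feats if a not in selected]
        best = min(remaining,
                   key=lambda a: conditional_entropy(data, labels, selected + [a]))
        selected.append(best)
    return selected


# Toy incomplete decision table (made-up values, for illustration only).
data = [(1, 0, "*"), (1, 1, 0), (0, "*", 0), (0, 1, 1)]
labels = ["y", "y", "n", "n"]
print(greedy_select(data, labels))
```

Every call to `conditional_entropy` here recomputes all tolerance classes. That repeated recomputation after feature values change is the cost an incremental mechanism, as described in the abstract, aims to avoid.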
Authors
唐荣
罗川
曹潜
王思朝
TANG Rong; LUO Chuan; CAO Qian; WANG Sizhao (College of Computer Science, Sichuan University, Chengdu 610065, China)
Source
《智能系统学报》
CSCD
PKU Core (Peking University Core Journals)
2021, Issue 3, pp. 493-501 (9 pages)
CAAI Transactions on Intelligent Systems
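Funding
National Natural Science Foundation of China (62076171)
Applied Basic Research Program of the Science and Technology Department of Sichuan Province (2019YJ0084)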
Keywords
feature selection
dimensionality reduction
rough set
information entropy
incomplete data
missing values
heuristic search
incremental learning