Abstract
Classical rough set theory cannot directly handle a decision system that is both incomplete (some attribute values are missing) and mixed (it contains nominal as well as numerical attributes). This paper first defines such a system as an incomplete mixed decision system (IMDS). It then proposes a limited neighborhood relation and, based on it, an attribute reduction algorithm for IMDS. The algorithm uses conditional entropy as its heuristic factor, which remedies a weakness of the decision positive region as a heuristic: with the positive region, the first and most important attribute sometimes cannot be selected. Because the limited neighborhood relation handles nominal, numerical, and missing values directly, the algorithm requires neither completion or deletion of incomplete data nor discretization of numerical attributes, and so reduces the uncertainty that such preprocessing introduces. Experiments on incomplete mixed data sets from UCI show that the algorithm effectively removes redundant attributes while maintaining or improving classification ability. The effect of the threshold used in the limited neighborhood relation on the classification results is also discussed.
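The abstract does not give the formal definitions, so the following is only a minimal sketch of the general idea, under stated assumptions. It assumes that two objects are related when they share at least one attribute known for both and agree on every such attribute (equal nominal values, numerical values within a threshold `delta`), and that conditional entropy is averaged over each object's neighborhood. The paper's actual limited neighborhood relation and entropy formula may differ. All names (`related`, `conditional_entropy`, `reduce_attributes`, `MISSING`) and the toy data are hypothetical.

```python
import math

MISSING = None  # assumed marker for a missing attribute value


def related(x, y, attrs, numeric, delta):
    """Assumed relation: x and y agree on every attribute known for both,
    and at least one such attribute exists."""
    shared = 0
    for a in attrs:
        if x[a] is MISSING or y[a] is MISSING:
            continue  # a missing value gives no evidence either way
        shared += 1
        if a in numeric:
            if abs(x[a] - y[a]) > delta:  # numerical values must lie within delta
                return False
        elif x[a] != y[a]:  # nominal values must match exactly
            return False
    return shared > 0


def conditional_entropy(data, labels, attrs, numeric, delta):
    """Assumed neighborhood conditional entropy H(D | attrs), in bits."""
    n = len(data)
    total = 0.0
    for i in range(n):
        # Every object belongs to its own neighborhood.
        nb = [j for j in range(n)
              if j == i or related(data[i], data[j], attrs, numeric, delta)]
        same = sum(1 for j in nb if labels[j] == labels[i])
        total -= math.log2(same / len(nb))
    return total / n


def reduce_attributes(data, labels, attrs, numeric, delta, eps=1e-9):
    """Greedy forward selection: repeatedly add the attribute that yields the
    lowest entropy until the entropy of the full attribute set is reached."""
    target = conditional_entropy(data, labels, attrs, numeric, delta)
    chosen, remaining = [], list(attrs)
    current = math.inf
    while remaining and current > target + eps:
        best = min(remaining, key=lambda a: conditional_entropy(
            data, labels, chosen + [a], numeric, delta))
        chosen.append(best)
        remaining.remove(best)
        current = conditional_entropy(data, labels, chosen, numeric, delta)
    return chosen


# Made-up toy data: column 0 nominal, column 1 numerical, column 2 nominal.
data = [
    ["a", 0.10, "p"],
    ["a", 0.15, "q"],
    ["b", 0.90, "p"],
    ["b", MISSING, "q"],
    [MISSING, 0.85, "p"],
    ["a", 0.12, MISSING],
]
labels = [0, 0, 1, 1, 1, 0]
reduct = reduce_attributes(data, labels, [0, 1, 2], numeric={1}, delta=0.1)
print(reduct)
```

In this sketch the data are never imputed or discretized: missing values are skipped inside `related`, and numerical values are compared through `delta`, which plays the role of the threshold whose choice the abstract says affects classification.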
Source
《广西师范大学学报(自然科学版)》 (Journal of Guangxi Normal University: Natural Science Edition)
CAS
北大核心 (Peking University Core Journals)
2013, Issue 3, pp. 30-36 (7 pages)
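Funding
National Natural Science Foundation of China (60975032)
Shanxi Province Youth Science and Technology Research Fund (2009021017-4)
Shanxi Province Research Fund for Returned Overseas Scholars (2008-25)
Shanxi Province Research Fund for Returned Overseas Scholars (2013-033)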
Keywords
incomplete mixed decision system
limited neighborhood relation
conditional entropy
attribute reduction