Abstract
Attribute reduction is an important research topic in rough set theory and has been applied effectively in fields such as machine learning and data mining. Attribute reduction based on conditional information entropy effectively generalizes attribute reduction under the algebra view, but it has two drawbacks: it is sensitive to noise, and in some cases the resulting attribute subset contains many redundant attributes. To address this, we introduce the concept of approximate reduction based on conditional information entropy in decision tables and then propose an approximate reduction algorithm based on conditional information entropy (ARABCIE). The algorithm improves robustness to noise and allows redundant attributes to be kept or discarded according to the needs of the application. Finally, we test classifier performance on benchmark data using reduced attribute subsets selected at different precision levels.
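The abstract does not spell out the definitions, so the following is only a minimal sketch of the general idea, not the paper's ARABCIE algorithm. It assumes the Shannon conditional entropy H(D|B) of the decision attribute D given an attribute subset B, and it treats B as an approximate reduct when H(D|B) - H(D|C) is at most a precision threshold `eps`, where C is the full condition attribute set. The greedy forward selection, the function names, and the toy decision table are all hypothetical choices made for illustration.

```python
from collections import Counter, defaultdict
from math import log2


def conditional_entropy(rows, attrs, decision):
    """Shannon conditional entropy H(decision | attrs) of a decision table.

    rows is a list of dicts; attrs is a list of condition attribute names.
    """
    n = len(rows)
    # Group rows into equivalence classes induced by attrs and count decisions.
    blocks = defaultdict(Counter)
    for row in rows:
        key = tuple(row[a] for a in attrs)
        blocks[key][row[decision]] += 1
    h = 0.0
    for counts in blocks.values():
        size = sum(counts.values())
        for c in counts.values():
            # p(block, d) * log2 p(d | block)
            h -= (c / n) * log2(c / size)
    return h


def approximate_reduct(rows, cond_attrs, decision, eps=0.0):
    """Greedy forward selection (an assumed strategy, not from the paper).

    Adds attributes until H(D|B) - H(D|C) <= eps, where C = cond_attrs.
    """
    target = conditional_entropy(rows, cond_attrs, decision)
    selected, remaining = [], list(cond_attrs)
    tol = 1e-12  # guard against floating-point round-off
    while remaining and (
        conditional_entropy(rows, selected, decision) - target > eps + tol
    ):
        # Pick the attribute that lowers the conditional entropy the most.
        best = min(
            remaining,
            key=lambda a: conditional_entropy(rows, selected + [a], decision),
        )
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy decision table (hypothetical): d follows a except for one noisy row.
rows = [
    {"a": 0, "b": 0, "d": 0},
    {"a": 0, "b": 0, "d": 0},
    {"a": 0, "b": 1, "d": 0},
    {"a": 0, "b": 1, "d": 1},  # noisy object
    {"a": 1, "b": 0, "d": 1},
    {"a": 1, "b": 0, "d": 1},
    {"a": 1, "b": 1, "d": 1},
    {"a": 1, "b": 1, "d": 1},
]

exact = approximate_reduct(rows, ["a", "b"], "d", eps=0.0)
relaxed = approximate_reduct(rows, ["a", "b"], "d", eps=0.2)
print(exact, relaxed)
```

With `eps=0.0` the sketch keeps every attribute needed to match the entropy of the full attribute set, so the single noisy object forces `b` into the result. A looser threshold lets the noisy object be ignored and yields a smaller subset, which is the trade-off between precision and redundancy that the abstract describes.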
Source
Acta Electronica Sinica (《电子学报》)
EI
CAS
CSCD
Peking University Core Journals (北大核心)
2007, No. 11, pp. 2156-2160 (5 pages)
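Funding
National Natural Science Foundation of China (No. 40771163)
Natural Science Foundation of Jiangsu Province (No. BK2005135)
Natural Science Research Project of Jiangsu Higher Education Institutions (No. 05KJB520066)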
Keywords
rough set
attribute reduction
conditional information entropy
approximate reduction