Abstract
This paper studies three problems in preprocessing incomplete information tables: filling in incomplete data, reducing redundant attributes, and discretizing continuous attributes. Using rough set theory, an addition operation on partition intervals is defined from the consistent correspondence between condition attributes and decision attributes in a consistent information table, and it is used to fill in the incomplete data. Based on the concept of classification, a discernibility vector is defined, and redundant attributes are deleted by discernibility vector addition. Continuous attributes are discretized according to the dependency between condition attributes and decision attributes and the concept of relative information entropy. A numerical example and experimental results indicate that the method is effective and feasible.
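As a rough illustration of entropy-guided discretization, the sketch below picks one cut point for a continuous condition attribute by minimising the weighted Shannon entropy of the decision attribute on the two sides of the cut. This is a generic textbook technique, not the relative-information-entropy procedure of the paper, whose details the abstract does not give. The function names `entropy` and `best_cut` and the sample data are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of decision labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def best_cut(values, decisions):
    """Return (cut, weighted_entropy) for the best binary split, or None.

    Assumption: candidate cuts are midpoints between adjacent distinct
    values of the continuous attribute (a common convention, not taken
    from the paper).
    """
    # Sort by attribute value only, so decision labels never get compared.
    pairs = sorted(zip(values, decisions), key=lambda p: p[0])
    best = None
    for i in range(1, len(pairs)):
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue  # no boundary between equal values
        cut = (lo + hi) / 2
        left = [d for _, d in pairs[:i]]
        right = [d for _, d in pairs[i:]]
        # Entropy of the decision attribute, weighted by side size.
        score = (len(left) * entropy(left) + len(right) * entropy(right)) / len(pairs)
        if best is None or score < best[1]:
            best = (cut, score)
    return best


if __name__ == "__main__":
    # Hypothetical toy column: low values map to "no", high values to "yes".
    print(best_cut([1.0, 2.0, 3.0, 8.0, 9.0, 10.0],
                   ["no", "no", "no", "yes", "yes", "yes"]))
```

On the toy column the chosen cut is 5.5, where each side holds a single decision class. A fuller discretizer would apply the split recursively with a stopping rule.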
Source
《北京科技大学学报》
EI
CAS
CSCD
Peking University Core Journals
2006, No. 9, pp. 902-906 (5 pages)
Journal of University of Science and Technology Beijing
Funding
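National Natural Science Foundation of China (No. 70271068)
Postdoctoral Science Foundation (2005038319)
Chunhui Project of the Ministry of Education (Z-1-15007)
Doctoral Program Research Fund of the Ministry of Education (20040147006)
Science and Technology Key Project (2005219005)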
Keywords
incomplete information table
rough set
information entropy
attribute reduction
discretization