Abstract
Traditional clustering algorithms are strongly affected by the spatial distribution of the data and are relatively inefficient. To address this, a clustering algorithm based on rough set theory is proposed. Building on the principle of consistency between condition attributes and decision attributes in the information table, attribute reduction and discretization are carried out using data hypercubes and information entropy. On this basis, the addition rule for set feature vectors allows the data objects to be partitioned into clusters with a single scan of the information table. Experimental results show that the algorithm is effective and feasible.
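The abstract does not give the paper's actual procedure, so the following is only a minimal sketch of two standard rough-set notions it relies on: checking whether a subset of condition attributes stays consistent with the decision attribute, and measuring that with conditional information entropy. The table `rows`, the attribute names `a`, `b`, `d`, and both function names are hypothetical and chosen purely for illustration.

```python
import math
from collections import Counter, defaultdict

# Hypothetical, already-discretized decision table: a, b are condition
# attributes and d is the decision attribute (illustrative data only).
rows = [
    {"a": 0, "b": 1, "d": "x"},
    {"a": 0, "b": 1, "d": "x"},
    {"a": 1, "b": 0, "d": "y"},
    {"a": 1, "b": 1, "d": "y"},
]

def is_consistent(rows, cond_attrs, dec_attr):
    """True if rows that agree on cond_attrs also agree on dec_attr.

    Uses a single pass over the table.
    """
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in cond_attrs)
        if seen.setdefault(key, row[dec_attr]) != row[dec_attr]:
            return False
    return True

def conditional_entropy(rows, cond_attrs, dec_attr):
    """Shannon entropy (in bits) of dec_attr given cond_attrs."""
    groups = defaultdict(Counter)
    for row in rows:
        key = tuple(row[a] for a in cond_attrs)
        groups[key][row[dec_attr]] += 1
    n = len(rows)
    h = 0.0
    for counts in groups.values():
        size = sum(counts.values())
        for c in counts.values():
            p = c / size
            h -= (size / n) * p * math.log2(p)
    return h

# Attribute "a" alone determines d, so "b" could be dropped in a reduction;
# attribute "b" alone does not determine d.
print(is_consistent(rows, ["a"], "d"))
print(is_consistent(rows, ["b"], "d"))
print(conditional_entropy(rows, ["b"], "d"))
```

In this sketch a subset of condition attributes is consistent exactly when the conditional entropy of the decision attribute given that subset is zero, which is one common way entropy is used to guide attribute reduction. Whether the paper uses this exact criterion is an assumption.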
Source
Computer Engineering (《计算机工程》)
Indexed in: CAS, CSCD, PKU Core (北大核心)
2007, No. 4, pp. 14-16 (3 pages)
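Funding
National Natural Science Foundation of China (70572070)
Postdoctoral Science Foundation project (2005038319)
Ministry of Education Chunhui Program (Z-1-15007)
Ministry of Education Doctoral Program Research Fund (20040147006)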
Keywords
Rough set
Clustering
Attribute reduction
Discretization