Abstract
Many clustering algorithms are ill-suited to categorical data. To address this, a clustering algorithm based on the longest frequent closed itemset (LFCI) is presented. An adapted frequent-pattern tree is used to find the LFCI of each transaction. Because of two important properties of the LFCI, it can serve as the description of its transaction, so the clusters are derived from the LFCIs directly, without generating a large intermediate set of frequent itemsets. Experimental results demonstrate the effectiveness, feasibility, and robustness of the method.
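The idea of labelling each transaction with its LFCI can be illustrated with a minimal sketch. It is not the paper's implementation: it replaces the adapted frequent-pattern tree with a brute-force subset search (exponential in transaction length, so suitable for toy data only). It also assumes that the LFCI of a transaction is the largest frequent closed itemset contained in it, with ties broken by higher support and then lexicographic order. The function names, the tie-breaking rule, and the `min_support` parameter are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def support(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def closure(itemset, transactions):
    """Intersection of all transactions containing `itemset`."""
    covering = [t for t in transactions if itemset <= t]
    return frozenset.intersection(*covering) if covering else frozenset()


def longest_frequent_closed_itemset(transaction, transactions, min_support):
    """Brute-force LFCI of one transaction (assumed definition, see lead-in)."""
    items = sorted(transaction)
    # Try subsets from longest to shortest; stop at the first size that works.
    for size in range(len(items), 0, -1):
        candidates = []
        for combo in combinations(items, size):
            itemset = frozenset(combo)
            sup = support(itemset, transactions)
            # Frequent: support reaches the threshold.
            # Closed: the itemset equals its own closure.
            if sup >= min_support and closure(itemset, transactions) == itemset:
                candidates.append((sup, combo))
        if candidates:
            # Assumed tie-break: highest support, then lexicographic order.
            best = min(candidates, key=lambda c: (-c[0], c[1]))
            return frozenset(best[1])
    return frozenset()  # no frequent closed subset exists


def cluster_by_lfci(transactions, min_support):
    """Group transaction indices by their LFCI, used as the cluster label."""
    transactions = [frozenset(t) for t in transactions]
    clusters = defaultdict(list)
    for index, t in enumerate(transactions):
        label = longest_frequent_closed_itemset(t, transactions, min_support)
        clusters[label].append(index)
    return dict(clusters)


# Toy categorical data: each transaction is a set of attribute values.
data = [
    {"a", "b", "c"},
    {"a", "b", "d"},
    {"a", "b"},
    {"x", "y"},
    {"x", "y", "z"},
]
clusters = cluster_by_lfci(data, min_support=2)
```

On this toy input, transactions 0 to 2 share the label `{a, b}` and transactions 3 and 4 share `{x, y}`, so the clusters fall out of the labels with no separate distance computation.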
Source
Computer Engineering (《计算机工程》)
CAS
CSCD
PKU Core (北大核心)
2007, Issue 1, pp. 187-189, 192 (4 pages)
Keywords
Categorical data
Clustering algorithm
Closed itemsets
Frequent-pattern tree