Abstract
Frequent itemset mining is a key problem in data mining applications, but the sheer number of frequent itemsets hinders practical use. To reduce that number and make the results easier to apply, this paper proposes a concise representation model for frequent itemsets based on concept lattices, proves the feasibility of the model, and derives the range of the support error it introduces. Building on the model, it further proposes FEC, a concise representation algorithm based on fuzzy equivalence classes. Experimental results show that the method substantially reduces the number of frequent itemsets without introducing excessive support error, and that its support error is smaller than that of the Index-Meta algorithm. The fuzzy-equivalence-class representation model and the FEC algorithm therefore have considerable practical value.
Source
《计算机应用研究》
CSCD
Peking University Core Journals (北大核心)
2016, No. 7, pp. 1936-1940 (5 pages)
Application Research of Computers
Funding
National "863" Program of China (2012AA011005)
National Natural Science Foundation of China (61273292)
Keywords
data mining
fuzzy equivalence class
approximate closed set
frequent itemset
concise representation
association rules