Abstract
To address the bottleneck of frequent pattern set generation in traditional algorithms, this paper applies both heuristic background knowledge and inductive background knowledge during frequent pattern generation, and proposes BasedBackground, an association rule mining algorithm based on background knowledge. The algorithm uses heuristic background knowledge to reduce the cost of counting patterns, and it uses inductive background knowledge obtained through sample mining to reduce I/O cost, thereby improving both the efficiency and the quality of mining. Finally, experiments on a stellar spectrum data set validate the effectiveness of the algorithm.
Source
《通讯和计算机(中英文版)》
2005, Issue 6, pp. 11-18, 40 (9 pages in total)
Journal of Communication and Computer
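Funding
This work was supported by the National High Technology Research and Development Program of China ("863" Program), grant No. 2003AA133060.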
Keywords
Data Mining
Association Rule
Background Knowledge
Sample Mining
Star Optical Spectrum Data