Abstract
To address the re-mining of frequent itemsets in association-rule data mining, an improved algorithm (UMSA) is proposed for re-mining when the minimum support changes while the transaction database remains unchanged. The algorithm makes full use of the properties of frequent itemsets: a new joining method reduces the number of candidate itemsets generated, and, while the transaction database is scanned to determine the frequent k-itemsets, useless transactions are removed so that the database shrinks steadily. The algorithm overcomes the missed itemsets seen in some algorithms and, to a certain extent, resolves the non-determinism problem. An example illustrates the execution of the algorithm and its correctness and effectiveness, and its performance is analyzed.
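The ideas named in the abstract can be illustrated with a minimal sketch. This is not the paper's UMSA, whose joining method and update rules are not given here. It is a generic level-wise (Apriori-style) miner with transaction reduction, plus a simple re-mining rule. The function names `mine_frequent` and `remine`, the use of absolute support counts, and the sample transactions are all assumptions made for illustration.

```python
from itertools import combinations


def mine_frequent(transactions, min_count):
    """Level-wise frequent itemset mining with transaction reduction.

    Generic Apriori-style sketch (NOT the paper's UMSA).
    Returns a dict mapping frozenset itemsets to their support counts.
    """
    db = [frozenset(t) for t in transactions]

    # Level 1: count single items.
    counts = {}
    for t in db:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_count}
    result = dict(frequent)

    k = 2
    while frequent:
        prev = set(frequent)
        # Join step: merge frequent (k-1)-itemsets into k-itemsets and keep
        # only those whose (k-1)-subsets are all frequent.
        candidates = set()
        for a in prev:
            for b in prev:
                u = a | b
                if len(u) == k and all(
                    frozenset(s) in prev for s in combinations(u, k - 1)
                ):
                    candidates.add(u)

        # Scan step: count candidates and drop transactions that contain no
        # candidate, since they cannot support any larger frequent itemset.
        counts = dict.fromkeys(candidates, 0)
        reduced = []
        for t in db:
            if len(t) < k:
                continue
            hit = False
            for c in candidates:
                if c <= t:
                    counts[c] += 1
                    hit = True
            if hit:
                reduced.append(t)
        db = reduced

        frequent = {s: c for s, c in counts.items() if c >= min_count}
        result.update(frequent)
        k += 1
    return result


def remine(transactions, old_result, old_min_count, new_min_count):
    """Re-mine after the minimum support changes (database unchanged).

    Assumed rule for this sketch: if the threshold rises, the stored counts
    are filtered without rescanning; if it falls, mining is simply rerun.
    """
    if new_min_count >= old_min_count:
        return {s: c for s, c in old_result.items() if c >= new_min_count}
    return mine_frequent(transactions, new_min_count)


# Hypothetical sample data, not taken from the paper.
transactions = [{"A", "C", "D"}, {"B", "C", "E"}, {"A", "B", "C", "E"}, {"B", "E"}]
first = mine_frequent(transactions, 2)
raised = remine(transactions, first, 2, 3)
```

When the threshold rises, every itemset frequent under the new threshold was already frequent under the old one, so filtering the stored counts is enough. When it falls, new itemsets can appear; this sketch falls back to a full rerun, which is the case an incremental method such as the one described above aims to handle more cheaply.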
Source
《系统工程与电子技术》
EI
CSCD
PKU Core (Peking University core journal list)
2004, Issue 11, pp. 1701-1704 (4 pages)
Systems Engineering and Electronics
Keywords
association rules
re-mining (mining again)
frequent itemsets