Abstract
To address the characteristics of distributed databases and of item constraints, this paper proposes two effective algorithms for mining association rules with item constraints in a distributed environment: DMAIC, which is based on the Apriori algorithm, and DAMICFP, which is based on the frequent-pattern tree (FP-growth). Both algorithms are verified on a worked example and analyzed through testing, and their respective advantages, shortcomings, and suitable conditions are identified. The results show that DMAIC is highly reliable and uses a simple communication protocol, which makes it suitable for distributed databases with modest communication-performance requirements. DAMICFP executes efficiently and communicates well, which makes it suitable for multi-item distributed databases with high communication-performance requirements. Both algorithms effectively solve the problem of mining association rules with item constraints in a distributed setting.
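The abstract does not give the internals of DMAIC or DAMICFP, so the following is only a generic illustration of the problem setting, not either of the paper's algorithms. It sketches a level-wise (Apriori-style) search in which each site counts candidates locally and a coordinator sums the counts, followed by a simple item constraint ("the itemset must contain at least one required item"). The function names, the sample transactions, and the form of the constraint are all assumptions made for this example.

```python
from collections import Counter
from itertools import combinations


def local_counts(transactions, candidates):
    """At one site, count how many local transactions contain each candidate."""
    counts = Counter()
    for t in transactions:
        items = set(t)
        for c in candidates:
            if c <= items:
                counts[c] += 1
    return counts


def constrained_frequent_itemsets(sites, min_support, required_items):
    """Generic sketch (hypothetical, not DMAIC/DAMICFP): globally frequent
    itemsets over all sites that contain at least one required item."""
    total = sum(len(s) for s in sites)
    min_count = min_support * total
    # Level 1 candidates: every single item seen at any site.
    candidates = {frozenset([i]) for s in sites for t in s for i in t}
    frequent = {}
    k = 1
    while candidates:
        # Each site reports local counts; the coordinator sums them.
        global_counts = Counter()
        for s in sites:
            global_counts.update(local_counts(s, candidates))
        level = {c: n for c, n in global_counts.items() if n >= min_count}
        frequent.update(level)
        # Apriori join and prune: build size-(k+1) candidates whose
        # size-k subsets are all globally frequent.
        k += 1
        candidates = set()
        for a, b in combinations(list(level), 2):
            u = a | b
            if len(u) == k and all(
                frozenset(sub) in level for sub in combinations(u, k - 1)
            ):
                candidates.add(u)
    # Apply the item constraint last: "contains a required item" is not
    # anti-monotone, so it is unsafe to prune with it during the search.
    return {c: n for c, n in frequent.items() if c & required_items}


# Two hypothetical sites, each holding its own horizontal partition.
site_a = [["bread", "milk"], ["bread", "butter"], ["milk", "butter", "bread"]]
site_b = [["milk", "butter"], ["bread", "milk", "butter"], ["bread"]]

result = constrained_frequent_itemsets(
    [site_a, site_b], min_support=0.5, required_items=frozenset({"milk"})
)
for itemset, count in sorted(result.items(), key=lambda kv: sorted(kv[0])):
    print(sorted(itemset), count)
```

In this sketch only candidate counts cross site boundaries, never raw transactions, which is the usual motivation for count-based distributed mining. An FP-tree-based approach would replace the repeated candidate scans with a compact tree structure.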
Source
Journal of Central South University (Science and Technology)
EI
CAS
CSCD
Peking University Core Journal
2004, Issue 6, pp. 998-1003 (6 pages)
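Funding
Key Project of Science and Technology Research of the Ministry of Education ([2000]156)
National Science Fund for Distinguished Young Scholars (69928201)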
Keywords
data mining
distributed data mining
association rules with item constraints