In this paper, the problem of discovering association rules between items in a large database of sales transactions is discussed, and a novel algorithm, BitMatrix, is proposed. The proposed algorithm is fundamentally ...In this paper, the problem of discovering association rules between items in a large database of sales transactions is discussed, and a novel algorithm, BitMatrix, is proposed. The proposed algorithm is fundamentally different from the known algorithms Apriori and AprioriTid. Empirical evaluation shows that the algorithm outperforms the known ones for large databases. Scale-up experiments show that the algorithm scales linearly with the number of transactions.展开更多
This paper introduces a new algorithm of mining association rules. The algorithm RP counts the itemsets with different sizes in the same pass of scanning over the database by dividing the database into m partitions. ...This paper introduces a new algorithm of mining association rules. The algorithm RP counts the itemsets with different sizes in the same pass of scanning over the database by dividing the database into m partitions. The total number of passes over the database is only (k + 2m - 2)/m, where k is the longest size in the itemsets. It is much less than k.展开更多
基金This work was supported in part by the National '863' High-Tech Programme of China !(No.863-306-ZD06-2)
文摘In this paper, the problem of discovering association rules between items in a large database of sales transactions is discussed, and a novel algorithm, BitMatrix, is proposed. The proposed algorithm is fundamentally different from the known algorithms Apriori and AprioriTid. Empirical evaluation shows that the algorithm outperforms the known ones for large databases. Scale-up experiments show that the algorithm scales linearly with the number of transactions.
文摘This paper introduces a new algorithm of mining association rules. The algorithm RP counts the itemsets with different sizes in the same pass of scanning over the database by dividing the database into m partitions. The total number of passes over the database is only (k + 2m - 2)/m, where k is the longest size in the itemsets. It is much less than k.