4 articles found
Improved Pattern Tree for Incremental Frequent-Pattern Mining (cited by: 1)
1
Authors: ZHOU Ming (周明), WANG Taiyong (王太勇). Transactions of Tianjin University (EI, CAS), 2010, No. 2, pp. 129-134 (6 pages)
By analyzing the existing prefix-tree data structure, an improved pattern tree was introduced for processing new transactions. It first stored transactions in a lexicographically ordered tree and then restructured the tree by sorting each path in frequency-descending order. While updating the improved pattern tree, there was no need to rescan the entire new database or to reconstruct a new tree for incremental updating. A test was performed on the synthetic dataset T10I4D100K, with 100 000 transactions and 870 items. Experimental results show that, for all datasets, the smaller the minimum support threshold, the larger the speed advantage of the improved pattern tree over CanTree. As the minimum support threshold increased from 2% to 3.5%, the runtime of the improved pattern tree decreased from 452.71 s to 186.26 s, while the runtime required by CanTree decreased from 1 367.03 s to 432.19 s. When the database was updated, the execution time of the improved pattern tree consisted of the construction of the original improved pattern trees and the reconstruction of the initial tree. The experimental results showed that the runtime was about 15% shorter than that of CanTree. As the number of transactions increased, the runtime of the improved pattern tree was about 25% shorter than that of FP-tree. The improved pattern tree also required less memory than CanTree.
Keywords: data mining; association rules; improved pattern tree; incremental mining
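The two-phase construction described in the abstract above (store transactions in a lexicographic-order tree, then restructure by sorting each path in frequency-descending order) can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the names `Node`, `insert`, `paths`, `build_lexicographic`, and `restructure` are hypothetical, and ties in frequency are broken alphabetically as an assumption.

```python
from collections import Counter


class Node:
    """One prefix-tree node: an item, a count, and children keyed by item."""

    def __init__(self, item=None):
        self.item = item
        self.count = 0
        self.children = {}


def insert(root, items, count=1):
    """Add one path to the tree, sharing any prefix that already exists."""
    node = root
    for item in items:
        node = node.children.setdefault(item, Node(item))
        node.count += count


def paths(root):
    """Yield (items, count) for the transactions that end at each node."""
    stack = [(root, [])]
    while stack:
        node, prefix = stack.pop()
        ending_here = node.count - sum(c.count for c in node.children.values())
        if prefix and ending_here > 0:
            yield prefix, ending_here
        for child in node.children.values():
            stack.append((child, prefix + [child.item]))


def build_lexicographic(transactions):
    """Phase 1: insert each transaction with its items in lexicographic order."""
    root = Node()
    for t in transactions:
        insert(root, sorted(set(t)))
    return root


def restructure(root):
    """Phase 2: rebuild the tree with every path in frequency-descending order.

    Assumption: ties are broken alphabetically so the result is deterministic.
    """
    extracted = list(paths(root))
    freq = Counter()
    for items, count in extracted:
        for item in items:
            freq[item] += count
    new_root = Node()
    for items, count in extracted:
        ordered = sorted(items, key=lambda i: (-freq[i], i))
        insert(new_root, ordered, count)
    return new_root, freq
```

Because the first tree uses a fixed item order, a new transaction can be added with a single `insert(root, sorted(set(t)))` call, without rescanning earlier data; only the restructuring step depends on the current frequencies.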
Incremental Mining of the Schema of Semistructured Data
2
Authors: ZHOU Aoying (周傲英), JIN Wen (金文), ZHOU Shuigeng (周水庚), QIAN Weining (钱卫宁), TIAN Zengping (田增平). Journal of Computer Science & Technology (SCIE, EI, CSCD), 2000, No. 3, pp. 241-248 (8 pages)
Semistructured data are specified without any fixed and rigid schema, even though some implicit structure typically appears in the data. The huge number of on-line applications makes it important and imperative to mine the schema of semistructured data, both for users (e.g., to gather useful information and facilitate querying) and for systems (e.g., to optimize access). The critical problem is to discover the hidden structure in the semistructured data. Current methods for extracting Web data structure are either general and independent of application background, or bound to some concrete environment such as HTML or XML. Both, however, are costly and have difficulty keeping up with the frequent and complicated changes of Web data. In this paper, the problem of incrementally mining the schema of semistructured data after the raw data are updated is discussed. An algorithm for incrementally mining the schema of semistructured data is provided, and some experimental results are given, which show that incremental mining for semistructured data is more efficient than non-incremental mining.
Keywords: data mining; incremental mining; semistructured data; schema; algorithm
Incremental Web Usage Mining Based on Active Ant Colony Clustering
3
Authors: SHEN Jie, LIN Ying, CHEN Zhimin. Wuhan University Journal of Natural Sciences (CAS), 2006, No. 5, pp. 1081-1085 (5 pages)
To alleviate the scalability problem caused by increasing Web usage and changing user interests, this paper presents a novel Web usage mining algorithm: an Incremental Web Usage Mining algorithm based on Active Ant Colony Clustering. First, an active movement strategy for direction selection and speed, different from the positive strategy employed by other Ant Colony Clustering algorithms, is proposed to construct an Active Ant Colony Clustering algorithm, which avoids idle moves and the "flying over the plane" phenomenon and effectively improves the quality and speed of clustering on large datasets. Then a mechanism for decomposing clusters based on the above methods is introduced to form new clusters when users' interests change. Empirical studies on a real Web dataset show that the active ant colony clustering algorithm performs better than previous algorithms, and that the incremental approach based on the proposed mechanism can efficiently implement incremental Web usage mining.
Keywords: Web usage mining; ant colony clustering; incremental mining
Efficient Incremental Maintenance of Frequent Patterns with FP-Tree (cited by: 9)
4
Authors: Xiu-Li Ma, Yun-Hai Tong, Shi-Wei Tang, Dong-Qing Yang. Journal of Computer Science & Technology (SCIE, EI, CSCD), 2004, No. 6, pp. 876-884 (9 pages)
Mining frequent patterns has been widely studied in the data mining area. However, little work has been done on mining patterns when the database constantly receives an influx of fresh data. In these dynamic scenarios, efficient maintenance of the discovered patterns is crucial. Most existing methods need to scan the entire database repeatedly, which is an obvious disadvantage. In this paper, an efficient incremental mining algorithm, Incremental-Mining (IM), is proposed for maintaining the frequent patterns when incremental data arrive. Based on the frequent pattern tree (FP-tree) structure, IM makes the most of the results of the previous mining process and requires scanning the original data at most once. Furthermore, IM can directly identify the differential set of frequent patterns, which may be more informative to users. Moreover, IM can deal with changing thresholds as well as changing data, thus providing a full maintenance scheme. IM has been implemented, and the performance study shows that it outperforms three other incremental approaches: FUP, DB-tree, and re-running frequent pattern growth (FP-growth).
Supported by the National Basic Research 973 Program of China under Grant No. G1999032705.
Xiu-Li Ma received the Ph.D. degree in computer science from Peking University in 2003. She is currently a postdoctoral researcher at the National Lab on Machine Perception of Peking University. Her main research interests include data warehousing, data mining, intelligent online analysis, and sensor networks.
Yun-Hai Tong received the Ph.D. degree in computer software from Peking University in 2002. He is currently an assistant professor at the School of Electronics Engineering and Computer Science of Peking University. His research interests include data warehousing, online analysis processing, and data mining.
Shi-Wei Tang received the B.S. degree in mathematics from Peking University in 1964. He is now a professor and Ph.D. supervisor at the School of Electronics Engineering and Computer Science of Peking University. His research interests include DBMS, information integration, data warehousing, OLAP, data mining, and database technology in specific application fields. He is the vice chair of the Database Society of China Computer Federation.
Dong-Qing Yang received the B.S. degree in mathematics from Peking University in 1969. She is now a professor and Ph.D. supervisor at the School of Electronics Engineering and Computer Science of Peking University. Her research interests include database design methodology, database system implementation techniques, data warehousing and data mining, and information integration and sharing in Web environments. She is a member of the academic committee of the Database Society of China Computer Federation.
Keywords: data mining; association rule mining; frequent pattern mining; incremental mining
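To make the "differential set of frequent patterns" mentioned in the abstract above concrete, the following sketch shows what such a set contains: the patterns that become frequent, and those that stop being frequent, once incremental data are added. It is deliberately a brute-force re-mining baseline, not the IM algorithm, whose FP-tree-based procedure is not described in this listing. The function names `frequent_itemsets` and `differential_set` are hypothetical, and support is assumed to be a ratio of the current number of transactions.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Brute force: count every itemset and keep those meeting the support ratio.

    Exponential in transaction length, so only suitable for tiny examples.
    """
    counts = Counter()
    for t in transactions:
        items = sorted(set(t))
        for size in range(1, len(items) + 1):
            for combo in combinations(items, size):
                counts[frozenset(combo)] += 1
    threshold = min_support * len(transactions)
    return {itemset for itemset, c in counts.items() if c >= threshold}


def differential_set(original, increment, min_support):
    """Return (emerged, vanished) frequent itemsets after adding the increment."""
    before = frequent_itemsets(original, min_support)
    after = frequent_itemsets(original + increment, min_support)
    return after - before, before - after
```

An incremental algorithm aims to report the same two sets without re-counting the original data from scratch, which is the repeated scanning the abstract identifies as the weakness of most existing methods.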