Funding: Supported by the National Natural Science Foundation of China (No. 50975193) and the Specialized Research Fund for the Doctoral Program of Higher Education of China (No. 20060056016)
Abstract: By analyzing the existing prefix-tree data structure, an improved pattern tree was introduced for processing new transactions. It first stored transactions in a lexicographically ordered tree and then restructured the tree by sorting each path in frequency-descending order. When updating the improved pattern tree, there was no need to rescan the entire new database or to build a new tree for incremental updating. A test was performed on the synthetic dataset T10I4D100K, with 100 000 transactions and 870 items. Experimental results show that the smaller the minimum support threshold, the greater the speed advantage of the improved pattern tree over CanTree for all datasets. As the minimum support threshold increased from 2% to 3.5%, the runtime of the improved pattern tree decreased from 452.71 s to 186.26 s, while the runtime required by CanTree decreased from 1 367.03 s to 432.19 s. When the database was updated, the execution time of the improved pattern tree consisted of the construction of the original improved pattern trees and the reconstruction of the initial tree. The experimental results showed that the runtime was about 15% shorter than that of CanTree. As the number of transactions increased, the runtime of the improved pattern tree was about 25% shorter than that of FP-tree. The improved pattern tree also required less memory than CanTree.
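The two-phase idea in this abstract (store transactions in lexicographic order, then reorder each path by descending item frequency) can be illustrated with a minimal sketch. This is not the authors' implementation: the names `Node`, `insert`, `item_frequencies`, `build_lexicographic_tree`, `paths` and `restructure` are hypothetical, and the restructuring step here simply re-inserts every stored path into a fresh tree, which is a simpler stand-in for the in-place path sorting the abstract describes.

```python
from collections import Counter


class Node:
    """One prefix-tree node: an item, a support count, and child nodes."""

    def __init__(self, item=None):
        self.item = item
        self.count = 0
        self.children = {}


def insert(root, items, count=1):
    """Insert an ordered item sequence, adding `count` to every node on its path."""
    node = root
    for item in items:
        node = node.children.setdefault(item, Node(item))
        node.count += count


def item_frequencies(transactions):
    """Count in how many transactions each item occurs."""
    return Counter(item for t in transactions for item in set(t))


def build_lexicographic_tree(transactions):
    """Phase 1: store each transaction with its items in lexicographic order."""
    root = Node()
    for t in transactions:
        insert(root, sorted(set(t)))
    return root


def paths(node, prefix=()):
    """Yield (item sequence, number of transactions ending at that node)."""
    for child in node.children.values():
        path = prefix + (child.item,)
        ending_here = child.count - sum(c.count for c in child.children.values())
        if ending_here > 0:
            yield path, ending_here
        yield from paths(child, path)


def restructure(root, freq):
    """Phase 2 (assumed variant): re-insert each path in frequency-descending order.

    Ties are broken lexicographically so the result is deterministic.
    """
    new_root = Node()
    for path, count in paths(root):
        ordered = sorted(path, key=lambda i: (-freq[i], i))
        insert(new_root, ordered, count)
    return new_root


# Toy data, invented for illustration only.
transactions = [["b", "a"], ["a", "c"], ["c", "b", "a"], ["c"]]
freq = item_frequencies(transactions)
lex_tree = build_lexicographic_tree(transactions)
pattern_tree = restructure(lex_tree, freq)
```

Because the lexicographic order does not depend on item frequencies, newly arriving transactions can be passed to `insert` on `lex_tree` without touching the paths already stored. In this sketch, only the restructuring step would need to be repeated after the frequencies change.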
Funding: National Natural Science Foundation of China (No. 60374071); National Basic Research Program of China (No. 2003CB316905)
Abstract: Ontology mapping is a critical problem in integrating heterogeneous information sources. It identifies the elements that correspond to each other. At present, there are many ontology mapping algorithms, but most of them are based on database schemas. After analyzing the similarities and differences between ontologies and schemas, we propose a parsing graph-based algorithm for ontology mapping. The ontology parsing graph (OP-graph) extends the general concept of a graph, encoding the logical relationships and semantic information that the ontology contains into the vertices and edges of the graph. Thus, the problem of ontology mapping is translated into the problem of finding the optimal match between two OP-graphs. With a universal measure defined for comparing the entities of two ontologies, we calculate the overall similarity between the two OP-graphs iteratively until the optimal match is found. The experimental results show that our algorithm is promising.
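The abstract does not give the similarity measure or the matching procedure, so the following is only a rough sketch of what iterative graph-based similarity can look like. Everything in it is an assumption made for illustration: the functions `label_sim`, `propagate` and `greedy_match`, the mixing weight `alpha`, the use of string similarity on labels, and the greedy one-to-one assignment (which is not guaranteed to be optimal). Each graph is a plain dictionary mapping a vertex label to the labels of its neighbours.

```python
from difflib import SequenceMatcher


def label_sim(a, b):
    """Assumed base measure: case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def propagate(g1, g2, alpha=0.5, rounds=20, tol=1e-6):
    """Iteratively mix label similarity with the similarity of neighbours."""
    base = {(u, v): label_sim(u, v) for u in g1 for v in g2}
    sim = dict(base)
    for _ in range(rounds):
        new = {}
        for u in g1:
            for v in g2:
                nu, nv = g1[u], g2[v]
                if nu and nv:
                    # Score each neighbour of u by its best counterpart among v's neighbours.
                    struct = sum(max(sim[(x, y)] for y in nv) for x in nu) / len(nu)
                else:
                    struct = 0.0
                new[(u, v)] = alpha * base[(u, v)] + (1 - alpha) * struct
        delta = max(abs(new[k] - sim[k]) for k in sim)
        sim = new
        if delta < tol:  # stop once the scores no longer change
            break
    return sim


def greedy_match(sim):
    """Pick pairs in descending similarity, using each vertex at most once."""
    matched, used1, used2 = {}, set(), set()
    for (u, v), _ in sorted(sim.items(), key=lambda kv: -kv[1]):
        if u not in used1 and v not in used2:
            matched[u] = v
            used1.add(u)
            used2.add(v)
    return matched


# Toy graphs, invented for illustration only.
g1 = {"Person": ["name", "address"], "name": [], "address": []}
g2 = {"Human": ["name", "addr"], "name": [], "addr": []}
sim = propagate(g1, g2)
match = greedy_match(sim)
```

In this toy case the labels "Person" and "Human" share almost no characters, so the pairing relies on their neighbours resembling each other. That is the kind of structural evidence a graph-based approach can use where label comparison alone is weak.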