Sparse large-scale multi-objective optimization problems(SLMOPs)are common in science and engineering.However,the large-scale problem represents the high dimensionality of the decision space,requiring algorithms to tr...Sparse large-scale multi-objective optimization problems(SLMOPs)are common in science and engineering.However,the large-scale problem represents the high dimensionality of the decision space,requiring algorithms to traverse vast expanse with limited computational resources.Furthermore,in the context of sparse,most variables in Pareto optimal solutions are zero,making it difficult for algorithms to identify non-zero variables efficiently.This paper is dedicated to addressing the challenges posed by SLMOPs.To start,we introduce innovative objective functions customized to mine maximum and minimum candidate sets.This substantial enhancement dramatically improves the efficacy of frequent pattern mining.In this way,selecting candidate sets is no longer based on the quantity of nonzero variables they contain but on a higher proportion of nonzero variables within specific dimensions.Additionally,we unveil a novel approach to association rule mining,which delves into the intricate relationships between non-zero variables.This novel methodology aids in identifying sparse distributions that can potentially expedite reductions in the objective function value.We extensively tested our algorithm across eight benchmark problems and four real-world SLMOPs.The results demonstrate that our approach achieves competitive solutions across various challenges.展开更多
Maximum frequent pattern generation from a large database of transactions and items for association rule mining is an important research topic in data mining. Association rule mining aims to discover interesting corre...Maximum frequent pattern generation from a large database of transactions and items for association rule mining is an important research topic in data mining. Association rule mining aims to discover interesting correlations, frequent patterns, associations, or causal structures between items hidden in a large database. By exploiting quantum computing, we propose an efficient quantum search algorithm design to discover the maximum frequent patterns. We modified Grover’s search algorithm so that a subspace of arbitrary symmetric states is used instead of the whole search space. We presented a novel quantum oracle design that employs a quantum counter to count the maximum frequent items and a quantum comparator to check with a minimum support threshold. The proposed derived algorithm increases the rate of the correct solutions since the search is only in a subspace. Furthermore, our algorithm significantly scales and optimizes the required number of qubits in design, which directly reflected positively on the performance. Our proposed design can accommodate more transactions and items and still have a good performance with a small number of qubits.展开更多
Mining frequent pattern in transaction database, time series databases, and many other kinds of databases have been studied popularly in data mining research. Most of the previous studies adopt Apriori like candidat...Mining frequent pattern in transaction database, time series databases, and many other kinds of databases have been studied popularly in data mining research. Most of the previous studies adopt Apriori like candidate set generation and test approach. However, candidate set generation is very costly. Han J. proposed a novel algorithm FP growth that could generate frequent pattern without candidate set. Based on the analysis of the algorithm FP growth, this paper proposes a concept of equivalent FP tree and proposes an improved algorithm, denoted as FP growth * , which is much faster in speed, and easy to realize. FP growth * adopts a modified structure of FP tree and header table, and only generates a header table in each recursive operation and projects the tree to the original FP tree. The two algorithms get the same frequent pattern set in the same transaction database, but the performance study on computer shows that the speed of the improved algorithm, FP growth * , is at least two times as fast as that of FP growth.展开更多
Because data warehouse is frequently changing, incremental data leads to old knowledge which is mined formerly unavailable. In order to maintain the discovered knowledge and patterns dynamically, this study presents a...Because data warehouse is frequently changing, incremental data leads to old knowledge which is mined formerly unavailable. In order to maintain the discovered knowledge and patterns dynamically, this study presents a novel algorithm updating for global frequent patterns-IPARUC. A rapid clustering method is introduced to divide database into n parts in IPARUC firstly, where the data are similar in the same part. Then, the nodes in the tree are adjusted dynamically in inserting process by "pruning and laying back" to keep the frequency descending order so that they can be shared to approaching optimization. Finally local frequent itemsets mined from each local dataset are merged into global frequent itemsets. The results of experimental study are very encouraging. It is obvious from experiment that IPARUC is more effective and efficient than other two contrastive methods. Furthermore, there is significant application potential to a prototype of Web log Analyzer in web usage mining that can help us to discover useful knowledge effectively, even help managers making decision.展开更多
Pattern discovery from time series is of fundamental importance. Most of the algorithms of pattern discovery in time series capture the values of time series based on some kinds of similarity measures. Affected by the...Pattern discovery from time series is of fundamental importance. Most of the algorithms of pattern discovery in time series capture the values of time series based on some kinds of similarity measures. Affected by the scale and baseline, value-based methods bring about problem when the objective is to capture the shape. Thus, a similarity measure based on shape, Sh measure, is originally proposed, andthe properties of this similarity and corresponding proofs are given. Then a time series shape pattern discovery algorithm based on Sh measure is put forward. The proposed algorithm is terminated in finite iteration with given computational and storage complexity. Finally the experiments on synthetic datasets and sunspot datasets demonstrate that the time series shape pattern algorithm is valid.展开更多
A kind of single linked lists named aggregative chain is introduced to the algorithm, thus improving the architecture of FP tree. The new FP tree is a one-way tree and only the pointers that point its parent at each n...A kind of single linked lists named aggregative chain is introduced to the algorithm, thus improving the architecture of FP tree. The new FP tree is a one-way tree and only the pointers that point its parent at each node are kept. Route information of different nodes in a same item are compressed into aggregative chains so that the frequent patterns will be produced in aggregative chains without generating node links and conditional pattern bases. An example of Web key words retrieval is given to analyze and verify the frequent pattern algorithm in this paper.展开更多
It is nontrivial to maintain such discovered frequent query patterns in real XML-DBMS because the transaction database of queries may allow frequent updates and such updates may not only invalidate some existing frequ...It is nontrivial to maintain such discovered frequent query patterns in real XML-DBMS because the transaction database of queries may allow frequent updates and such updates may not only invalidate some existing frequent query patterns but also generate some new frequent query patterns. In this paper, two incremental updating algorithms, FUX-QMiner and FUXQMiner, are proposed for efficient maintenance of discovered frequent query patterns and generation the new frequent query patterns when new XMI, queries are added into the database. Experimental results from our implementation show that the proposed algorithms have good performance. Key words XML - frequent query pattern - incremental algorithm - data mining CLC number TP 311 Foudation item: Supported by the Youthful Foundation for Scientific Research of University of Shanghai for Science and TechnologyBiography: PENG Dun-lu (1974-), male, Associate professor, Ph.D, research direction: data mining, Web service and its application, peerto-peer computing.展开更多
A new method was worked out to improve the precision of springback prediction in sheet metal forming by combining the finite element method (FEM) with the data mining (DM) technique. First the genetic algorithm (GA) w...A new method was worked out to improve the precision of springback prediction in sheet metal forming by combining the finite element method (FEM) with the data mining (DM) technique. First the genetic algorithm (GA) was adopted for recognizing the material parameters. Then according to the even design idea, the suitable calculation scheme was confirmed, and FEM was used for calculating the springback. The computation results were compared with experiment data, the difference between them was taken as source data, and a new pattern recognition method of DM called hierarchical optimal map recognition method (HOMR) is applied for summarizing the calculation regulation in FEM. At the end, the mathematics model of the springback simulation was established. Based on the model, the calculation errors of springback can be controlled within 10% compared with the experimental results.展开更多
基金support by the Open Project of Xiangjiang Laboratory(22XJ02003)the University Fundamental Research Fund(23-ZZCX-JDZ-28,ZK21-07)+5 种基金the National Science Fund for Outstanding Young Scholars(62122093)the National Natural Science Foundation of China(72071205)the Hunan Graduate Research Innovation Project(CX20230074)the Hunan Natural Science Foundation Regional Joint Project(2023JJ50490)the Science and Technology Project for Young and Middle-aged Talents of Hunan(2023TJZ03)the Science and Technology Innovation Program of Humnan Province(2023RC1002).
文摘Sparse large-scale multi-objective optimization problems(SLMOPs)are common in science and engineering.However,the large-scale problem represents the high dimensionality of the decision space,requiring algorithms to traverse vast expanse with limited computational resources.Furthermore,in the context of sparse,most variables in Pareto optimal solutions are zero,making it difficult for algorithms to identify non-zero variables efficiently.This paper is dedicated to addressing the challenges posed by SLMOPs.To start,we introduce innovative objective functions customized to mine maximum and minimum candidate sets.This substantial enhancement dramatically improves the efficacy of frequent pattern mining.In this way,selecting candidate sets is no longer based on the quantity of nonzero variables they contain but on a higher proportion of nonzero variables within specific dimensions.Additionally,we unveil a novel approach to association rule mining,which delves into the intricate relationships between non-zero variables.This novel methodology aids in identifying sparse distributions that can potentially expedite reductions in the objective function value.We extensively tested our algorithm across eight benchmark problems and four real-world SLMOPs.The results demonstrate that our approach achieves competitive solutions across various challenges.
文摘Maximum frequent pattern generation from a large database of transactions and items for association rule mining is an important research topic in data mining. Association rule mining aims to discover interesting correlations, frequent patterns, associations, or causal structures between items hidden in a large database. By exploiting quantum computing, we propose an efficient quantum search algorithm design to discover the maximum frequent patterns. We modified Grover’s search algorithm so that a subspace of arbitrary symmetric states is used instead of the whole search space. We presented a novel quantum oracle design that employs a quantum counter to count the maximum frequent items and a quantum comparator to check with a minimum support threshold. The proposed derived algorithm increases the rate of the correct solutions since the search is only in a subspace. Furthermore, our algorithm significantly scales and optimizes the required number of qubits in design, which directly reflected positively on the performance. Our proposed design can accommodate more transactions and items and still have a good performance with a small number of qubits.
基金theFundoftheNationalManagementBureauofTraditionalChineseMedicine(No .2 0 0 0 J P 5 4 )
文摘Mining frequent pattern in transaction database, time series databases, and many other kinds of databases have been studied popularly in data mining research. Most of the previous studies adopt Apriori like candidate set generation and test approach. However, candidate set generation is very costly. Han J. proposed a novel algorithm FP growth that could generate frequent pattern without candidate set. Based on the analysis of the algorithm FP growth, this paper proposes a concept of equivalent FP tree and proposes an improved algorithm, denoted as FP growth * , which is much faster in speed, and easy to realize. FP growth * adopts a modified structure of FP tree and header table, and only generates a header table in each recursive operation and projects the tree to the original FP tree. The two algorithms get the same frequent pattern set in the same transaction database, but the performance study on computer shows that the speed of the improved algorithm, FP growth * , is at least two times as fast as that of FP growth.
基金Supported by the National Natural Science Foundation of China(60472099)Ningbo Natural Science Foundation(2006A610017)
文摘Because data warehouse is frequently changing, incremental data leads to old knowledge which is mined formerly unavailable. In order to maintain the discovered knowledge and patterns dynamically, this study presents a novel algorithm updating for global frequent patterns-IPARUC. A rapid clustering method is introduced to divide database into n parts in IPARUC firstly, where the data are similar in the same part. Then, the nodes in the tree are adjusted dynamically in inserting process by "pruning and laying back" to keep the frequency descending order so that they can be shared to approaching optimization. Finally local frequent itemsets mined from each local dataset are merged into global frequent itemsets. The results of experimental study are very encouraging. It is obvious from experiment that IPARUC is more effective and efficient than other two contrastive methods. Furthermore, there is significant application potential to a prototype of Web log Analyzer in web usage mining that can help us to discover useful knowledge effectively, even help managers making decision.
文摘Pattern discovery from time series is of fundamental importance. Most of the algorithms of pattern discovery in time series capture the values of time series based on some kinds of similarity measures. Affected by the scale and baseline, value-based methods bring about problem when the objective is to capture the shape. Thus, a similarity measure based on shape, Sh measure, is originally proposed, andthe properties of this similarity and corresponding proofs are given. Then a time series shape pattern discovery algorithm based on Sh measure is put forward. The proposed algorithm is terminated in finite iteration with given computational and storage complexity. Finally the experiments on synthetic datasets and sunspot datasets demonstrate that the time series shape pattern algorithm is valid.
基金Supported by the Natural Science Foundation ofLiaoning Province (20042020)
文摘A kind of single linked lists named aggregative chain is introduced to the algorithm, thus improving the architecture of FP tree. The new FP tree is a one-way tree and only the pointers that point its parent at each node are kept. Route information of different nodes in a same item are compressed into aggregative chains so that the frequent patterns will be produced in aggregative chains without generating node links and conditional pattern bases. An example of Web key words retrieval is given to analyze and verify the frequent pattern algorithm in this paper.
文摘It is nontrivial to maintain such discovered frequent query patterns in real XML-DBMS because the transaction database of queries may allow frequent updates and such updates may not only invalidate some existing frequent query patterns but also generate some new frequent query patterns. In this paper, two incremental updating algorithms, FUX-QMiner and FUXQMiner, are proposed for efficient maintenance of discovered frequent query patterns and generation the new frequent query patterns when new XMI, queries are added into the database. Experimental results from our implementation show that the proposed algorithms have good performance. Key words XML - frequent query pattern - incremental algorithm - data mining CLC number TP 311 Foudation item: Supported by the Youthful Foundation for Scientific Research of University of Shanghai for Science and TechnologyBiography: PENG Dun-lu (1974-), male, Associate professor, Ph.D, research direction: data mining, Web service and its application, peerto-peer computing.
文摘A new method was worked out to improve the precision of springback prediction in sheet metal forming by combining the finite element method (FEM) with the data mining (DM) technique. First the genetic algorithm (GA) was adopted for recognizing the material parameters. Then according to the even design idea, the suitable calculation scheme was confirmed, and FEM was used for calculating the springback. The computation results were compared with experiment data, the difference between them was taken as source data, and a new pattern recognition method of DM called hierarchical optimal map recognition method (HOMR) is applied for summarizing the calculation regulation in FEM. At the end, the mathematics model of the springback simulation was established. Based on the model, the calculation errors of springback can be controlled within 10% compared with the experimental results.