This paper aims to mine knowledge and rules on drug compatibility from prescriptions for treating arrhythmia in a traditional Chinese medicine database using the Apriori algorithm. For data preparation, 1,113 prescriptions for arrhythmia, comprising 535 herbs (10,884 herb occurrences in total), were collected into the database. The prescription data were preprocessed through redundancy reduction, normalized storage, and knowledge induction according to the pretreatment demands of data mining. The Apriori algorithm was then used to analyze the data and form the related technical rules and treatment procedures. The experimental results show that the prescription compatibility obtained by the Apriori algorithm generally accords with the basic principles of traditional Chinese medicine for arrhythmia. Some previously unreported compatibilities were also discovered in the experiment, which may serve as a basis for developing new prescriptions for arrhythmia.
Abstract: The Apriori algorithm is a classical method for association rule mining. Based on an analysis of this theory, the paper provides an improved Apriori algorithm. The proposed algorithm combines a hash table technique with reduction of candidate itemsets to improve the efficiency of resource usage as well as the individualized service of the data library.
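The abstract does not give the details of the improved algorithm, so the following is only a minimal sketch of the general idea it names: Apriori with a hash table used to reduce the candidate 2-itemsets. The function name `frequent_itemsets`, the bucket count `n_buckets`, and the character-sum hash are illustrative assumptions, not the paper's design.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support, n_buckets=7):
    """Return {itemset: support count} for all frequent itemsets (items are strings)."""
    transactions = [frozenset(t) for t in transactions]

    def bucket(pair):
        # Deterministic toy hash (assumption): sum of character codes.
        return sum(ord(ch) for item in pair for ch in item) % n_buckets

    # Pass 1: count single items and hash every pair into a bucket.
    item_counts = Counter()
    buckets = [0] * n_buckets
    for t in transactions:
        item_counts.update(t)
        for pair in combinations(sorted(t), 2):
            buckets[bucket(pair)] += 1

    frequent = {frozenset([i]): c for i, c in item_counts.items() if c >= min_support}
    result = dict(frequent)
    k = 2
    while frequent:
        items = sorted({i for s in frequent for i in s})
        candidates = set()
        for combo in combinations(items, k):
            # Apriori pruning: every (k-1)-subset must already be frequent.
            if any(frozenset(sub) not in frequent for sub in combinations(combo, k - 1)):
                continue
            # Hash pruning: a pair in a light bucket cannot be frequent.
            if k == 2 and buckets[bucket(combo)] < min_support:
                continue
            candidates.add(frozenset(combo))
        # Count the surviving candidates against the transactions.
        counts = Counter()
        for t in transactions:
            for cand in candidates:
                if cand <= t:
                    counts[cand] += 1
        frequent = {s: c for s, c in counts.items() if c >= min_support}
        result.update(frequent)
        k += 1
    return result
```

The bucket filter is conservative: a bucket count is an upper bound on the support of every pair hashed into it, so the filter can only discard pairs that could not be frequent anyway.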
Abstract: Data mining techniques offer great opportunities for developing ethics lines, whose main aim is to ensure improvements in and compliance with the values, conduct and commitments making up the code of ethics. The aim of this study is to suggest a process for exploiting the data generated and collected from an ethics line by extracting association rules with the Apriori algorithm. This makes it possible to identify anomalies and behaviour patterns requiring action to review, correct, promote or expand them, as appropriate.
Funding: National Ethnic Affairs Commission Nature Science Foundation of China (PMZY06004); the Education Science Foundation of Guangxi (2006A-E004)
Abstract: A distributed genetic algorithm can be combined with an adaptive genetic algorithm to mine interesting and comprehensible classification rules. The paper gives the encoding method for the rules and the fitness function, and designs the selection, crossover, mutation and migration operators for the DAGA.
Abstract: This paper describes an inductive method with genetic search that learns attribute-based phrase rules of natural language from a set of preclassified examples. Every example is described by a set of attributes and values. The algorithm takes an example as a seed, generalizes it by a genetic process, and makes it cover as many examples as possible. Genetic operators are applied to a population to perform a probabilistic parallel search in rule space, which greatly reduces the possible rule search space compared with many other inductive methods. The paper first describes attributes, words, the dictionary and rules, then describes the learning algorithm and the genetic search process, and finally gives a method for computing the quality of a rule, C(r).
Funding: Supported by the National Key Basic Research Development Program of China (Grant No. 2002CCA00700)
Abstract: Power generation dispatching is a large, complex system problem with multi-dimensional and nonlinear characteristics. A mathematical model was established based on the principle of reservoir operation. A large number of optimal scheduling processes were obtained by calculating the daily runoff process within three typical years, and a large number of simulated daily runoff processes were obtained using the progressive optimality algorithm (POA) in combination with the genetic algorithm (GA). After analyzing the optimal scheduling processes, the corresponding scheduling rules were determined and practical formulas were obtained. These rules can make full use of the rolling runoff forecast and carry out rolling scheduling. Compared with the optimized results, the maximum relative difference in the annual power generation obtained by the scheduling rules is no more than 1%. The effectiveness and practical applicability of the scheduling rules are demonstrated by a case study. This study provides a new perspective for formulating the rules of power generation dispatching.
Abstract: The purpose of this paper is to introduce a new pivot rule for the simplex algorithm. The simplex algorithm, first presented by George B. Dantzig, is a widely used method for solving linear programming (LP) problems. One of the important steps of the simplex algorithm is applying an appropriate pivot rule to select the basis-entering variable; Dantzig’s rule selects the variable corresponding to the maximum reduced cost. Unfortunately, this pivot rule not only can lead to cycling (which can be resolved by Bland’s rule), but also does not improve the objective function efficiently. Our new pivot rule 1) solves the cycling problem in Dantzig’s original simplex pivot rule, and 2) leads to an optimal improvement of the objective function at each iteration. The new pivot rule can reach the optimal solution of an LP in fewer iterations. In a maximization problem, Dantzig’s pivot rule selects the basis-entering variable corresponding to the most positive reduced cost; it is well known that in some problems Dantzig’s pivot rule may visit a large number of extreme points before reaching the optimal solution. Our goal is to improve the simplex algorithm so that the number of extreme points visited is reduced; we propose an optimal improvement in the objective value per unit step of the basis-entering variable. The idea is to obtain the maximum improvement in the objective function value: from the set of candidate basis-entering variables with positive reduced cost, the efficient basis-entering variable is the one that corresponds to an optimal improvement of the objective function. Using computational complexity arguments and some examples, we prove that our optimal pivot rule is very effective and solves the cycling problem in LP. We test and compare the efficiency of this new pivot rule with Dantzig’s original pivot rule and the simplex algorithm in the MATLAB environment.
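To make the contrast between entering-variable choices concrete, the sketch below compares Dantzig’s rule with the textbook greatest-improvement rule (reduced cost multiplied by the ratio-test step length). The abstract does not specify the paper's rule precisely, so this is an illustration of the general idea only, not the authors' method; the function names and the tiny example data are assumptions, and no anti-cycling safeguard is included.

```python
def ratio_step(column, rhs):
    """Largest step allowed by the minimum-ratio test for one entering column."""
    ratios = [b / a for a, b in zip(column, rhs) if a > 0]
    return min(ratios) if ratios else float("inf")


def dantzig_entering(reduced_costs):
    """Dantzig's rule: pick the most positive reduced cost (maximization)."""
    best = max(range(len(reduced_costs)), key=lambda j: reduced_costs[j])
    return best if reduced_costs[best] > 0 else None


def greatest_improvement_entering(reduced_costs, columns, rhs):
    """Pick the column whose pivot increases the objective the most."""
    best, best_gain = None, 0.0
    for j, cost in enumerate(reduced_costs):
        if cost <= 0:
            continue  # this variable cannot improve the objective
        gain = cost * ratio_step(columns[j], rhs)
        if gain > best_gain:
            best, best_gain = j, gain
    return best


# Hypothetical example: maximize 3*x1 + 2*x2 subject to x1 <= 1, x2 <= 4,
# starting from the slack basis.
reduced_costs = [3.0, 2.0]
columns = [[1.0, 0.0], [0.0, 1.0]]  # constraint column of each variable
rhs = [1.0, 4.0]
```

In this example Dantzig’s rule selects `x1` (objective gain 3 × 1 = 3), while the greatest-improvement rule selects `x2` (gain 2 × 4 = 8). The price is one ratio test per candidate column at every iteration.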
Abstract: Maximum frequent pattern generation from a large database of transactions and items for association rule mining is an important research topic in data mining. Association rule mining aims to discover interesting correlations, frequent patterns, associations, or causal structures between items hidden in a large database. By exploiting quantum computing, we propose an efficient quantum search algorithm design to discover the maximum frequent patterns. We modify Grover’s search algorithm so that a subspace of arbitrary symmetric states is used instead of the whole search space. We present a novel quantum oracle design that employs a quantum counter to count the maximum frequent items and a quantum comparator to check the count against a minimum support threshold. The proposed algorithm increases the rate of correct solutions since the search is restricted to a subspace. Furthermore, our algorithm scales well and optimizes the number of qubits required by the design, which directly benefits performance. The proposed design can accommodate more transactions and items and still perform well with a small number of qubits.
Funding: Supported by the National Natural Science Foundation of China (Grant No. 51405426); Hebei Provincial Natural Science Foundation of China (Grant No. E2016203306)
Abstract: The particle swarm optimization (PSO) algorithm is an effective bio-inspired algorithm, but it suffers from premature convergence. Researchers have made some improvements, especially in force rules and population topologies. However, current algorithms consider only a single kind of force rule and lack comprehensive improvement of both multiple force rules and population topologies. In this paper, a dynamic topology multi force particle swarm optimization (DTMFPSO) algorithm is proposed in order to obtain better search performance. First, the principle of the presented multi force particle swarm optimization (MFPSO) algorithm is that different force rules are used in different search stages, which balances global and local search ability. Second, a fitness-driven edge-changing (FE) topology based on the roulette-wheel probability selection mechanism is designed to cut and add edges between the particles, and the DTMFPSO algorithm is obtained by combining the FE topology with the MFPSO algorithm through concurrent evolution of both algorithm and structure, in order to further improve search accuracy. Third, benchmark functions are employed to evaluate the performance of the DTMFPSO algorithm, and test results show that the proposed algorithm is better than well-known PSO algorithms such as gPSO, MPSO, and EPSO. Finally, the proposed algorithm is applied to optimize the process parameters for ultrasonic vibration cutting of SiC wafers, and the surface quality of the SiC wafer is improved by 12.8% compared with the PSO algorithm in Ref. [25]. This research proposes a DTMFPSO algorithm in which multiple force rules and dynamic population topologies evolve simultaneously, yielding better search performance.
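For readers unfamiliar with the baseline that DTMFPSO modifies, here is a minimal sketch of standard global-best PSO with a single force rule and a fixed, fully connected topology. It is not the DTMFPSO algorithm; the function name `pso` and all parameter values (`w`, `c1`, `c2`, `bound`, swarm size) are illustrative assumptions.

```python
import random


def pso(objective, dim, n_particles=20, iters=100,
        w=0.7, c1=1.5, c2=1.5, bound=5.0, seed=0):
    """Minimize `objective`; returns (best position, best value, initial best value)."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-bound, bound) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                      # personal best positions
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]     # global best (full topology)
    initial_best = gbest_val

    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Single force rule: inertia + pull to personal best + pull to global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val, initial_best


def sphere(x):
    """Simple benchmark function with its minimum of 0 at the origin."""
    return sum(v * v for v in x)
```

Because every particle is attracted to the same global best, the swarm can collapse early onto a local optimum. That premature convergence is what varying the force rule by search stage and rewiring the topology are meant to counteract.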
Abstract: Fuzzy reasoning algorithms have been successfully applied in fuzzy control, but their theoretical foundation has not been thoroughly investigated. In this paper, the structure of the basic algorithms of fuzzy reasoning is studied, their rationality is discussed from the viewpoint of logic and mathematics, and three theorems are proved. These theorems show that there always exists a mathematical relation (that is, a bounded real function) between the premises and the conclusion in fuzzy reasoning, and that the various algorithms of fuzzy reasoning are in fact specific forms of this function. These results show that algorithms of fuzzy reasoning are theoretically reliable.
Abstract: Scattered storage means that an item can be stored in multiple inventory bins. The scattered storage assignment problem based on association rules in the Kiva mobile fulfillment system is investigated; it aims to decide which pods each item should be placed on so as to minimize the number of pods to be moved when picking a batch of orders. The problem is formulated as an integer programming model, and a genetic algorithm is developed to solve large-sized instances. Computational experiments and a comparison between the scattered storage strategy and the random storage strategy are conducted to evaluate the performance of the model and algorithm.
Funding: Project supported by the National Natural Science Foundation of China (50607023); Natural Science Foundation of CQ CSTC (2006BB2189)
Abstract: As the first step of service restoration in a distribution system, rapid fault diagnosis is a significant task for reducing power outage time, decreasing outage loss, and subsequently improving service reliability and safety. This paper analyzes a fault diagnosis approach using rough set theory, in which reducing the decision table of the data set is the main computation-intensive task. To address this reduction problem, a heuristic reduction algorithm based on attribute length and frequency is proposed. A corresponding value reduction method is also proposed in order to complete the reduction and the extraction of diagnosis rules. In addition, a Euclidean matching method is introduced to resolve conflicts among the extracted rules when some information is lacking. The principle of the whole algorithm is clear, and the diagnostic rules distilled from the reduction are concise. Moreover, it needs less calculation on the specific discernibility matrix and thus avoids the corresponding NP-hard problem. The whole process is implemented in MATLAB. A simulation example shows that the method is fast and that the extracted rules reflect the characteristics of a fault in a concise form. The rule database, formed from different reductions of the decision table, can diagnose single faults and multiple faults efficiently and gives satisfactory results even when the existing information is incomplete. The proposed method has good error tolerance and potential for on-line fault diagnosis.
Funding: Supported by the Fundamental Research Funds for the Central Universities under Grants No. ZYGX2014J051 and No. ZYGX2014J066; Science and Technology Projects in Sichuan Province under Grants No. 2015JY0178, No. 2016FZ0002, No. 2014GZ0109, No. 2015KZ002 and No. 2015JY0030; China Postdoctoral Science Foundation under Grant No. 2015M572464
Abstract: At present, most association rule algorithms are based on Boolean attributes and single-level association rule mining. However, real-world data come in various types, and multi-level and quantitative attributes are receiving more and more attention. The most important step is to mine frequent sets. In this paper, we propose an algorithm called fuzzy multiple-level association (FMA) rules to mine frequent sets. It is based on an improved Eclat algorithm, unlike many researchers’ proposed algorithms, which use the Apriori algorithm. We analyze the frequent sets of quantitative data using fuzzy theory, dividing the concept hierarchy and softening the boundaries of attribute values and frequency. We use vertical-format data and the improved Eclat algorithm to describe the proposed method, and we apply the algorithm to data on Beijing logistics routes. Experiments show that the algorithm performs well in both effectiveness and efficiency.
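The vertical data format mentioned above can be illustrated with a minimal sketch of the classic (crisp, single-level) Eclat algorithm, in which each item maps to the set of transaction IDs containing it and support is computed by set intersection. This is not the fuzzy multi-level FMA variant the paper proposes; the function name `eclat` and the example data are assumptions.

```python
def eclat(transactions, min_support):
    """Return {itemset tuple: support count} using vertical tidset intersection."""
    # Build the vertical layout: item -> set of transaction IDs containing it.
    tidsets = {}
    for tid, transaction in enumerate(transactions):
        for item in transaction:
            tidsets.setdefault(item, set()).add(tid)

    result = {}

    def extend(prefix, candidates):
        # `candidates` holds (item, tidset) pairs that are frequent with `prefix`.
        for i, (item, tids) in enumerate(candidates):
            itemset = prefix + (item,)
            result[itemset] = len(tids)
            next_candidates = []
            for other, other_tids in candidates[i + 1:]:
                shared = tids & other_tids  # support of itemset + (other,)
                if len(shared) >= min_support:
                    next_candidates.append((other, shared))
            if next_candidates:
                extend(itemset, next_candidates)

    start = sorted(
        ((item, tids) for item, tids in tidsets.items() if len(tids) >= min_support),
        key=lambda pair: pair[0],
    )
    extend((), start)
    return result
```

Unlike Apriori, this depth-first search never rescans the transactions after the vertical layout is built. A fuzzy variant would presumably replace the integer tidset sizes with sums of membership degrees, but that step is not sketched here.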
Abstract: This paper compares traditional Petri nets with reasoning Petri nets (RPN) and presents a fuzzy reasoning Petri net (FRPN) model to represent the fuzzy production rules of a rule-based system. Based on the FRPN model, a formal reasoning algorithm using the operators of max algebra is proposed to perform fuzzy reasoning automatically. The algorithm is consistent with the matrix equation method of traditional Petri nets. Its legitimacy and feasibility are verified through an example.
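The abstract does not give the FRPN matrices or the exact max-algebra operators, so the following is only a generic sketch of how fuzzy production rules are commonly evaluated: a rule fires with the minimum truth degree of its antecedents scaled by a certainty factor, and a proposition keeps the maximum degree derived for it. The rule encoding and the name `fuzzy_infer` are assumptions, not the paper's notation.

```python
from typing import Dict, List, Tuple

# A rule is (antecedent propositions, consequent proposition, certainty factor).
Rule = Tuple[List[str], str, float]


def fuzzy_infer(truth: Dict[str, float], rules: List[Rule], max_iters: int = 100) -> Dict[str, float]:
    """Propagate truth degrees through fuzzy production rules until nothing changes."""
    truth = dict(truth)  # work on a copy so the caller's facts are not modified
    for _ in range(max_iters):
        changed = False
        for antecedents, consequent, cf in rules:
            # AND of the antecedents is the minimum; the rule's certainty factor scales it.
            firing = min(truth.get(p, 0.0) for p in antecedents) * cf
            # Several rules may conclude the same proposition; keep the maximum.
            if firing > truth.get(consequent, 0.0):
                truth[consequent] = firing
                changed = True
        if not changed:
            break
    return truth


# Hypothetical rule base: (p1 AND p2) -> p3 with certainty 0.9, then p3 -> p4 with certainty 1.0.
facts = {"p1": 0.8, "p2": 0.6}
rules = [(["p1", "p2"], "p3", 0.9), (["p3"], "p4", 1.0)]
derived = fuzzy_infer(facts, rules)
```

In a matrix formulation, the inner loop would typically be replaced by max-min style products over the net's input and output matrices. The loop form here is chosen only for readability.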
Funding: Supported by the National Natural Science Foundation of China (60472099) and the Ningbo Natural Science Foundation (2006A610017).
Abstract: Because a data warehouse changes frequently, incremental data can make previously mined knowledge obsolete. To maintain the discovered knowledge and patterns dynamically, this study presents IPARUC, a novel algorithm for updating global frequent patterns. IPARUC first uses a rapid clustering method to divide the database into n parts such that the data within each part are similar. Then, during insertion, the nodes in the tree are adjusted dynamically by "pruning and laying back" to keep the frequency-descending order, so that they can be shared as close to optimally as possible. Finally, the local frequent itemsets mined from each local dataset are merged into global frequent itemsets. The experimental results are encouraging: IPARUC is more effective and efficient than the two methods it was compared with. Furthermore, it has significant application potential in a prototype Web log analyzer for web usage mining, which can help discover useful knowledge effectively and even help managers make decisions.
Funding: Supported by the National Natural Science Foundation of China (Grant No. 51190094) and the National Key Technologies Research and Development Program of China (Grant No. 2009BAC56B02).
Abstract: Water levels in reservoirs are generally not allowed to exceed the flood-limited water level during the flood season, which means that huge amounts of water are spilled to provide adequate storage for flood prevention and that it is difficult to fill the reservoir fully at the end of the year. Early reservoir refill is an effective method for addressing the conflict between the needs of flood control and of comprehensive utilization. This study selected the Danjiangkou Reservoir, the water source for the middle route of the South-North Water Diversion Project (SNWDP) in China, as a case study and analyzed the necessity and operational feasibility of early reservoir refill. An early reservoir refill model based on the maximum average storage ratio is proposed and optimized by the progressive optimality algorithm, and the optimal scheduling schemes are obtained. Results show that the best time to start refill operation for the Danjiangkou Reservoir is September 15, and the upper limit water level during September is 166 m. Compared with the original design scheme, the proposed staged early refill scheme can increase the annual average storage ratio from 77.51% to 81.99% and decrease spilled water from 2.439 × 10^9 m^3 to 1.692 × 10^9 m^3. The suggested early reservoir refill scheme can be easily operated with significant comprehensive benefits, which may provide a good reference for scheduling decision-making.
Abstract: This paper aims to mine knowledge and rules on drug compatibility from prescriptions for treating arrhythmia in a traditional Chinese medicine database using the Apriori algorithm. For data preparation, 1,113 prescriptions for arrhythmia, comprising 535 herbs (10,884 herb occurrences in total), were collected into the database. The prescription data were preprocessed through redundancy reduction, normalized storage, and knowledge induction according to the pretreatment requirements of data mining. The Apriori algorithm was then used to analyze the data and form the related technical rules and treatment procedures. The experimental results show that the prescription compatibility obtained by the Apriori algorithm generally accords with the basic principles of traditional Chinese medicine for arrhythmia. Some previously unreported compatibilities were also discovered in the experiment, which may serve as a basis for developing new prescriptions for arrhythmia.
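As an illustration of the kind of analysis described above, here is a minimal sketch of textbook Apriori with rule generation, treating each prescription as a set of herbs. The herb labels "A", "B", "C", the thresholds, and the helper names `apriori` and `rules` are placeholders assumed for the example. None of this reproduces the paper's data, preprocessing, or parameters.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Iterable, List, Tuple


def apriori(transactions: List[Iterable[str]], min_support: int) -> Dict[FrozenSet[str], int]:
    """Level-wise Apriori: return every itemset found in at least min_support transactions."""
    baskets = [frozenset(t) for t in transactions]
    current = {frozenset([item]) for basket in baskets for item in basket}
    frequent: Dict[FrozenSet[str], int] = {}
    k = 1
    while current:
        # Count the support of each size-k candidate.
        counts = {c: sum(1 for b in baskets if c <= b) for c in current}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        k += 1
        # Join step: merge frequent itemsets into candidates of size k.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        current = {
            c for c in candidates
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


def rules(frequent: Dict[FrozenSet[str], int],
          min_confidence: float) -> List[Tuple[FrozenSet[str], FrozenSet[str], float]]:
    """Derive rules lhs -> rhs with confidence = support(itemset) / support(lhs)."""
    found = []
    for itemset, support in frequent.items():
        for r in range(1, len(itemset)):
            for lhs_items in combinations(sorted(itemset), r):
                lhs = frozenset(lhs_items)
                confidence = support / frequent[lhs]
                if confidence >= min_confidence:
                    found.append((lhs, itemset - lhs, confidence))
    return found


# Placeholder prescriptions; "A", "B", "C" stand in for herb names.
prescriptions = [{"A", "B", "C"}, {"A", "B"}, {"A", "C"}, {"B", "C"}, {"A", "B", "C"}]
freq = apriori(prescriptions, min_support=3)
compat = rules(freq, min_confidence=0.75)
```

A rule such as A -> B with high confidence would then be read as "prescriptions containing herb A usually also contain herb B", which is the sort of compatibility pattern the abstract refers to.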