Realizing fault knowledge association through efficient data mining algorithms is of great significance for improving the efficiency of railway production and operation. However, high-utility quantitative frequent pattern mining algorithms still suffer from poor time and memory performance and do not scale easily. To address this need, we propose a related degree-based frequent pattern mining algorithm, named Related High Utility Quantitative Item set Mining (RHUQI-Miner), to enable the effective mining of railway fault data. The algorithm constructs the item-related degree structure of fault data and gives a pruning optimization strategy to find frequent patterns with higher related degrees, reducing redundant and invalid frequent patterns. It then uses a fixed pattern length strategy to modify the utility information of items during mining, so that the algorithm can control the length of the output frequent patterns according to the actual data and further improve its performance and practicability. Experimental results on a real fault dataset show that RHUQI-Miner effectively reduces time and memory consumption during mining, thus providing data support for differentiated and precise maintenance strategies.
A new algorithm based on an FC-tree (frequent closed pattern tree) and max-FCIA (maximal frequent closed itemsets algorithm) is presented for mining frequent closed itemsets while reducing memory and time consumption. The algorithm maps the transaction database using a hash table, obtains the support of all frequent itemsets by operating on the hash table, and forms a lexicographic subset tree containing the frequent itemsets. Efficient pruning methods are then applied to the lexicographic subset tree to obtain the FC-tree, which contains all the minimum frequent closed itemsets. Finally, frequent closed itemsets are generated from the minimum frequent closed itemsets. The experimental results show that mapping the transaction database reduces time consumption and improves the efficiency of the program. Furthermore, the pruning strategy restrains the number of candidates, which saves space. The results show that the algorithm is effective.
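As a rough illustration of the two ideas named in this abstract (hash-table support counting and closed itemsets), the sketch below counts supports in a Python dict and then keeps only itemsets that have no proper superset with the same support. It is a brute-force toy, not the FC-tree or max-FCIA procedure; the function names and the sample transactions are assumptions made for the example.

```python
from collections import defaultdict
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Count every itemset's support in a hash table (dict); keep the frequent ones.

    Enumerating all subsets is exponential, so this only suits tiny examples.
    """
    counts = defaultdict(int)
    for t in transactions:
        items = sorted(set(t))
        for k in range(1, len(items) + 1):
            for combo in combinations(items, k):
                counts[frozenset(combo)] += 1
    return {s: c for s, c in counts.items() if c >= min_support}


def closed_itemsets(frequent):
    """An itemset is closed if no proper superset has the same support."""
    return {
        s: c
        for s, c in frequent.items()
        if not any(s < other and c == oc for other, oc in frequent.items())
    }


# Hypothetical toy database.
transactions = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["a"]]
frequent = frequent_itemsets(transactions, min_support=2)
closed = closed_itemsets(frequent)
```

Here `{b}` and `{c}` are frequent but not closed, because `{a, b}` and `{a, c}` occur in exactly the same transactions; the closed set is therefore smaller than the frequent set while preserving all support information.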
Association rule mining is a major data mining field that leads to the discovery of associations and correlations among items in today’s big data environment. Conventional association rule mining focuses mainly on positive itemsets generated from frequently occurring itemsets (PFIS). However, a significant body of work has focused on infrequent itemsets, using negative association rules to mine interesting negative frequent itemsets (NFIS) from transactions. In this work, we propose an efficient backward-calculating negative frequent itemset algorithm, EBC-NFIS, which computes backward supports and can extract both positive and negative frequent itemsets from a dataset at the same time. EBC-NFIS is based on the popular e-NFIS algorithm, which computes the supports of negative itemsets from the supports of positive itemsets. The proposed algorithm reuses previously computed supports held in memory to minimize computation time. In addition, positive and negative association rules (PNARs) are generated from the frequent itemsets discovered by EBC-NFIS. The efficiency of the proposed algorithm is verified through several experiments and by comparing its results with those of e-NFIS. The experimental results confirm that the proposed algorithm successfully discovers NFIS and PNARs and runs significantly faster than the conventional e-NFIS algorithm.
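The statement that negative supports can be derived from positive supports can be made concrete with the standard inclusion-exclusion identities, for example supp(A ∧ ¬B) = supp(A) − supp(A ∪ B). The sketch below applies these identities to absolute counts. It is only an illustration of that relationship, not the EBC-NFIS or e-NFIS algorithm; the function names and the sample data are assumptions.

```python
def support_count(transactions, itemset):
    """Number of transactions that contain every item of `itemset`."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= set(t))


def negative_support_counts(transactions, a, b):
    """Derive supports involving negations from positive supports only."""
    n = len(transactions)
    s_a = support_count(transactions, a)
    s_b = support_count(transactions, b)
    s_ab = support_count(transactions, set(a) | set(b))
    return {
        "A and not B": s_a - s_ab,
        "not A and B": s_b - s_ab,
        "not A and not B": n - s_a - s_b + s_ab,
    }


# Hypothetical toy database.
transactions = [["tea", "milk"], ["tea"], ["milk"], ["tea", "sugar"], ["sugar"]]
result = negative_support_counts(transactions, {"tea"}, {"milk"})
```

Only three positive supports are scanned from the data; every negative support follows by arithmetic, which is why caching previously computed positive supports can save work.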
Periodic pattern mining has become a popular research subject in recent years; this approach involves the discovery of frequently recurring patterns in a transaction sequence. However, previous algorithms for periodic pattern mining have ignored the utility (profit, value) of patterns. Additionally, these algorithms only identify periodic patterns in a single sequence. However, identifying patterns of high utility that are common to a set of sequences is more valuable. In several fields, identifying high-utility periodic frequent patterns in multiple sequences is important. In this study, an efficient algorithm called MHUPFPS was proposed to identify such patterns. To address existing problems, three new measures are defined: the utility, high support, and high-utility period sequence ratios. Further, a new upper bound, upSeqRa, and two new pruning properties were proposed. MHUPFPS uses a newly defined HUPFPS-list structure to significantly accelerate the reduction of the search space and improve the overall performance of the algorithm. Furthermore, the proposed algorithm is evaluated using several datasets. The experimental results indicate that the algorithm is accurate and effective in filtering several non-high-utility periodic frequent patterns.
Intrusion detection plays a significant role in a network security system. The network security system detects malicious actions in the network and ensures the availability, integrity and confidentiality of data and information resources. An intrusion detection system can easily produce false positive alerts. If a large number of false positive alerts are created, it becomes difficult for the intrusion detection system to differentiate the false positive alerts from genuine attacks. Much research has been done on this problem. The drawbacks of the existing algorithms are that they require more memory space and more time to process the transaction records. This paper proposes a novel framework for a network security Intrusion Detection System (IDS) using a Modified Frequent Pattern tree (MFP-Tree) with the K-means algorithm. The accuracy of the Modified Frequent Pattern Tree (MFPT)-K-means method in finding the various classes is 94.89% for Normal, 98.34% for DoS-based attacks, 96.73% for User to Root (U2R) attacks, 95.89% for Remote to Local (R2L) attacks and 92.67% for Probe attacks, which is better than that of the existing K-Means and APRIORI algorithms.
We report a biopsy-proven case of minimal change nephrotic syndrome in a 72-year-old patient. The minimal change nephrotic syndrome was steroid sensitive, but the patient had 7 relapses over a span of 5 years. Each time the steroid dose was tapered, the nephrotic syndrome relapsed. Eventually, the patient complained of dysphagia (difficulty swallowing). Hospital work-up with barium swallow, endoscopy, and CT of the chest, abdomen and pelvis revealed a focal stenotic lesion with mild to moderate esophageal dysmotility on 7/15/2022. An ulcerating lesion was found, and biopsy confirmed a neuroendocrine carcinoma of the gastro-esophageal junction. The CT of the chest, abdomen and pelvis on 7/19/2022 showed a 5.1 × 5.6 × 7 cm mass at the gastro-esophageal junction with ulceration, with no evidence of spread beyond the esophagus and stomach. The histology revealed a poorly differentiated neuroendocrine tumor of the gastro-esophageal junction. The patient underwent several rounds of chemotherapy, radiation, and surgery, culminating in tumor control. His nephrotic syndrome resolved after the tumor was controlled by surgery and chemotherapy.
Maximum frequent pattern generation from a large database of transactions and items for association rule mining is an important research topic in data mining. Association rule mining aims to discover interesting correlations, frequent patterns, associations, or causal structures between items hidden in a large database. By exploiting quantum computing, we propose an efficient quantum search algorithm design to discover the maximum frequent patterns. We modify Grover’s search algorithm so that a subspace of arbitrary symmetric states is used instead of the whole search space. We present a novel quantum oracle design that employs a quantum counter to count the maximum frequent items and a quantum comparator to check the count against a minimum support threshold. The proposed algorithm increases the rate of correct solutions because the search is confined to a subspace. Furthermore, our algorithm significantly reduces the number of qubits required by the design, which directly benefits performance. Our proposed design can accommodate more transactions and items and still perform well with a small number of qubits.
Because mining the complete set of frequent patterns from a dense database can be impractical, an interesting alternative has been proposed recently. Instead of mining the complete set of frequent patterns, the new model finds only the maximal frequent patterns, from which all frequent patterns can be generated. The FP-growth algorithm is one of the most efficient frequent-pattern mining methods published so far. However, because the FP-tree and conditional FP-trees must be two-way traversable, a great deal of memory is needed during mining. This paper proposes an efficient algorithm, Unid_FP-Max, for mining maximal frequent patterns based on a unidirectional FP-tree. Owing to the way the unidirectional FP-tree and conditional unidirectional FP-trees are generated, the algorithm reduces space consumption to the fullest extent. With two techniques, single path pruning and header table pruning, which cut down many of the conditional unidirectional FP-trees generated recursively during mining, Unid_FP-Max further lowers the cost in time and space.
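The claim that maximal frequent patterns can generate all frequent patterns follows from downward closure: every subset of a frequent itemset is itself frequent. The sketch below filters a given collection of frequent itemsets down to its maximal members and then expands them back. It is not the Unid_FP-Max algorithm and builds no FP-tree; the function names and the example itemsets are assumptions.

```python
from itertools import combinations


def maximal_itemsets(frequent):
    """Keep the frequent itemsets that have no frequent proper superset."""
    frequent = [frozenset(s) for s in frequent]
    return {s for s in frequent if not any(s < other for other in frequent)}


def expand(maximal):
    """Regenerate every frequent itemset as a non-empty subset of a maximal one."""
    result = set()
    for m in maximal:
        items = sorted(m)
        for k in range(1, len(items) + 1):
            result.update(frozenset(c) for c in combinations(items, k))
    return result


# Hypothetical collection of frequent itemsets.
frequent = [{"a"}, {"b"}, {"c"}, {"d"},
            {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
maximal = maximal_itemsets(frequent)
```

Note that the maximal set recovers which itemsets are frequent, but not their individual support values; those would need a further pass over the data.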
In this paper, we propose an enhanced associative classification method that integrates the dynamic property into the process of associative classification. In the proposed method, we employ a support vector machine (SVM) based method to refine the discovered emerging frequent patterns, which are used to extend the classification rules for class label prediction. The empirical study shows that our method can classify increasing resources efficiently and effectively.
Shaking table testing was performed to investigate the dynamic stability of a mid-dip bedding rock slope under frequent earthquakes. A numerical model was then established to further study the dynamic stability of the slope under microseisms alone, and the influence of five factors was considered: seismic amplitude, slope height, slope angle, strata inclination and strata thickness. The experimental results show that the natural frequency of the slope decreases and the damping ratio increases as the number of earthquake loadings increases. The dynamic strength reduction method is adopted to evaluate the stability of the bedding rock slope in the numerical simulation; slope stability decreases as seismic amplitude, slope height and slope angle increase and as strata thickness decreases. The failure mode of the mid-dip bedding rock slope in the shaking table test is integral slipping along the bedding surface, with dipping tensile cracks at the rear edge of the slope cutting through the bedding surfaces. In the numerical simulation, the long-term stability of the mid-dip bedding slope is worst under frequent microseisms and the slope is at risk of integral sliding instability, whereas the slope rock mass is more broken than in the shaking table test. The research results are of practical significance for better understanding the formation mechanism of reservoir landslides and preventing future landslide disasters.
Numerous models have been proposed to reduce the classification error of Naive Bayes by weakening its attribute independence assumption, and some have demonstrated remarkable error performance. Considering that ensemble learning is an effective way to reduce the classification error of a classifier, this paper proposes a double-layer Bayesian classifier ensembles (DLBCE) algorithm based on frequent itemsets. DLBCE constructs a double-layer Bayesian classifier (DLBC) for each frequent itemset contained in the new instance, and finally ensembles all the classifiers by assigning a different weight to each classifier according to the conditional mutual information. The experimental results show that the proposed algorithm outperforms other outstanding algorithms.
Current technology for frequent itemset mining mostly applies to data stored in a single transaction database. This paper presents a novel algorithm, MultiClose, for frequent itemset mining in data warehouses. MultiClose computes the results in each dimension table separately and merges them with a very efficient approach. A closed-itemset technique is used to improve the performance of the algorithm. The authors propose an efficient implementation for star schemas in which their algorithm outperforms state-of-the-art single-table algorithms.
With the development of information technology, the amount of power grid topology data has gradually increased, so accurate querying of this data has become particularly important. Because there is currently no uniform and efficient indexing mechanism that achieves good query results, researchers have chosen different indexing methods in the filtering stage to optimize query results. In the traditional algorithm, the hash table used for index storage is prone to "collision" problems, which decrease index construction efficiency. To address the problem of quick index entry, and building on the construction of frequent subgraph indexes, a serialized storage optimization method based on multiple hash tables is proposed. The method mainly uses the exploration sequence to distribute the keywords evenly; it avoids conflicts during storage and allows quick searching of the index. The proposed algorithm adopts the "filter-verify" mechanism: in the filtering stage, the index is first built offline, and then the frequent subgraphs are found using the "contains logic" rule to obtain the candidate set. Experimental results show that this method can reduce the time and scale of candidate set generation and improve query efficiency.
On-site stormwater detention (OSD) is a conventional component of urban drainage systems, designed to mitigate the increase in peak discharge of stormwater runoff that inevitably results from urbanization. In Australia, singular temporal patterns for design storms have governed the inputs to hydrograph generation, and in turn the design process of OSD, for the last three decades. This paper raises the concern that many existing OSD systems designed using the singular temporal pattern for design storms may not achieve their stated objectives when assessed against a variety of alternative temporal patterns. The performance of twenty real OSD systems was investigated using two methods: (1) ensembles of design temporal patterns prescribed in the latest version of Australian Rainfall and Runoff, and (2) real recorded rainfall data taken from pluviograph stations and modeled with continuous simulation. It is shown conclusively that singular temporal patterns are ineffective in providing assurance that an OSD will mitigate the increase in peak discharge for all possible storm events. Ensemble analysis is shown to provide improved results. However, it also falls short of providing any guarantee in the face of naturally occurring rainfall.
A novel binary particle swarm optimization for frequent itemset mining from high-dimensional datasets (BPSO-HD) was proposed, which combines two improvements. First, dimensionality reduction of the initial particles was designed to ensure a reasonable initial fitness; second, dynamic dimensionality cutting of the dataset was built to shrink the search space. On four high-dimensional datasets, BPSO-HD was compared with Apriori to test its reliability, and with the ordinary BPSO and quantum swarm evolutionary (QSE) algorithms to demonstrate its advantages. The experiments show that the results given by BPSO-HD are reliable and better than those generated by BPSO and QSE.
Mining frequent patterns in transaction databases, time series databases, and many other kinds of databases has been widely studied in data mining research. Most previous studies adopt an Apriori-like candidate set generation-and-test approach. However, candidate set generation is very costly. Han J. proposed a novel algorithm, FP-growth, that can generate frequent patterns without candidate sets. Based on an analysis of FP-growth, this paper proposes the concept of an equivalent FP-tree and an improved algorithm, denoted FP-growth*, which is much faster and easy to implement. FP-growth* adopts a modified structure for the FP-tree and header table, generates only a header table in each recursive operation, and projects the tree onto the original FP-tree. The two algorithms produce the same frequent pattern set on the same transaction database, but the performance study shows that the improved algorithm, FP-growth*, is at least twice as fast as FP-growth.
Because a data warehouse changes frequently, incremental data can render previously mined knowledge obsolete. In order to maintain the discovered knowledge and patterns dynamically, this study presents a novel algorithm for updating global frequent patterns, IPARUC. In IPARUC, a rapid clustering method is first introduced to divide the database into n parts, where the data within each part are similar. Then, the nodes in the tree are adjusted dynamically during insertion by "pruning and laying back" to keep them in descending order of frequency, so that they can be shared in a near-optimal way. Finally, the local frequent itemsets mined from each local dataset are merged into global frequent itemsets. The experimental results are very encouraging: they show that IPARUC is more effective and efficient than the two methods it was compared with. Furthermore, there is significant application potential in a prototype Web log analyzer for web usage mining, which can help discover useful knowledge effectively and even help managers make decisions.
Frequent itemset mining is an essential problem in data mining and plays a key role in many data mining applications. However, users’ personal privacy can be leaked in the mining process. In recent years, applying local differential privacy protection models to frequent itemset mining has become a relatively reliable and secure protection method. Local differential privacy means that users first perturb their original data and then send the perturbed data to the aggregator, preventing the aggregator from revealing the users’ private information. We propose a novel framework that implements frequent itemset mining under local differential privacy and is applicable to users with multiple attributes. The main technique uses bitmap encoding to convert a user’s original data into a binary string. The framework also covers how to choose the best perturbation algorithm for varying user attributes, and uses the frequent pattern tree (FP-tree) algorithm to mine frequent itemsets. Finally, we incorporate the threshold random response (TRR) algorithm into the framework, compare it with existing algorithms, and demonstrate that the TRR algorithm achieves higher accuracy for mining frequent itemsets.
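The perturb-then-aggregate flow described here can be sketched with classic binary randomized response over a bitmap encoding. This is not the TRR algorithm from the abstract, whose details are not given; it is the textbook mechanism, in which each bit is kept with probability p = e^ε / (1 + e^ε) and the aggregator corrects the observed counts. All function names, the item domain, and the choice of per-bit ε are assumptions for the example.

```python
import math
import random


def keep_probability(epsilon):
    """Probability of reporting a bit truthfully under per-bit budget epsilon."""
    return math.exp(epsilon) / (1.0 + math.exp(epsilon))


def encode(items, domain):
    """Bitmap encoding: one bit per item of the (assumed, fixed) domain."""
    owned = set(items)
    return [1 if d in owned else 0 for d in domain]


def perturb(bits, p, rng):
    """User side: keep each bit with probability p, otherwise flip it."""
    return [b if rng.random() < p else 1 - b for b in bits]


def estimate_count(observed_ones, n, p):
    """Aggregator side: unbiased estimate of the true number of 1-bits among n users."""
    return (observed_ones - n * (1.0 - p)) / (2.0 * p - 1.0)


# Hypothetical usage: one user perturbs their bitmap before sending it.
domain = ["bread", "milk", "tea"]
rng = random.Random(0)
p = keep_probability(1.0)
report = perturb(encode(["milk"], domain), p, rng)
```

The aggregator sums the reported bits per item across users and applies `estimate_count`; the corrected counts could then feed a frequent itemset miner. Lower ε gives stronger privacy but noisier estimates, since p approaches 0.5 and the denominator 2p − 1 approaches zero.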
In this paper, we propose an efficient hybrid algorithm, WDHP, for mining frequent access patterns. WDHP adopts the techniques of DHP to optimize its performance, namely using a hash table to filter the candidate set and trimming the database. Whenever the database has been trimmed to a size below a specified threshold, the algorithm loads the database into main memory by constructing a tree and finds frequent patterns on the tree. The experiments show that WDHP outperforms both DHP and the main-memory-based algorithm WAP in execution efficiency.
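The hash-table filtering that DHP contributes can be sketched as follows: while counting single items, every pair in each transaction is hashed into a small bucket table, and a candidate pair survives only if both items are frequent and its bucket count reaches the minimum support. This is a simplified illustration of that one step, not WDHP itself (no database trimming or tree construction); the bucket function, the bucket count of 7 and the sample data are assumptions.

```python
from collections import Counter
from itertools import combinations


def dhp_candidate_pairs(transactions, min_support, num_buckets=7):
    """Generate candidate 2-itemsets, pruned with a DHP-style hash bucket table."""
    # Stable integer ids keep the bucket function deterministic across runs.
    ids = {item: i for i, item in enumerate(sorted({x for t in transactions for x in t}))}

    def bucket(pair):
        x, y = pair
        return (ids[x] * 10 + ids[y]) % num_buckets

    item_counts = Counter()
    buckets = [0] * num_buckets
    for t in transactions:
        items = sorted(set(t))
        item_counts.update(items)
        for pair in combinations(items, 2):
            buckets[bucket(pair)] += 1

    frequent_items = sorted(i for i, c in item_counts.items() if c >= min_support)
    # A bucket count is an upper bound on the support of every pair hashed to it,
    # so dropping pairs from light buckets never discards a truly frequent pair.
    return [pair for pair in combinations(frequent_items, 2)
            if buckets[bucket(pair)] >= min_support]


# Hypothetical toy database.
transactions = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "d"]]
candidates = dhp_candidate_pairs(transactions, min_support=2)
```

In this example plain Apriori would generate three candidate pairs from the frequent items a, b and c, while the bucket filter already rules out (b, c) before the second database scan.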
Debris slopes are widely distributed across the Three Gorges Reservoir area in China, and seasonal fluctuations of the water level in the area tend to cause high-frequency microseisms that subsequently induce landslides on such debris slopes. In this study, a cumulative damage model of debris slopes with varying slope characteristics under the effects of frequent microseisms was established, based on an accurate definition of the slope damage variables. The cumulative damage behaviour and the mechanisms of slope instability and sliding under frequent microseisms were then systematically investigated through a series of shaking table tests and discrete element numerical simulations, and the influences of related parameters such as bedrock, dry density and stone content were discussed. The results showed that the instability process of a debris slope can be divided into a vibration-compaction stage, a crack generation stage, a crack development stage, and an instability stage. Under the action of frequent microseisms, a debris slope undergoes the last three stages cyclically, which causes the accumulation to slide out in layers under the synergistic action of tension and shear and destabilises the slope. The final instability of the debris slope involves two sliding surfaces as well as parallel tensile surfaces. During the instability process, the trend of the damage accumulation curve remains similar for debris slopes with different parameters. However, the initial vibration-compaction effect in the bedrock-free model is stronger than that in the bedrock model, and the overall cumulative damage degree of the former is lower than that of the latter. The damage degree of a debris slope with high dry density also develops more slowly than that of a debris slope with low dry density. The damage development rate of the debris slope does not always decrease as the stone content increases. The damage degree growth rate is lowest for the debris slope with the optimal stone content, and either increasing or decreasing the stone content makes instability occur earlier. The numerical simulation further reveals that damage in the debris slope mainly develops through crack formation and penetration, with shear failure occurring more frequently.
Funding (RHUQI-Miner railway fault data study): supported by the Research on Key Technologies and Typical Applications of Big Data in Railway Production and Operation (P2023S006) and the Fundamental Research Funds for the Central Universities (2022JBZY023).
Funding (FC-tree and max-FCIA study): the National Natural Science Foundation of China (No. 60603047), the Natural Science Foundation of Liaoning Province, and the Liaoning Higher Education Research Foundation (No. 2008341).
文摘A new algorithm based on an FC-tree (frequent closed pattern tree) and a max-FCIA (maximal frequent closed itemsets algorithm) is presented, which is used to mine the frequent closed itemsets for solving memory and time consuming problems. This algorithm maps the transaction database by using a Hash table,gets the support of all frequent itemsets through operating the Hash table and forms a lexicographic subset tree including the frequent itemsets.Efficient pruning methods are used to get the FC-tree including all the minimum frequent closed itemsets through processing the lexicographic subset tree.Finally,frequent closed itemsets are generated from minimum frequent closed itemsets.The experimental results show that the mapping transaction database is introduced in the algorithm to reduce time consumption and to improve the efficiency of the program.Furthermore,the effective pruning strategy restrains the number of candidates,which saves space.The results show that the algorithm is effective.
文摘Association rules mining is a major data mining field that leads to discovery of associations and correlations among items in today’s big data environment. The conventional association rule mining focuses mainly on positive itemsets generated from frequently occurring itemsets (PFIS). However, there has been a significant study focused on infrequent itemsets with utilization of negative association rules to mine interesting frequent itemsets (NFIS) from transactions. In this work, we propose an efficient backward calculating negative frequent itemset algorithm namely EBC-NFIS for computing backward supports that can extract both positive and negative frequent itemsets synchronously from dataset. EBC-NFIS algorithm is based on popular e-NFIS algorithm that computes supports of negative itemsets from the supports of positive itemsets. The proposed algorithm makes use of previously computed supports from memory to minimize the computation time. In addition, association rules, i.e. positive and negative association rules (PNARs) are generated from discovered frequent itemsets using EBC-NFIS algorithm. The efficiency of the proposed algorithm is verified by several experiments and comparing results with e-NFIS algorithm. The experimental results confirm that the proposed algorithm successfully discovers NFIS and PNARs and runs significantly faster than conventional e-NFIS algorithm.
文摘Periodic patternmining has become a popular research subject in recent years;this approach involves the discoveryof frequently recurring patterns in a transaction sequence. However, previous algorithms for periodic patternmining have ignored the utility (profit, value) of patterns. Additionally, these algorithms only identify periodicpatterns in a single sequence. However, identifying patterns of high utility that are common to a set of sequencesis more valuable. In several fields, identifying high-utility periodic frequent patterns in multiple sequences isimportant. In this study, an efficient algorithm called MHUPFPS was proposed to identify such patterns. To addressexisting problems, three new measures are defined: the utility, high support, and high-utility period sequenceratios. Further, a new upper bound, upSeqRa, and two new pruning properties were proposed. MHUPFPS usesa newly defined HUPFPS-list structure to significantly accelerate the reduction of the search space and improvethe overall performance of the algorithm. Furthermore, the proposed algorithmis evaluated using several datasets.The experimental results indicate that the algorithm is accurate and effective in filtering several non-high-utilityperiodic frequent patterns.
Abstract: Intrusion detection plays a significant role in a network security system. The network security system detects malicious actions in the network and also ensures the availability, integrity and confidentiality of data and information resources. An intrusion detection system can easily raise false positive alerts. If a large number of false positive alerts are created, it becomes difficult for the intrusion detection system to differentiate them from genuine attacks. Many research works have addressed this problem. The issues with the existing algorithms are that they need more memory space and more time to process the transaction records. This paper proposes a novel framework for a network security Intrusion Detection System (IDS) using a Modified Frequent Pattern tree (MFP-Tree) with the K-means algorithm. The accuracy rates of the Modified Frequent Pattern Tree (MFPT)-K-means method in finding the various classes are 94.89% for Normal, 98.34% for DoS-based attacks, 96.73% for User to Root (U2R) attacks, 95.89% for Remote to Local (R2L) attacks and 92.67% for Probe attacks, which is better than the existing K-means and Apriori algorithms.
Abstract: We report a biopsy-proven case of minimal change nephrotic syndrome in a 72-year-old patient. The minimal change nephrotic syndrome was steroid sensitive, but the patient had 7 relapses over a span of 5 years. Each time the steroid dose was tapered, the nephrotic syndrome relapsed. Eventually, the patient complained of dysphagia. A hospital work-up with barium swallow, endoscopy, and CT of the chest, abdomen and pelvis revealed a focal stenotic lesion with mild to moderate esophageal dysmotility on 7/15/2022. A diagnosis of an ulcerating lesion, biopsy-confirmed as a neuroendocrine carcinoma of the gastro-esophageal junction, was entertained. The CT of the chest/abdomen/pelvis on 7/19/2022 showed an esophageal mass of 5.1 × 5.6 × 7 cm at the gastro-esophageal junction with ulceration, and no evidence of spread beyond the esophagus and stomach. The histology revealed a poorly differentiated neuroendocrine tumor of the gastro-esophageal junction. The patient underwent several rounds of chemotherapy, radiation, and surgery, culminating in tumor control. His nephrotic syndrome resolved after the tumor was controlled by surgery and chemotherapy.
Abstract: Maximum frequent pattern generation from a large database of transactions and items for association rule mining is an important research topic in data mining. Association rule mining aims to discover interesting correlations, frequent patterns, associations, or causal structures between items hidden in a large database. By exploiting quantum computing, we propose an efficient quantum search algorithm design to discover the maximum frequent patterns. We modify Grover’s search algorithm so that a subspace of arbitrary symmetric states is used instead of the whole search space. We present a novel quantum oracle design that employs a quantum counter to count the maximum frequent items and a quantum comparator to check the count against a minimum support threshold. The proposed algorithm increases the rate of correct solutions since the search is carried out only in a subspace. Furthermore, our algorithm significantly reduces the number of qubits required by the design, which directly improves performance. Our proposed design can accommodate more transactions and items and still perform well with a small number of qubits.
Funding: Supported by the National Natural Science Foundation of China (No. 60474022) and the Henan Innovation Project for University Prominent Research Talents (No. 2007KYCX018)
Abstract: Because mining the complete set of frequent patterns from a dense database can be impractical, an interesting alternative has been proposed recently. Instead of mining the complete set of frequent patterns, the new model finds only the maximal frequent patterns, from which all frequent patterns can be generated. The FP-growth algorithm is one of the most efficient frequent-pattern mining methods published so far. However, because the FP-tree and the conditional FP-trees must be two-way traversable, a great deal of memory is needed during mining. This paper proposes an efficient algorithm, Unid_FP-Max, for mining maximal frequent patterns based on a unidirectional FP-tree. Owing to the way the unidirectional FP-tree and the conditional unidirectional FP-trees are generated, the algorithm reduces space consumption to the fullest extent. With two techniques, single path pruning and header table pruning, which cut down many of the conditional unidirectional FP-trees generated recursively during mining, Unid_FP-Max further lowers the cost in time and space.
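To make the notion of a maximal frequent pattern concrete, the following is a minimal brute-force sketch, assuming a tiny in-memory database. It is not Unid_FP-Max and uses no FP-tree; the function names and the toy data are hypothetical, and the enumeration is only practical for very small item sets.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Set


def frequent_itemsets(transactions: List[Set[str]], min_count: int) -> Dict[FrozenSet[str], int]:
    """Enumerate all frequent itemsets level by level (brute force)."""
    items = sorted(set().union(*transactions))
    result: Dict[FrozenSet[str], int] = {}
    for k in range(1, len(items) + 1):
        found_any = False
        for combo in combinations(items, k):
            candidate = frozenset(combo)
            count = sum(1 for t in transactions if candidate <= t)
            if count >= min_count:
                result[candidate] = count
                found_any = True
        if not found_any:
            break  # no frequent k-itemset, so no larger one can be frequent
    return result


def maximal_itemsets(frequent: Dict[FrozenSet[str], int]) -> List[FrozenSet[str]]:
    """Keep only itemsets that have no frequent proper superset."""
    return [s for s in frequent if not any(s < other for other in frequent)]


# Toy database (hypothetical data).
db = [{"a", "b", "c"}, {"a", "b"}, {"a", "b", "d"}, {"c", "d"}, {"c", "d"}]
freq = frequent_itemsets(db, min_count=2)
maximal = maximal_itemsets(freq)
```

Every frequent itemset is a subset of some maximal one, so the maximal set is a compact summary of the frequent itemsets, although the exact supports of the subsets are not retained.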
Funding: Supported by the National High Technology Research and Development Program of China (No. 2007AA01Z132), the National Natural Science Foundation of China (No. 60775035, 60933004, 60970088, 60903141), the National Basic Research Priorities Programme (No. 2007CB311004), and the National Science and Technology Support Plan (No. 2006BAC08B06).
Abstract: In this paper, we propose an enhanced associative classification method that integrates the dynamic property into the process of associative classification. In the proposed method, we employ a support vector machine (SVM) based method to refine the discovered emerging frequent patterns, extending the classification rules used for class label prediction. The empirical study shows that our method can be used to classify increasing resources efficiently and effectively.
Funding: National Natural Science Foundation of China under Grant No. 41372356, the College Cultivation Project of the National Natural Science Foundation of China under Grant No. 2018PY30, the Basic Research and Frontier Exploration Project of Chongqing, China under Grant No. cstc2018jcyj A1597, and the Graduate Scientific Research and Innovation Foundation of Chongqing, China under Grant No. CYS18026.
Abstract: Shake table testing was performed to investigate the dynamic stability of a mid-dip bedding rock slope under frequent earthquakes. A numerical model was then established to further study the dynamic stability of the slope under microseisms alone, and the influence of five factors (seismic amplitude, slope height, slope angle, strata inclination and strata thickness) was considered. The experimental results show that the natural frequency of the slope decreases and the damping ratio increases as the number of earthquake loadings increases. The dynamic strength reduction method is adopted for the stability evaluation of the bedding rock slope in the numerical simulation, and the slope stability decreases with increasing seismic amplitude, increasing slope height, decreasing strata thickness and increasing slope angle. The failure mode of a mid-dip bedding rock slope in the shaking table test is integral slipping along the bedding surface, with dipping tensile cracks at the rear edge of the slope passing through the bedding surfaces. In the numerical simulation, the long-term stability of a mid-dip bedding slope is worst under frequent microseisms and the slope is at risk of integral sliding instability, whereas the slope rock mass is more broken than shown in the shaking table test. The research results are of practical significance for better understanding the formation mechanism of reservoir landslides and preventing future landslide disasters.
Funding: Supported by the National Natural Science Foundation of China (Nos. 61073133, 60973067, and 61175053) and the Fundamental Research Funds for the Central Universities of China (No. 2011ZD010)
Abstract: Numerous models have been proposed to reduce the classification error of Naive Bayes by weakening its attribute independence assumption, and some have demonstrated remarkable error performance. Considering that ensemble learning is an effective method of reducing the classification error of a classifier, this paper proposes a double-layer Bayesian classifier ensembles (DLBCE) algorithm based on frequent itemsets. DLBCE constructs a double-layer Bayesian classifier (DLBC) for each frequent itemset contained in the new instance and finally ensembles all the classifiers by assigning a different weight to each classifier according to the conditional mutual information. The experimental results show that the proposed algorithm outperforms other outstanding algorithms.
Abstract: Current technology for frequent itemset mining mostly applies to data stored in a single transaction database. This paper presents a novel algorithm, MultiClose, for frequent itemset mining in data warehouses. MultiClose computes the results in the individual dimension tables separately and merges them with a very efficient approach. The closed itemsets technique is used to improve the performance of the algorithm. The authors propose an efficient implementation for star schemas in which their algorithm outperforms state-of-the-art single-table algorithms.
Funding: Supported by the State Grid Science and Technology Project (Title: Research on High Performance Analysis Technology of Power Grid GIS Topology Based on Graph Database, 5455HJ160005)
Abstract: With the development of information technology, the amount of power grid topology data has gradually increased. Therefore, accurate querying of this data has become particularly important. Because there is currently no uniform and efficient indexing mechanism that achieves good query results, researchers have chosen different indexing methods in the filtering stage to obtain more optimized query results. In the traditional algorithm, the hash table used for index storage is prone to "collision" problems, which decrease the index construction efficiency. To address the problem of quick index entry, a serialized storage optimization method based on multiple hash tables is proposed, building on the construction of frequent subgraph indexes. This method mainly uses the exploration sequence to distribute the keywords evenly; it avoids conflicts during storage and allows a quick search of the index. The proposed algorithm mainly adopts the "filter-verify" mechanism; in the filtering stage, the index is first established offline, and then the frequent subgraphs are found using the "contains logic" rule to obtain the candidate set. Experimental results show that this method can reduce the time and scale of candidate set generation and improve query efficiency.
Abstract: On-site stormwater detention (OSD) is a conventional component of urban drainage systems, designed with the intention of mitigating the increase in peak discharge of stormwater runoff that inevitably results from urbanization. In Australia, singular temporal patterns for design storms have governed the inputs of hydrograph generation, and in turn the design process of OSD, for the last three decades. This paper raises the concern that many existing OSD systems designed using the singular temporal pattern for design storms may not be achieving their stated objectives when they are assessed against a variety of alternative temporal patterns. The performance of twenty real OSD systems was investigated using two methods: (1) ensembles of design temporal patterns prescribed in the latest version of Australian Rainfall and Runoff, and (2) real recorded rainfall data taken from pluviograph stations, modeled with continuous simulation. It is shown conclusively that the use of singular temporal patterns is ineffective in providing assurance that an OSD will mitigate the increase in peak discharge for all possible storm events. Ensemble analysis is shown to provide improved results. However, it also falls short of providing any guarantee in the face of naturally occurring rainfall.
Abstract: A novel binary particle swarm optimization for frequent itemset mining from high-dimensional datasets (BPSO-HD) is proposed, which combines two improvements. First, dimensionality reduction of the initial particles is designed to ensure a reasonable initial fitness; second, dynamic dimensionality cutting of the dataset is built in to decrease the search space. On four high-dimensional datasets, BPSO-HD was compared with Apriori to test its reliability, and with the ordinary BPSO and quantum swarm evolutionary (QSE) algorithms to demonstrate its advantages. The experiments show that the results given by BPSO-HD are reliable and better than those generated by BPSO and QSE.
Funding: The Fund of the National Management Bureau of Traditional Chinese Medicine (No. 2000JP54)
Abstract: Mining frequent patterns in transaction databases, time series databases, and many other kinds of databases has been widely studied in data mining research. Most previous studies adopt an Apriori-like candidate set generation-and-test approach. However, candidate set generation is very costly. Han J. proposed a novel algorithm, FP-growth, that can generate frequent patterns without candidate sets. Based on an analysis of FP-growth, this paper proposes the concept of an equivalent FP-tree and an improved algorithm, denoted FP-growth*, which is much faster and easy to implement. FP-growth* adopts a modified structure of the FP-tree and header table, and in each recursive operation it only generates a header table and projects the tree onto the original FP-tree. The two algorithms produce the same frequent pattern set on the same transaction database, but the performance study shows that the improved algorithm, FP-growth*, is at least twice as fast as FP-growth.
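The FP-tree and header table mentioned in this abstract can be sketched as follows. This is a minimal construction of the standard FP-tree only, assuming a small in-memory database; it does not implement the recursive mining step, and it does not reproduce the modified structure used by FP-growth*. The names `FPNode`, `build_fp_tree` and `item_support` are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Optional, Tuple


class FPNode:
    """One node of the prefix tree: an item, its count, and links."""

    def __init__(self, item: Optional[str], parent: Optional["FPNode"]) -> None:
        self.item = item
        self.parent = parent
        self.count = 0
        self.children: Dict[str, "FPNode"] = {}


def build_fp_tree(transactions: List[List[str]], min_count: int) -> Tuple[FPNode, Dict[str, List[FPNode]]]:
    # Pass 1: count items and keep only the frequent ones.
    counts = Counter(item for t in transactions for item in set(t))
    frequent = {i: c for i, c in counts.items() if c >= min_count}
    root = FPNode(None, None)
    header: Dict[str, List[FPNode]] = defaultdict(list)
    # Pass 2: insert each transaction with items in descending frequency order,
    # so that transactions sharing frequent items share a prefix path.
    for t in transactions:
        ordered = sorted((i for i in set(t) if i in frequent),
                         key=lambda i: (-frequent[i], i))
        node = root
        for item in ordered:
            child = node.children.get(item)
            if child is None:
                child = FPNode(item, node)
                node.children[item] = child
                header[item].append(child)  # header table: item -> its nodes
            child.count += 1
            node = child
    return root, header


def item_support(header: Dict[str, List[FPNode]], item: str) -> int:
    """Support of a single item, read from the header table."""
    return sum(n.count for n in header[item])


# Toy database (hypothetical data).
db = [["a", "b"], ["b", "c", "d"], ["a", "b", "d"], ["a", "b", "c"], ["b"]]
root, header = build_fp_tree(db, min_count=2)
```

The tree is built in two database scans and no candidate itemsets are generated; mining would then proceed by following the header-table nodes of each item back to the root.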
Funding: Supported by the National Natural Science Foundation of China (60472099) and the Ningbo Natural Science Foundation (2006A610017)
Abstract: Because a data warehouse changes frequently, incremental data can render previously mined knowledge out of date. In order to maintain the discovered knowledge and patterns dynamically, this study presents a novel algorithm for updating global frequent patterns, IPARUC. In IPARUC, a rapid clustering method is first used to divide the database into n parts, such that the data within each part are similar. Then, during insertion, the nodes in the tree are adjusted dynamically by "pruning and laying back" to keep them in descending order of frequency, so that node sharing approaches the optimum. Finally, the local frequent itemsets mined from each local dataset are merged into global frequent itemsets. The results of the experimental study are very encouraging: IPARUC is more effective and efficient than the two other methods it was compared with. Furthermore, it has significant application potential in a prototype Web log analyzer for web usage mining, which can help discover useful knowledge effectively and even help managers make decisions.
Funding: This paper is supported by the Inner Mongolia Natural Science Foundation (Grant Number: 2018MS06026, Sponsored Authors: Liu, H. and Ma, X., Sponsors’ Websites: http://kjt.nmg.gov.cn/) and the Science and Technology Program of Inner Mongolia Autonomous Region (Grant Number: 2019GG116, Sponsored Authors: Liu, H. and Ma, X., Sponsors’ Websites: http://kjt.nmg.gov.cn/).
Abstract: Frequent itemset mining is an essential problem in data mining and plays a key role in many data mining applications. However, users’ personal privacy can be leaked in the mining process. In recent years, applying local differential privacy protection models to frequent itemset mining has become a relatively reliable and secure protection method. Local differential privacy means that users first perturb their original data and then send the perturbed data to the aggregator, preventing the aggregator from revealing the users’ private information. We propose a novel framework that implements frequent itemset mining under local differential privacy and is applicable to multi-attribute user data. The main technique uses bitmap encoding to convert the user’s original data into a binary string. The framework also covers how to choose the best perturbation algorithm for varying user attributes, and uses the frequent pattern tree (FP-tree) algorithm to mine frequent itemsets. Finally, we incorporate the threshold random response (TRR) algorithm into the framework, compare it with existing algorithms, and demonstrate that the TRR algorithm has higher accuracy for mining frequent itemsets.
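The perturb-then-aggregate flow described above can be sketched with plain per-bit randomized response on a one-hot bitmap. This is an illustration under stated assumptions, not the TRR algorithm of the abstract: it uses the common symmetric scheme in which each bit is kept with probability e^(ε/2) / (1 + e^(ε/2)), which gives ε-local differential privacy for one-hot inputs, and all function names are hypothetical. It assumes ε > 0.

```python
import math
import random
from typing import List, Sequence


def encode(value: str, domain: Sequence[str]) -> List[int]:
    """Bitmap (one-hot) encoding of a single attribute value."""
    return [1 if value == d else 0 for d in domain]


def keep_probability(epsilon: float) -> float:
    """Probability of reporting a bit unchanged (symmetric per-bit scheme)."""
    half = math.exp(epsilon / 2.0)
    return half / (1.0 + half)


def perturb(bits: Sequence[int], epsilon: float, rng: random.Random) -> List[int]:
    """User side: flip each bit independently with probability 1 - p."""
    p = keep_probability(epsilon)
    return [b if rng.random() < p else 1 - b for b in bits]


def estimate_counts(reports: Sequence[Sequence[int]], epsilon: float) -> List[float]:
    """Aggregator side: unbiased count estimate for each bitmap position."""
    n = len(reports)
    p = keep_probability(epsilon)
    q = 1.0 - p  # probability that a 0 bit is reported as 1
    width = len(reports[0])
    return [(sum(r[i] for r in reports) - n * q) / (p - q) for i in range(width)]
```

A user would call `perturb(encode(value, domain), epsilon, rng)` and send only the result; the aggregator sees noisy bitmaps and recovers approximate item frequencies with `estimate_counts`, which could then feed a frequent itemset miner.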
Abstract: In this paper we propose an efficient hybrid algorithm, WDHP, for mining frequent access patterns. WDHP adopts the techniques of DHP to optimize its performance, namely using a hash table to filter the candidate set and trimming the database. Whenever the database has been trimmed to a size below a specified threshold, the algorithm loads it into main memory by constructing a tree and finds the frequent patterns on the tree. The experiments show that WDHP outperforms both DHP and the main-memory-based algorithm WAP in execution efficiency.
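The hash-table filtering that WDHP borrows from DHP can be sketched for the 2-itemset level as follows. This is a simplified illustration, not the WDHP implementation: database trimming and the tree phase are omitted, and the bucket function, the bucket count and the toy data are assumptions.

```python
from collections import Counter
from itertools import combinations
from typing import List, Sequence, Tuple


def bucket_of(pair: Tuple[str, str], n_buckets: int) -> int:
    """Deterministic toy hash for an item pair (an assumption, not DHP's hash)."""
    return sum(ord(ch) for item in pair for ch in item) % n_buckets


def dhp_candidate_pairs(transactions: Sequence[Sequence[str]], min_count: int,
                        n_buckets: int = 7) -> List[Tuple[str, str]]:
    item_counts: Counter = Counter()
    buckets = [0] * n_buckets
    # Single scan: count items and hash every 2-subset into a bucket.
    for t in transactions:
        items = sorted(set(t))
        item_counts.update(items)
        for pair in combinations(items, 2):
            buckets[bucket_of(pair, n_buckets)] += 1
    frequent_items = sorted(i for i, c in item_counts.items() if c >= min_count)
    # A pair stays a candidate only if its bucket total reaches min_count.
    # Bucket totals over-count, so false positives are possible but a truly
    # frequent pair is never discarded.
    return [pair for pair in combinations(frequent_items, 2)
            if buckets[bucket_of(pair, n_buckets)] >= min_count]


# Toy database (hypothetical data).
db = [["a", "b", "c"], ["a", "b"], ["a", "c", "d"], ["b", "e"], ["a", "b", "d"]]
candidates = dhp_candidate_pairs(db, min_count=2)
```

Compared with pairing all frequent items, the bucket check prunes some pairs before the counting scan; the remaining candidates still have to be counted exactly to remove collisions.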
Funding: Funded by the Natural Science Foundation of Chongqing Municipality (Grant No. CSTC2021JCYJMSXMX0558), the National Key R&D Program of China (Grant No. 2018YFC1504802), and the Fundamental Research Funds for the Central Universities (Project No. 2019CDCG0013).
Abstract: Debris slopes are widely distributed across the Three Gorges Reservoir area in China, and seasonal fluctuations of the water level in the area tend to cause high-frequency microseisms that subsequently induce landslides on such debris slopes. In this study, a cumulative damage model of a debris slope with varying slope characteristics under the effects of frequent microseisms was established, based on an accurate definition of slope damage variables. The cumulative damage behaviour and the mechanisms of slope instability and sliding under frequent microseisms were then systematically investigated through a series of shaking table tests and discrete element numerical simulations, and the influences of related parameters such as bedrock, dry density and stone content were discussed. The results showed that the instability process of a debris slope can be divided into a vibration-compaction stage, a crack generation stage, a crack development stage, and an instability stage. Under the action of frequent microseisms, the debris slope undergoes the last three stages cyclically, which causes the accumulation to slide out in layers under the synergistic action of tension and shear, destabilising the slope. There are two sliding surfaces, as well as parallel tensile surfaces, in the final instability of the debris slope. During the instability process, the development trend of the damage accumulation curve remains similar for debris slopes with different parameters. However, the initial vibration compaction effect in the bedrock-free model is stronger than that in the bedrock model, and the overall cumulative damage degree in the former is lower than in the latter. The damage degree of a debris slope with high dry density also develops more slowly than that of a debris slope with low dry density. The damage development rate of the debris slope does not always decrease with increasing stone content. The damage degree growth rate of the debris slope with the optimal stone content is the lowest, and either increasing or decreasing the stone content makes the debris slope instability happen earlier. The numerical simulation study further reveals that the damage in the debris slope mainly develops in the form of crack formation and penetration, with shear failure occurring more frequently in the debris slope.