Currently, top-rank-k mining is widely applied to mine frequent patterns with a rank not exceeding k. In existing algorithms, although a level-wise search can mine all target patterns, it usually delays the generation of high-rank patterns, which slows the growth of the support threshold and lowers mining efficiency. To address this problem, this paper proposes a greedy-strategy-based top-rank-k frequent pattern hybrid mining algorithm (GTK). In this algorithm, top-rank-k patterns are stored in a static doubly linked list called RSL, and the patterns are divided into short patterns and long patterns. Short patterns are generated by a rank-first search that always joins the two highest-ranked patterns in RSL that have not yet been joined. Long patterns are then extracted through a level-wise search, starting from the short patterns that satisfy specific conditions. To reduce redundancy, GTK improves the generation method of the subsume index and designs new pruning strategies for candidates. The algorithm also uses these pruning strategies to reduce the amount of computation and improve computational speed. Real and synthetic datasets are used in experiments to evaluate the proposed algorithm. The experimental results show that GTK has clear advantages in both time efficiency and space efficiency.
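To make the notion of "rank not exceeding k" concrete, the following is a minimal brute-force sketch, not the GTK algorithm itself. It assumes the usual definition in which patterns are ranked by their distinct support values in descending order, so patterns with equal support share a rank. The function name top_rank_k and the toy transactions are hypothetical and chosen only for illustration.

```python
from collections import Counter
from itertools import combinations


def top_rank_k(transactions, k):
    """Return {rank: (support, [patterns])} for ranks 1..k.

    Assumption: rank r is assigned to the r-th largest distinct support
    value, so patterns with the same support share the same rank.
    This enumerates every sub-pattern exhaustively and is only
    practical for tiny inputs.
    """
    support = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))
        for size in range(1, len(items) + 1):
            for pattern in combinations(items, size):
                support[pattern] += 1

    # Keep the k largest distinct support values.
    distinct = sorted(set(support.values()), reverse=True)[:k]
    return {
        rank: (value, sorted(p for p, s in support.items() if s == value))
        for rank, value in enumerate(distinct, start=1)
    }


# Hypothetical toy database of five transactions.
transactions = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"b", "c"},
    {"a"},
]

result = top_rank_k(transactions, 2)
print(result)
```

Here the pattern (a) has support 4 and takes rank 1, while (b) and (c) both have support 3 and share rank 2. A practical miner avoids the exhaustive enumeration by raising the support threshold as higher-ranked patterns are found, which is the effect the abstract attributes to generating high-rank patterns early.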
Mining erasable patterns (EPs) is a data mining problem that can help factory managers devise the best product plans for the future. This problem has been widely studied in recent years, and many approaches for mining EPs have been proposed. Erasable closed patterns (ECPs) are a condensed representation of EPs without information loss. Current methods for mining ECPs identify huge numbers of such patterns, whereas intelligent systems need only a small number. A ranking process therefore has to be applied before the patterns can be used, which reduces efficiency. To overcome this limitation, this study presents a robust method for mining top-rank-k ECPs in which the mining and ranking phases are combined into a single step. First, we propose a virtual-threshold-based pruning strategy to improve the mining speed. Based on this strategy and the dPidset structure, we then develop a fast algorithm for mining top-rank-k ECPs, which we call TRK-ECP. Finally, we carry out experiments to compare the runtime of TRK-ECP with two algorithms modified from dVM and TEPUS (Top-rank-k Erasable Pattern mining Using the Subsume concept), which are state-of-the-art algorithms for mining top-rank-k EPs. The runtime results confirm that TRK-ECP outperforms the other approaches in mining top-rank-k ECPs.
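As an illustration of what is being ranked, the sketch below computes top-rank-k erasable patterns by brute force. It is not TRK-ECP, and it does not filter for closedness. It assumes the common formulation in which each product is a set of component items with a profit, the gain of a pattern is the total profit of all products that use at least one of its items, and patterns are ranked by distinct gain values in ascending order. The names gain and top_rank_k_erasable and the product data are hypothetical.

```python
from itertools import combinations


def gain(pattern, products):
    """Total profit of every product that uses at least one item of the pattern."""
    return sum(profit for items, profit in products if items & pattern)


def top_rank_k_erasable(products, k):
    """Return {rank: (gain, [patterns])} for ranks 1..k.

    Assumption: rank r is assigned to the r-th smallest distinct gain,
    so the cheapest patterns to erase come first. Exhaustive
    enumeration, suitable only for tiny inputs.
    """
    all_items = sorted(set().union(*(items for items, _ in products)))
    gains = {}
    for size in range(1, len(all_items) + 1):
        for combo in combinations(all_items, size):
            gains[combo] = gain(frozenset(combo), products)

    # Keep the k smallest distinct gain values.
    distinct = sorted(set(gains.values()))[:k]
    return {
        rank: (value, sorted(p for p, g in gains.items() if g == value))
        for rank, value in enumerate(distinct, start=1)
    }


# Hypothetical product database: (component items, profit).
products = [
    (frozenset({"a", "b"}), 100),
    (frozenset({"b", "c"}), 50),
    (frozenset({"c"}), 20),
    (frozenset({"a", "d"}), 10),
]

print(top_rank_k_erasable(products, 3))
```

In this toy database, erasing item d costs only the 10 units of profit from the last product, so (d) takes rank 1, followed by (c) with gain 70 and (c, d) with gain 80.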
Cloud computing is one of the most important recent innovations in information technology, and it underpins large-scale data storage, processing, and distribution. The main concerns of any data owner today are the security and privacy of data, especially when private data is outsourced to a public cloud server, which is not a well-trusted or reliable domain. To avoid leakage or disclosure, important or confidential information is encrypted before it is uploaded to the server, but encryption becomes an obstacle to supporting efficient keyword queries that return ranked matching results over the encrypted data. Recent research in this area has focused on single-keyword queries without a proper ranking scheme. In this paper, we propose a new model called Secure Model for Preserving Privacy Over Encrypted Cloud Computing (SPEC) to improve the performance of cloud computing and to safeguard data privacy. We compare it with the results of previous research with regard to accuracy, privacy, security, key generation, storage capacity, trapdoor generation, index generation, index encryption, index update, and file retrieval depending on access frequency.
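The general idea of querying an index through keyword trapdoors can be sketched as follows. This is a generic illustration and not the SPEC model, whose construction the abstract does not describe. It assumes a keyed HMAC over each keyword as the trapdoor and ranks matches by a hypothetical per-file access count. Deterministic tokens like these reveal which queries repeat, so the sketch is not a secure scheme.

```python
import hashlib
import hmac
from collections import defaultdict


def trapdoor(key: bytes, keyword: str) -> str:
    """Derive a keyed token for a keyword; the server never sees the keyword."""
    return hmac.new(key, keyword.lower().encode("utf-8"), hashlib.sha256).hexdigest()


def build_index(key: bytes, documents: dict) -> dict:
    """Map each keyword token to the set of file ids containing that keyword."""
    index = defaultdict(set)
    for file_id, keywords in documents.items():
        for keyword in keywords:
            index[trapdoor(key, keyword)].add(file_id)
    return index


def search(index: dict, access_count: dict, token: str) -> list:
    """Return matching file ids, most frequently accessed first (ties by id)."""
    matches = index.get(token, set())
    return sorted(matches, key=lambda f: (-access_count.get(f, 0), f))


# Hypothetical data: the key stays with the data owner.
key = b"demo-key"
documents = {"f1": ["cloud", "privacy"], "f2": ["cloud"], "f3": ["index"]}
access_count = {"f1": 2, "f2": 5}

index = build_index(key, documents)
print(search(index, access_count, trapdoor(key, "Cloud")))
```

The server holds only the token-to-file mapping and the access counts. A user who knows the key computes the token locally and sends it in place of the plaintext keyword.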
Funding: This research was supported in part by the Hunan Province's Strategic and Emerging Industrial Projects under Grant 2018GK4035, in part by the Hunan Province's Changsha Zhuzhou Xiangtan National Independent Innovation Demonstration Zone projects under Grant 2017XK2058, in part by the National Natural Science Foundation of China under Grant 61602171, and in part by the Scientific Research Fund of Hunan Provincial Education Department under Grants 17C0960 and 18B037.