BACKGROUND: It is increasingly common to find patients affected by a combination of type 2 diabetes mellitus (T2DM) and coronary artery disease (CAD), and studies have correlated the two conditions using the available biological and clinical evidence. The aim of the current study was to apply association rule mining (ARM) to discover whether there are consistent patterns of clinical features relevant to these diseases. ARM leverages clinical and laboratory data to find meaningful patterns for diabetic CAD, harnessing data-driven algorithms to optimise decision-making in patient care. AIM: To reinforce the evidence of the T2DM-CAD interplay and demonstrate the ability of ARM to provide new insights into multivariate pattern discovery. METHODS: This cross-sectional study was conducted at the Department of Biochemistry in a specialized tertiary care centre in Delhi, involving a total of 300 consenting subjects categorized into three groups: CAD with diabetes, CAD without diabetes, and healthy controls, with 100 subjects in each group. The participants were enrolled from the Cardiology IPD & OPD for sample collection. The study employed the ARM technique to extract meaningful patterns and relationships from the clinical data in its original values. RESULTS: The clinical dataset comprised 35 attributes from enrolled subjects. The analysis produced rules with a maximum branching factor of 4 and a rule length of 5, necessitating a 1% probability increase for enhancement. Prominent patterns emerged, highlighting strong links between health indicators and diabetes likelihood, particularly elevated HbA1C and random blood sugar levels. The ARM technique identified that individuals with a random blood sugar level > 175 and HbA1C > 6.6 are likely in the "CAD-with-diabetes" group, offering valuable insights into health indicators and influencing factors on disease outcomes. CONCLUSION: The application of this method holds promise for healthcare practitioners, offering valuable insights for enhancing patient treatment targeting specific subtypes of CAD with diabetes. By applying artificial intelligence techniques to medical data, we have shown the potential for personalized healthcare and the development of user-friendly applications aimed at improving cardiovascular health outcomes for this high-risk population.
To discover the main causes of elevator group accidents in an edge computing environment, a multi-dimensional data model of elevator accident data is established using data cube technology. A method combining the classical Apriori algorithm with this model is proposed and implemented to mine frequent items from elevator accident data and explore the main reasons for elevator accidents. In addition, a collaborative edge model of elevator accidents is set up to achieve data sharing, making it possible to check the details of each cause and confirm the causes of elevator accidents. Lastly, the association rules are applied to find the regularities governing elevator accidents.
Maximum frequent pattern generation from a large database of transactions and items for association rule mining is an important research topic in data mining. Association rule mining aims to discover interesting correlations, frequent patterns, associations, or causal structures between items hidden in a large database. By exploiting quantum computing, we propose an efficient quantum search algorithm design to discover the maximum frequent patterns. We modify Grover's search algorithm so that a subspace of arbitrary symmetric states is used instead of the whole search space. We present a novel quantum oracle design that employs a quantum counter to count the maximum frequent items and a quantum comparator to check the count against a minimum support threshold. The proposed algorithm increases the rate of correct solutions because the search is restricted to a subspace. Furthermore, our algorithm scales well and optimizes the number of qubits required in the design, which directly benefits performance. Our proposed design can accommodate more transactions and items and still perform well with a small number of qubits.
Although association rule mining is an important pattern recognition and data analysis technique, extracting and finding significant rules from a large collection has always been challenging. The ability of information visualization to enable users to gain an understanding of high-dimensional and large-scale data can play a major role in the exploration, identification, and interpretation of association rules. In this paper, we propose a method that provides multiple views of the association rules, linked together through a filtering mechanism. A visual inspection of the entire association rule set is enabled within a matrix view. Items of interest can be selected, resulting in their corresponding association rules being shown in a graph view. At any time, individual rules can be selected in either view, resulting in their information being shown in the detail view. The fundamental premise in this work is that by providing such a visual and interactive representation of the association rules, users will be able to find important rules quickly and easily, even as the number of rules that must be inspected becomes large. A user evaluation was conducted which validates this premise.
The typical model, which involves the measures support, confidence, and interest, is often adopted for mining association rules. In this model, the related parameters are usually chosen by experience; consequently, the number of useful rules is hard to estimate. If the number is too large, we cannot effectively extract the meaningful rules. This paper analyzes the meanings of the parameters and uses regression to design a variety of equations relating the number of rules to the parameters. Finally, we experimentally obtain a preferable regression equation. This paper uses multiple correlation coefficients to test the fitting effects of the equations and uses a significance test to verify whether the coefficients of the parameters differ significantly from zero. The regression equation with the larger multiple correlation coefficient is chosen as the optimally fitted equation. With the selected optimal equation, we can predict the number of rules under given parameters, further optimize the choice of the three parameters, and determine their ranges of values.
As data mining is applied more and more widely in computer systems, quality assurance testing of data mining software is receiving increasing attention. However, because of the 'oracle' problem, traditional test methods are not easily applied to application programs in the data mining field. In this paper, a software testing method based on metamorphic testing is proposed for the data mining field; it takes an association rules algorithm as the specific case and constructs metamorphic relations on the algorithm. Experiments show that the method can achieve the testing target and is feasible to apply to other domains.
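A metamorphic relation for an association rules algorithm can be pictured with the minimal sketch below. The two relations shown (shuffling transaction order, duplicating the dataset) and the brute-force `frequent_itemsets` function standing in for the program under test are illustrative assumptions; the abstract does not state which relations the paper constructs.

```python
from itertools import combinations
import random


def frequent_itemsets(transactions, min_support):
    """Brute-force miner acting as the program under test (hypothetical stand-in)."""
    n = len(transactions)
    items = sorted({item for t in transactions for item in t})
    result = {}
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            count = sum(1 for t in transactions if set(candidate) <= set(t))
            support = count / n
            if support >= min_support:
                result[frozenset(candidate)] = support
    return result


def check_metamorphic_relations(transactions, min_support, seed=0):
    """Return True if both follow-up runs agree with the source run."""
    source = frequent_itemsets(transactions, min_support)

    # MR1 (assumed): permuting the transaction order must not change the output.
    shuffled = list(transactions)
    random.Random(seed).shuffle(shuffled)
    mr1 = frequent_itemsets(shuffled, min_support) == source

    # MR2 (assumed): duplicating every transaction keeps relative supports unchanged.
    doubled = list(transactions) + list(transactions)
    mr2 = frequent_itemsets(doubled, min_support) == source

    return mr1 and mr2
```

No expected output has to be known in advance: the check only compares the source run with the follow-up runs, which is how metamorphic testing sidesteps the oracle problem.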
Discovering cyclic generalized association rules from transaction databases can reveal the relationships between different levels of the taxonomies and display cyclic variations over time. Information about such variations is of great use for better identifying trends in associations and for forecasting. Because cyclic rules are quite sensitive to even a little noise, this paper uses the noise ratio as the criterion for identifying cyclic itemsets to deal with the problem, and utilizes the cycle-pruning technique to reduce the computing time of the data mining process by exploiting the relationship between the cycle and generalized frequent itemsets. The paper gives an algorithm for mining cyclic generalized itemsets (CGI). Experiments show that the CGI algorithm can efficiently yield results.
Association rule mining methods, as a set of important data mining tools, can be used for mining spatial association rules from spatial data. However, applications of these methods are limited because the mining results contain a large number of redundant rules. In this paper, a new method named Geo-Filtered Association Rules Mining (GFARM) is proposed to effectively eliminate the redundant rules. GFARM is applied in a case study in which association rules are discovered between building land distribution and potential driving factors in Wuhan, China from 1995 to 2015. Ten sets of regular sampling grids with different sizes are used to detect the influence of multiple scales on GFARM. Results show that the proposed method can filter 50%–70% of redundant rules. GFARM is also successful in discovering spatial association patterns between building land distribution and driving factors.
Data-mining techniques have been developed to turn data into useful task-oriented knowledge. Most algorithms for mining association rules identify relationships among transactions using binary values and find rules at a single concept level. Extracting multilevel association rules from transaction databases is commonly used in data mining. This paper proposes a multilevel fuzzy association rule mining model for extracting implicit knowledge that is stored as quantitative values in transactions. To this end, it uses a different support value at each level as well as a different membership function for each item. By integrating fuzzy-set concepts, data-mining technologies and multiple-level taxonomy, our method finds fuzzy association rules from transaction data sets. The method adopts a top-down, progressively deepening approach to derive large itemsets and incorporates fuzzy boundaries instead of sharp boundary intervals. A comparison with previous methods in simulation shows that the proposed method maintains higher precision, that the mined rules are closer to reality, and that it provides the ability to mine association rules at different levels according to the user's preference.
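The idea of replacing sharp interval boundaries with fuzzy ones can be illustrated with a minimal sketch. The triangular membership shape, the use of the minimum to combine memberships across items, and the names `triangular` and `fuzzy_support` are assumptions chosen for illustration; the abstract does not specify the membership functions or the support definition used in the paper.

```python
def triangular(x, a, b, c):
    """Triangular membership: 0 at a and c, 1 at the peak b (requires a < b < c)."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def fuzzy_support(records, regions):
    """Average, over all records, of the minimum membership across the regions.

    records: list of dicts mapping attribute name -> quantitative value
    regions: dict mapping attribute name -> (a, b, c) triangle parameters
    """
    total = 0.0
    for record in records:
        total += min(
            triangular(record[attr], *abc) for attr, abc in regions.items()
        )
    return total / len(records)
```

A record whose quantity sits near a region's edge contributes a partial membership instead of being counted as fully in or fully out, which is the practical effect of a fuzzy boundary.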
The Apriori algorithm is a classical method of association rule mining. Based on an analysis of this algorithm, the paper provides an improved Apriori algorithm. The proposed algorithm combines a hash table technique with reduction of candidate itemsets to enhance the efficiency of resource usage as well as the individualized service of the data library.
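For reference, the classical level-wise Apriori procedure that the paper sets out to improve can be sketched as follows. This is a minimal textbook version under the assumption of an absolute support count threshold; it does not reproduce the paper's hash table technique or its specific candidate reduction, and the function name `apriori` is only illustrative.

```python
from itertools import combinations


def apriori(transactions, min_count):
    """Classical level-wise Apriori; returns {itemset: absolute support count}."""
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count single items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_count}
    frequent = dict(current)

    k = 2
    while current:
        prev = set(current)
        # Join step: unite pairs of frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }
        # Count step: one scan of the transactions per level.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        current = {s: c for s, c in counts.items() if c >= min_count}
        frequent.update(current)
        k += 1
    return frequent
```

The prune step is where candidate reduction pays off: a candidate is discarded without any database scan as soon as one of its subsets is known to be infrequent.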
The traditional generalization-based knowledge discovery method is introduced. A new multilevel spatial association rule mining method based on the cloud model is presented. The cloud model integrates the vagueness and randomness of linguistic terms in a unified way. With these models, spatial and nonspatial attribute values are well generalized at multiple levels, allowing discovery of strong spatial association rules. Combining the cloud-model-based method with Apriori algorithms for mining association rules from a spatial database proves to be effective and flexible.
Based on rough set theory, which is a powerful tool for dealing with vagueness and uncertainty, an algorithm to mine association rules in incomplete information systems is presented, and support and confidence are redefined. The algorithm can mine association rules with decision attributes directly, without processing missing values. Using the incomplete Mushroom dataset from the UCI machine learning repository, the new algorithm is compared with the classical Apriori-based association rule mining algorithm in terms of the number of rules extracted, testing accuracy and execution time. The experimental results show that the new algorithm has the advantages of short execution time and high accuracy.
At present, most association rule algorithms are based on Boolean attributes and single-level association rule mining. However, real-world data comes in various types, and multi-level and quantitative attributes are receiving more and more attention. The most important step is mining frequent sets. In this paper, we propose an algorithm called fuzzy multiple-level association (FMA) rules to mine frequent sets. It is based on an improved Eclat algorithm, unlike many previously proposed algorithms, which use the Apriori algorithm. We analyze the frequent sets of quantitative data using fuzzy theory, dividing the concept hierarchy and softening the boundaries of attribute values and frequency. We use vertical-style data and the improved Eclat algorithm to describe the proposed method, and apply the algorithm to analyze Beijing logistics route data. Experiments show that the algorithm performs well, with better effectiveness and high efficiency.
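The vertical data layout that Eclat relies on can be sketched as below. This is a minimal version of plain Eclat under the assumption of crisp (non-fuzzy) items and an absolute support count; the paper's improvements and its fuzzy, multiple-level extensions are not reproduced, and the function name `eclat` is illustrative.

```python
def eclat(transactions, min_count):
    """Depth-first Eclat over tidsets; returns {itemset: support count}."""
    # Vertical layout: item -> set of ids of the transactions containing it.
    vertical = {}
    for tid, t in enumerate(transactions):
        for item in t:
            vertical.setdefault(item, set()).add(tid)

    frequent = {}

    def extend(prefix, candidates):
        # candidates: list of (item, tidset) pairs that can extend `prefix`.
        for i, (item, tids) in enumerate(candidates):
            itemset = prefix | {item}
            frequent[frozenset(itemset)] = len(tids)
            # Intersect tidsets with the remaining items to grow the itemset.
            nxt = []
            for other, other_tids in candidates[i + 1:]:
                shared = tids & other_tids
                if len(shared) >= min_count:
                    nxt.append((other, shared))
            if nxt:
                extend(itemset, nxt)

    start = sorted(
        ((item, tids) for item, tids in vertical.items() if len(tids) >= min_count),
        key=lambda pair: pair[0],
    )
    extend(set(), start)
    return frequent
```

Support counting here is a set intersection of transaction-id lists instead of a repeated scan of the database, which is the main contrast with Apriori's horizontal layout.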
Extracting objects from legacy systems is a basic step in a system's object orientation to improve the maintainability and understandability of the system. A new object extraction model using association rules and dependence analysis is proposed. In this model, data are classified by association rules and the corresponding operations are partitioned by dependence analysis.
We discuss the basic intrusion detection techniques and focus on how to apply association rules to intrusion detection. Beginning with an analysis of some close relations between users' behaviors, we discuss the mining algorithm for association rules and apply it to anomaly detection in intrusion detection systems (IDS). Moreover, according to the characteristics of intrusion detection, we optimize the association rule mining algorithm and use fuzzy logic to improve system performance.
Typical association rules consider only items enumerated in transactions. Such rules are referred to as positive association rules. Negative association rules also consider the same items, but in addition consider negated items (i.e., items absent from transactions). Negative association rules are useful in market-basket analysis to identify products that conflict with each other or products that complement each other. They are also very convenient for associative classifiers, which build their classification model from association rules. However, mining for such rules requires examining an exponentially large search space. Despite their usefulness, very few algorithms to mine them have been proposed to date. In this paper, an algorithm based on the FP-tree is presented to discover negative association rules.
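How a negated item enters the usual measures can be shown with a short sketch. It uses the standard identity supp(A and not B) = supp(A) - supp(A and B), computed by direct counting; it is not the FP-tree algorithm the paper presents, and the helper names are illustrative.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= set(t)) / len(transactions)


def negative_rule_metrics(transactions, antecedent, consequent):
    """Support and confidence of the rule: antecedent -> NOT consequent.

    "NOT consequent" means the consequent itemset is not fully present.
    """
    supp_a = support(transactions, antecedent)
    supp_ab = support(transactions, set(antecedent) | set(consequent))
    # Transactions with the antecedent but without the consequent.
    supp_neg = supp_a - supp_ab
    conf_neg = supp_neg / supp_a if supp_a else 0.0
    return supp_neg, conf_neg
```

Because the negative measures follow from positive supports, a miner that already holds positive itemset counts can derive them without another pass over the data.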
A new algorithm for mining quantitative association rules with standard SQL is presented. The association rules are evaluated with the sufficiency factor LS of subjective Bayesian reasoning. The algorithm is shown to be fast and effective through its application to the Lujiang insect pest database.
This paper devises a scheme that discovers the state association rules of a process object. The scheme aims to uncover the hidden close relationships between different links in the process object. We adopt a method based on difference and extremum to compute the timing. Clustering is used to classify the adjusted data, and the clusters are then associated. Based on the rules between clusters, we produce the rules between links. The association degree between each pair of links can then be determined, and association chains are easily obtained from these degrees. The state association rules derived from the association rules are the final results. Some industry guidance can be summarized directly from the state association rules, and this guidance can be applied to improve production and operational efficiency in related industries.
These days, health care systems such as pharmacies and drugstores normally produce high volumes of data. Consequently, utilizing data mining methods in health care systems has become a conventional process. In this research, the Apriori algorithm is applied to data obtained from the prescriptions ordered within a pharmacy. Ten association rules were obtained from the drugs listed in those prescriptions using the Apriori algorithm. The accuracy of these rules was also manually studied and reviewed by a physician. Among these association rules, Vitamin D and Calcium pills are the most interrelated medications, and Omeprazole and Metronidazole ranked second in terms of association. The results of this study provide useful feedback about associations among drugs.
Hotspots (active fires) indicate the spatial distribution of fires. Determining the influence factors for hotspot occurrence is essential so that fire events can be predicted from the characteristics of a given area. This study discovers the possible influence factors on the occurrence of fire events using the Apriori association rule algorithm in the study area of Rokan Hilir, Riau Province, Indonesia. The Apriori algorithm was applied to a forest fire dataset containing data on the physical environment (land cover, rivers, roads, and city centers), socio-economic factors (income source, population, and number of schools), weather (precipitation, wind speed, and screen temperature), and peatlands. The experimental results revealed 324 multidimensional association rules indicating relationships between hotspot occurrence and other factors. The associations between hotspot occurrence and other geographical objects were discovered for a minimum support of 10% and a minimum confidence of 80%. The results show that strong relations between hotspot occurrence and influence factors are found at a support of about 12.42%, a confidence of 1, and a lift of 2.26. These factors are precipitation greater than or equal to 3 mm/day, wind speed in [1 m/s, 2 m/s), non-peatland area, screen temperature in [297 K, 298 K), a number of schools per 1 km² less than or equal to 0.1, and a distance from each hotspot to the nearest road less than or equal to 2.5 km.
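The three measures quoted above can be related through their standard definitions. The block below is a sketch that assumes the study uses the conventional formulas, with $X$ the set of influence factors, $Y$ the hotspot occurrence, and $N$ the number of records; the abstract itself does not spell the formulas out.

```latex
\[
\operatorname{supp}(X \Rightarrow Y) = \frac{\lvert\{\,t : X \cup Y \subseteq t\,\}\rvert}{N},
\qquad
\operatorname{conf}(X \Rightarrow Y) = \frac{\operatorname{supp}(X \cup Y)}{\operatorname{supp}(X)},
\qquad
\operatorname{lift}(X \Rightarrow Y) = \frac{\operatorname{conf}(X \Rightarrow Y)}{\operatorname{supp}(Y)}
\]
% With the reported conf = 1 and lift = 2.26:
\[
\operatorname{supp}(Y) = \frac{\operatorname{conf}(X \Rightarrow Y)}{\operatorname{lift}(X \Rightarrow Y)}
= \frac{1}{2.26} \approx 0.442
\]
```

Under these assumptions, a confidence of 1 means every record matching the listed factors is a hotspot record, and a lift of 2.26 means such records are about 2.26 times as likely to be hotspots as a record drawn at random, whose baseline rate would be roughly 44%.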
Funding (regression-equation study): Supported by the National Natural Science Foundation of China (No. J07240003, No. 60773084, No. 60603023) and the National Research Fund for the Doctoral Program of Higher Education of China (No. 20070151009).
Funding (GFARM study): Under the auspices of the Special Fund of the Ministry of Land and Resources of China in Public Interest (No. 201511001).
文摘Association rule mining methods, as a set of important data mining tools, could be used for mining spatial association rules of spatial data. However, applications of these methods are limited for mining results containing large number of redundant rules. In this paper, a new method named Geo-Filtered Association Rules Mining(GFARM) is proposed to effectively eliminate the redundant rules. An application of GFARM is performed as a case study in which association rules are discovered between building land distribution and potential driving factors in Wuhan, China from 1995 to 2015. Ten sets of regular sampling grids with different sizes are used for detecting the influence of multi-scales on GFARM. Results show that the proposed method can filter 50%–70% of redundant rules. GFARM is also successful in discovering spatial association pattern between building land distribution and driving factors.
文摘Data-mining techniques have been developed to turn data into useful task-oriented knowledge. Most algorithms for mining association rules identify relationships among transactions using binary values and find rules at a single-concept level. Extracting multilevel association rules in transaction databases is most commonly used in data mining. This paper proposes a multilevel fuzzy association rule mining model for extraction of implicit knowledge which stored as quantitative values in transactions. For this reason it uses different support value at each level as well as different membership function for each item. By integrating fuzzy-set concepts, data-mining technologies and multiple-level taxonomy, our method finds fuzzy association rules from transaction data sets. This approach adopts a top-down progressively deepening approach to derive large itemsets and also incorporates fuzzy boundaries instead of sharp boundary intervals. Comparing our method with previous ones in simulation shows that the proposed method maintains higher precision, the mined rules are closer to reality, and it gives ability to mine association rules at different levels based on the user’s tendency as well.
Abstract: The Apriori algorithm is a classical method of association rule mining. Based on an analysis of this method, the paper presents an improved Apriori algorithm. The proposed algorithm combines a hash table technique with a reduction of candidate itemsets to improve resource utilization as well as the individualized service of the data library.
Abstract: The traditional generalization-based knowledge discovery method is introduced. A new multilevel spatial association rule mining method based on the cloud model is presented. The cloud model integrates the vagueness and randomness of linguistic terms in a unified way. With these models, spatial and nonspatial attribute values are generalized at multiple levels, allowing the discovery of strong spatial association rules. Combining the cloud-model-based method with the Apriori algorithm for mining association rules from a spatial database proves to be effective and flexible.
Funding: Projects (10871031, 60474070) supported by the National Natural Science Foundation of China; Project (07A001) supported by the Scientific Research Fund of Hunan Provincial Education Department, China
Abstract: Based on rough set theory, a powerful tool for dealing with vagueness and uncertainty, an algorithm for mining association rules in incomplete information systems is presented, and support and confidence are redefined. The algorithm can mine association rules with decision attributes directly, without preprocessing missing values. Using the incomplete Mushroom dataset from the UCI machine learning repository, the new algorithm was compared with the classical Apriori-based association rule mining algorithm in terms of the number of rules extracted, testing accuracy and execution time. The experimental results show that the new algorithm has a shorter execution time and higher accuracy.
Funding: Supported by the Fundamental Research Funds for the Central Universities under Grants No. ZYGX2014J051 and No. ZYGX2014J066; Science and Technology Projects in Sichuan Province under Grants No. 2015JY0178, No. 2016FZ0002, No. 2014GZ0109, No. 2015KZ002 and No. 2015JY0030; and the China Postdoctoral Science Foundation under Grant No. 2015M572464
Abstract: At present, most association rule algorithms are based on Boolean attributes and single-level association rule mining. Real-world data, however, come in various types, and multi-level and quantitative attributes are receiving more and more attention. The most important step is mining frequent sets. In this paper, we propose an algorithm called fuzzy multiple-level association (FMA) rules to mine frequent sets. It is based on an improved Eclat algorithm, unlike the algorithms proposed by many other researchers, which use the Apriori algorithm. We analyze the frequent sets of quantitative data using fuzzy theory, dividing the concept hierarchy and softening the boundaries of attribute values and frequencies. We describe the proposed method using vertical-format data and the improved Eclat algorithm, and we apply it to data on Beijing logistics routes. Experiments show that the algorithm performs well, with better effectiveness and high efficiency.
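As an illustration of the vertical data format mentioned in the abstract above, here is a minimal sketch of the plain Eclat idea: each item maps to the set of transaction IDs that contain it, and the support of a larger itemset is obtained by intersecting those ID sets. This is not the paper's improved or fuzzy variant; the function names (`to_vertical`, `eclat`) and the toy transactions are assumptions made for this example.

```python
from typing import Dict, FrozenSet, List, Set, Tuple


def to_vertical(transactions: List[Set[str]]) -> Dict[str, Set[int]]:
    """Convert horizontal transactions into item -> set of transaction IDs."""
    vertical: Dict[str, Set[int]] = {}
    for tid, items in enumerate(transactions):
        for item in items:
            vertical.setdefault(item, set()).add(tid)
    return vertical


def eclat(
    prefix: FrozenSet[str],
    items: List[Tuple[str, Set[int]]],
    min_count: int,
    out: Dict[FrozenSet[str], int],
) -> None:
    """Depth-first search that extends `prefix` by intersecting tid sets."""
    for i, (item, tids) in enumerate(items):
        if len(tids) < min_count:
            continue
        itemset = prefix | {item}
        out[itemset] = len(tids)  # support count = size of the tid set
        # Candidate extensions: later items that still co-occur often enough.
        suffix = []
        for other, other_tids in items[i + 1:]:
            shared = tids & other_tids
            if len(shared) >= min_count:
                suffix.append((other, shared))
        eclat(itemset, suffix, min_count, out)


# Hypothetical toy data, not taken from the paper.
transactions = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}]
frequent: Dict[FrozenSet[str], int] = {}
eclat(frozenset(), sorted(to_vertical(transactions).items()), 2, frequent)
```

With a minimum support count of 2, the three single items and the three pairs are frequent, while the triple `{a, b, c}` occurs in only one transaction and is pruned.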
Funding: Supported in part by the National Natural Science Foundation of China (60073012)
Abstract: Extracting objects from legacy systems is a basic step in making a system object-oriented, which improves its maintainability and understandability. A new object extraction model using association rules and dependence analysis is proposed. In this model, data are classified by association rules and the corresponding operations are partitioned by dependence analysis.
Abstract: We discuss the basic intrusion detection techniques and focus on how to apply association rules to intrusion detection. Beginning with an analysis of the close relations among users' behaviors, we discuss the mining algorithm for association rules and apply it to anomaly detection in an IDS. Moreover, according to the characteristics of intrusion detection, we optimize the mining algorithm and use fuzzy logic to improve system performance.
Funding: Supported by the National Natural Science Foundation of China (70371015) and the Science Foundation of Jiangsu University (04KJD001)
Abstract: Typical association rules consider only items enumerated in transactions. Such rules are referred to as positive association rules. Negative association rules consider the same items, but in addition consider negated items (i.e., items absent from transactions). Negative association rules are useful in market-basket analysis to identify products that conflict with each other or products that complement each other. They are also very convenient for associative classifiers, that is, classifiers that build their classification model from association rules. However, mining such rules requires examining an exponentially large search space. Despite their usefulness, very few algorithms for mining them have been proposed to date. In this paper, an algorithm based on the FP-tree is presented to discover negative association rules.
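To make the notion of a negated item in the abstract above concrete, the following sketch scores a rule of the form A → ¬b by counting the transactions that contain A but not the single item b. It is a brute-force illustration under assumed names (`negative_rule_metrics`, the toy `baskets`), not the FP-tree-based algorithm the paper presents.

```python
from typing import FrozenSet, List, Tuple


def negative_rule_metrics(
    transactions: List[FrozenSet[str]],
    antecedent: FrozenSet[str],
    absent_item: str,
) -> Tuple[float, float]:
    """Return (support, confidence) of the rule antecedent -> NOT absent_item."""
    n = len(transactions)
    # Transactions that contain every item of the antecedent.
    count_a = sum(1 for t in transactions if antecedent <= t)
    # Of those, the ones that also contain the item expected to be absent.
    count_ab = sum(
        1 for t in transactions if antecedent <= t and absent_item in t
    )
    count_a_not_b = count_a - count_ab
    support = count_a_not_b / n if n else 0.0
    confidence = count_a_not_b / count_a if count_a else 0.0
    return support, confidence


# Hypothetical market-basket data for illustration only.
baskets = [
    frozenset({"coffee", "sugar"}),
    frozenset({"coffee", "sugar"}),
    frozenset({"coffee"}),
    frozenset({"coffee", "tea"}),
    frozenset({"tea", "sugar"}),
    frozenset({"tea"}),
]
neg_support, neg_confidence = negative_rule_metrics(
    baskets, frozenset({"coffee"}), "tea"
)
```

Here coffee appears in four of six baskets and three of those lack tea, so the rule coffee → ¬tea has support 0.5 and confidence 0.75. The confidence of the negative rule is one minus the confidence of the positive rule coffee → tea.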
Abstract: A new algorithm for mining quantitative association rules with standard SQL is presented. The association rules are evaluated with the sufficiency factor LS of subjective Bayesian reasoning. Its application to the Lujiang insects and pests database shows the algorithm to be fast and effective.
Abstract: This paper devises a scheme for discovering the state association rules of a process object. The scheme aims to uncover hidden close relationships among the different links of the process object. We adopt a method based on difference and extremum to compute the timing. Clustering is used to classify the adjusted data, and the clusters are then associated. From the rules between clusters, we derive the rules between links. The association degree between each pair of links can be determined, and association chains are easily obtained from these degrees. The state association rules derived from the association rules are the final results. Industry guidance can be summarized directly from the state association rules and applied to improve the efficiency of production and operation in related industries.
Abstract: These days, health care systems such as pharmacies and drugstores normally produce high volumes of data. Consequently, using data mining methods in health care systems has become a conventional process. In this research, the Apriori algorithm was applied to data obtained from the prescriptions ordered within a pharmacy. Ten association rules were obtained from the pharmaceutical drugs assigned in those prescriptions. The accuracy of these rules was also manually studied and reviewed by a physician. Among these association rules, Vitamin D and Calcium pills are the most interrelated medications, and Omeprazole and Metronidazole ranked second in terms of association. The results of this study provide useful feedback about associations among drugs.
Abstract: Hotspots (active fires) indicate the spatial distribution of fires. Determining the factors that influence hotspot occurrence is essential so that fire events can be predicted from the characteristics of a given area. This study discovers possible influence factors on the occurrence of fire events using the Apriori association rule algorithm in the study area of Rokan Hilir, Riau Province, Indonesia. The Apriori algorithm was applied to a forest fire dataset containing data on the physical environment (land cover, river, road and city center), socio-economic conditions (income source, population, and number of schools), weather (precipitation, wind speed, and screen temperature), and peatlands. The experimental results revealed 324 multidimensional association rules indicating relationships between hotspot occurrence and other factors. The associations between hotspot occurrence and other geographical objects were discovered for a minimum support of 10% and a minimum confidence of 80%. The results show strong relations between hotspot occurrence and influence factors for a support of about 12.42%, a confidence of 1, and a lift of 2.26. These factors are precipitation greater than or equal to 3 mm/day, wind speed in [1 m/s, 2 m/s), non-peatland area, screen temperature in [297 K, 298 K), a number of schools per 1 km² less than or equal to 0.1, and a distance from each hotspot to the nearest road less than or equal to 2.5 km.
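The support, confidence and lift values quoted in the abstract above follow the standard definitions, which can be sketched as below. The helper names (`support`, `rule_metrics`) and the five toy records are assumptions made for illustration; they are not the study's dataset and do not reproduce its figures.

```python
from typing import FrozenSet, List, Tuple


def support(transactions: List[FrozenSet[str]], itemset: FrozenSet[str]) -> float:
    """Fraction of transactions that contain every item in `itemset`."""
    if not transactions:
        return 0.0
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def rule_metrics(
    transactions: List[FrozenSet[str]],
    antecedent: FrozenSet[str],
    consequent: FrozenSet[str],
) -> Tuple[float, float, float]:
    """Return (support, confidence, lift) of antecedent -> consequent."""
    s_a = support(transactions, antecedent)
    s_c = support(transactions, consequent)
    s_ac = support(transactions, antecedent | consequent)
    confidence = s_ac / s_a if s_a else 0.0
    # Lift > 1 means the consequent is more likely when the antecedent holds.
    lift = confidence / s_c if s_c else 0.0
    return s_ac, confidence, lift


# Hypothetical discretized records; labels are illustrative only.
records = [
    frozenset({"rain_ge_3mm", "non_peatland", "hotspot"}),
    frozenset({"rain_ge_3mm", "non_peatland", "hotspot"}),
    frozenset({"rain_ge_3mm", "peatland"}),
    frozenset({"non_peatland"}),
    frozenset({"peatland", "hotspot"}),
]
sup, conf, lift = rule_metrics(
    records, frozenset({"rain_ge_3mm", "non_peatland"}), frozenset({"hotspot"})
)
```

In this toy data the antecedent occurs in two of five records and both contain a hotspot, so the rule has support 0.4 and confidence 1. Because hotspots occur in three of five records overall, the lift is 1 / 0.6, roughly 1.67.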