BACKGROUND It is increasingly common to find patients affected by both type 2 diabetes mellitus (T2DM) and coronary artery disease (CAD), and studies have correlated the two conditions using available biological and clinical evidence. The aim of the current study was to apply association rule mining (ARM) to discover whether there are consistent patterns of clinical features relevant to these diseases. ARM uses clinical and laboratory data to find meaningful patterns for diabetic CAD, harnessing data-driven algorithms to optimise decision-making in patient care. AIM To reinforce the evidence of the T2DM-CAD interplay and demonstrate the ability of ARM to provide new insights into multivariate pattern discovery. METHODS This cross-sectional study was conducted at the Department of Biochemistry in a specialized tertiary care centre in Delhi, involving a total of 300 consenting subjects categorized into three groups: CAD with diabetes, CAD without diabetes, and healthy controls, with 100 subjects in each group. The participants were enrolled from the Cardiology IPD & OPD for sample collection. The study employed the ARM technique to extract meaningful patterns and relationships from the clinical data in its original values. RESULTS The clinical dataset comprised 35 attributes from enrolled subjects. The analysis produced rules with a maximum branching factor of 4 and a rule length of 5, requiring a 1% probability increase for enhancement. Prominent patterns emerged, highlighting strong links between health indicators and diabetes likelihood, particularly elevated HbA1C and random blood sugar levels. The ARM technique identified that individuals with a random blood sugar level > 175 and HbA1C > 6.6 are likely in the "CAD-with-diabetes" group, offering valuable insights into health indicators and influencing factors on disease outcomes. CONCLUSION The application of this method holds promise for healthcare practitioners, offering valuable insights for enhancing patient treatment targeting specific subtypes of CAD with diabetes. By applying artificial intelligence techniques to medical data, we have shown the potential for personalized healthcare and the development of user-friendly applications aimed at improving cardiovascular health outcomes for this high-risk population and optimising decision-making in patient care.
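To make the reported rule concrete, the sketch below shows how the support and confidence of a rule of the form "random blood sugar > 175 and HbA1C > 6.6 implies CAD-with-diabetes" could be computed. The records are synthetic and purely illustrative (they are not the study's data), and the function names `antecedent` and `rule_metrics` are hypothetical.

```python
# Synthetic, illustrative records only: (random blood sugar, HbA1C, group).
records = [
    (210, 7.9, "CAD-with-diabetes"),
    (190, 7.1, "CAD-with-diabetes"),
    (180, 6.8, "CAD-with-diabetes"),
    (185, 6.9, "CAD-without-diabetes"),
    (140, 5.6, "CAD-without-diabetes"),
    (120, 5.4, "CAD-without-diabetes"),
    (110, 5.2, "healthy"),
    (100, 5.0, "healthy"),
]


def antecedent(rbs, hba1c):
    """Rule body taken from the abstract: RBS > 175 and HbA1C > 6.6."""
    return rbs > 175 and hba1c > 6.6


def rule_metrics(rows, target="CAD-with-diabetes"):
    """Return (support, confidence) of the rule: antecedent -> target group."""
    n = len(rows)
    body = [r for r in rows if antecedent(r[0], r[1])]
    both = [r for r in body if r[2] == target]
    support = len(both) / n
    confidence = len(both) / len(body) if body else 0.0
    return support, confidence


support, confidence = rule_metrics(records)
print(f"support={support:.3f} confidence={confidence:.3f}")
```

On real data, the same two numbers would indicate how often the pattern occurs and how reliably it points to the group.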
Maximum frequent pattern generation from a large database of transactions and items for association rule mining is an important research topic in data mining. Association rule mining aims to discover interesting correlations, frequent patterns, associations, or causal structures between items hidden in a large database. By exploiting quantum computing, we propose an efficient quantum search algorithm design to discover the maximum frequent patterns. We modified Grover's search algorithm so that a subspace of arbitrary symmetric states is used instead of the whole search space. We present a novel quantum oracle design that employs a quantum counter to count the maximum frequent items and a quantum comparator to check them against a minimum support threshold. The proposed algorithm increases the rate of correct solutions because the search is restricted to a subspace. Furthermore, our algorithm scales well and optimizes the number of qubits required by the design, which is directly reflected in improved performance. Our proposed design can accommodate more transactions and items and still perform well with a small number of qubits.
Market trends have changed rapidly over the last two decades. The primary reasons are newly created opportunities and the increased number of competitors trying to gain market share using business analysis techniques. Market Basket Analysis (MBA) has a tangible effect in facilitating the current change in the market. It is one of the best-known fields dealing with Big Data and Data Mining applications. MBA uses Association Rule Learning (ARL) as a means of realization. ARL is beneficial in analyzing market data and understanding customers' behavior. An important motive for using such techniques is to maximize business profit while matching customer needs as closely as possible. In this survey paper, we discuss several applications and methods of MBA based on ARL. We also review some association rule learning measurements, including trust, lift, leverage, and others. Furthermore, we discuss some open issues and future topics in the area of market basket analysis and association rule learning.
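As an illustration of two of the measures named above, the following sketch computes lift and leverage for a rule A -> B using their standard definitions: lift = supp(A ∪ B) / (supp(A) · supp(B)) and leverage = supp(A ∪ B) − supp(A) · supp(B). The basket data and function names are made up for the example and do not come from the survey.

```python
# Toy baskets, invented for illustration.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
    {"bread", "milk"},
]


def support(itemset, txns):
    """Fraction of transactions that contain every item in `itemset`."""
    return sum(itemset <= t for t in txns) / len(txns)


def lift(a, b, txns):
    """Ratio of observed co-occurrence to that expected under independence."""
    return support(a | b, txns) / (support(a, txns) * support(b, txns))


def leverage(a, b, txns):
    """Difference between observed and expected co-occurrence."""
    return support(a | b, txns) - support(a, txns) * support(b, txns)


print(lift({"bread"}, {"milk"}, transactions))
print(leverage({"bread"}, {"milk"}, transactions))
```

A lift below 1 (or a negative leverage), as in this toy data, means the two items appear together slightly less often than independence would predict.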
To discover the main causes of elevator group accidents in an edge computing environment, a multi-dimensional data model of elevator accident data is established using data cube technology. A method combining the classical Apriori algorithm with this model is proposed and implemented to mine frequent items from elevator accident data and explore the main reasons for the occurrence of elevator accidents. In addition, a collaborative edge model of elevator accidents is set up to achieve data sharing, making it possible to check the details of each cause and confirm the causes of elevator accidents. Lastly, the association rules are applied to find the patterns underlying elevator accidents.
This paper aims to develop an algorithm for extracting association rules, called the Context-Based Association Rule Mining algorithm (CARM), which can be regarded as an extension of the Context-Based Positive and Negative Association Rule Mining algorithm (CBPNARM). CBPNARM was developed to extract positive and negative association rules from spatiotemporal (space-time) data only, while the proposed algorithm can be applied to both spatial and non-spatial data. The proposed algorithm is applied to an energy dataset to classify a country's energy development by uncovering the interdependencies among the set of variables to obtain positive and negative associations. The proposed algorithm extracts many association rules related to sustainable energy development, which need to be pruned by some pruning technique. The context in this paper serves as a pruning measure to extract pertinent association rules from non-spatial data. The Conditional Probability Increment Ratio (CPIR), which was not used in CBPNARM, is also added to the proposed algorithm. The inclusion of the context variable and CPIR resulted in fewer rules and improved robustness and ease of use. Also, the extraction of a common negative frequent itemset in CARM is different from that of CBPNARM. The rules created by the proposed algorithm are more meaningful, significant, relevant and insightful. The accuracy of the proposed algorithm is compared with the Apriori, PNARM and CBPNARM algorithms. The results demonstrated enhanced accuracy, relevance and timeliness.
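The abstract does not give the CPIR formula, so the sketch below uses a commonly cited definition from the positive/negative association rule literature as an assumption: for a rule X -> Y, CPIR = (P(Y|X) − P(Y)) / (1 − P(Y)) when P(Y|X) ≥ P(Y), and (P(Y|X) − P(Y)) / P(Y) otherwise. How CARM itself applies the measure may differ; the function name `cpir` is hypothetical.

```python
def cpir(p_y_given_x, p_y):
    """Conditional probability increment ratio for a rule X -> Y.

    Positive values suggest X raises the probability of Y, negative values
    suggest X lowers it, and 0 indicates independence. This follows one
    commonly cited definition and is an assumption, not CARM's exact form.
    """
    if not 0.0 < p_y < 1.0:
        raise ValueError("P(Y) must lie strictly between 0 and 1")
    diff = p_y_given_x - p_y
    if diff >= 0:
        # Positive dependence: scale by the maximum possible increase.
        return diff / (1.0 - p_y)
    # Negative dependence: scale by the maximum possible decrease.
    return diff / p_y


print(cpir(0.8, 0.5))
print(cpir(0.2, 0.5))
```

Under this definition the value lies in [-1, 1], which makes it usable as a pruning threshold alongside support.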
The typical model, which involves the measures support, confidence, and interest, is often adopted for mining association rules. In this model, the related parameters are usually chosen by experience; consequently, the number of useful rules is hard to estimate. If the number is too large, we cannot effectively extract the meaningful rules. This paper analyzes the meanings of the parameters and designs a variety of equations relating the number of rules to the parameters using regression. Finally, we experimentally obtain a preferable regression equation. This paper uses multiple correlation coefficients to test the fitting effects of the equations and uses significance tests to verify whether the coefficients of the parameters are significantly different from zero. The regression equation with the larger multiple correlation coefficient is chosen as the optimally fitted equation. With the selected optimal equation, we can predict the number of rules under the given parameters, further optimize the choice of the three parameters, and determine their ranges of values.
Data mining, i.e., mining knowledge from large amounts of data, is a demanding field because huge amounts of data have been collected in various applications. The collected data far exceed people's ability to analyze them. Thus, new and efficient methods are needed to discover knowledge from large databases. Association rule discovery is an important problem in knowledge discovery and data mining. The association mining task consists of identifying the frequent itemsets and then forming conditional implication rules among them. In this paper, we describe and summarize recent work on association rule discovery, offer a new method for association rule mining, and point out that association rule discovery can be applied in spatial data mining. It is useful for discovering knowledge from remote sensing and geographical information systems.
As data mining is applied more and more widely in computer systems, quality assurance testing of data mining software is receiving increasing attention. However, because of the "oracle" problem, traditional test methods are not well suited to application programs in the data mining field. In this paper, a software testing method for the data mining field is proposed based on metamorphic testing. It takes an association rules algorithm as the specific case and constructs metamorphic relations on the algorithm. Experiments show that the method can achieve the testing target and is feasible to apply to other domains.
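The idea of a metamorphic relation can be sketched as follows. Instead of knowing the correct output in advance, the test checks how outputs must relate across transformed inputs. The two relations below (shuffling the transactions leaves the result unchanged; duplicating every transaction doubles every count) are illustrative choices, not necessarily the relations constructed in the paper, and the brute-force miner is a stand-in for the program under test.

```python
import random
from itertools import combinations


def frequent_itemsets(txns, min_count):
    """Brute-force miner standing in for the program under test."""
    items = sorted(set().union(*txns))
    result = {}
    for k in range(1, len(items) + 1):
        for cand in combinations(items, k):
            count = sum(set(cand) <= t for t in txns)
            if count >= min_count:
                result[frozenset(cand)] = count
    return result


def check_permutation_relation(txns, min_count, seed=0):
    """Reordering transactions must not change the mined itemsets."""
    shuffled = list(txns)
    random.Random(seed).shuffle(shuffled)
    return frequent_itemsets(txns, min_count) == frequent_itemsets(shuffled, min_count)


def check_duplication_relation(txns, min_count):
    """Duplicating the data and doubling the threshold must double each count."""
    base = frequent_itemsets(txns, min_count)
    doubled = frequent_itemsets(txns + txns, 2 * min_count)
    return doubled == {k: 2 * v for k, v in base.items()}


data = [{"a", "b"}, {"a", "c"}, {"a", "b", "c"}, {"b"}]
print(check_permutation_relation(data, 2), check_duplication_relation(data, 2))
```

A violation of either relation would reveal a defect without requiring a test oracle for the exact expected rule set.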
Constipation is a common complication of stroke, and its incidence is increasing year by year, which merits attention. Massage has been recognized by doctors at home and abroad as an effective treatment for post-ischemic stroke constipation. However, in the known research reports, massage prescriptions are complicated; therefore, a simple and effective massage prescription is urgently needed to guide clinical practice and promote its use. In this study, we used association rule and entropy clustering analysis methods to mine the clinical literature on post-ischemic stroke constipation in 7 databases and, combining data analysis, traditional Chinese massage theory and clinical practice, summarized a new core prescription. The new core massage prescription for treating post-ischemic stroke constipation takes tonifying the spleen, nourishing Qi and generating body fluid, promoting Qi, invigorating blood circulation and eliminating phlegm as its principle of treatment. This accords with the pathogenesis of the disease, can better guide clinical practice, and can facilitate the popularization and application of massage therapy.
Aiming at research on using new knowledge to develop a knowledge system with dynamic consistency, and using the fuzzy language field and the fuzzy language value structure as the description framework, a generalized cell automaton that can synthetically process fuzzy uncertainty and random uncertainty, together with a generalized inductive logic causal model, is put forward. On this basis, a new method for discovering causal association rules is provided. According to the causal information of the standard sample space and the common sample space, and by constructing the state (abnormality) relation matrix, causal association rules can be obtained using an inductive reasoning mechanism. An estimate of the algorithm's complexity is given, and its validity is demonstrated through a case.
The rapid growth in the use of social media opens up new challenges and opportunities for analyzing various aspects and patterns of communication. In text mining, several techniques are available, such as information clustering, extraction, summarization, and classification. In this study, a text mining framework is presented that consists of four phases: retrieving, processing, indexing, and mining association rules. The association rule mining technique is applied to find the terms associated with the Huawei P30 Pro phone. Customer reviews are extracted from many websites and Facebook groups, such as re-view.cnet.com, CNET, Facebook and amazon.com technology, where customers from all over the world posted their notes on cell phones. In this analysis, a total of 192 reviews of the Huawei P30 Pro were collected and evaluated by text mining techniques. The findings show that the Huawei P30 Pro has strong points such as the best safety, a high-quality camera, a battery that lasts more than 24 hours, and a very fast processor. This paper aims to show that text mining reduces human effort by recognizing significant documents. This can improve customers' awareness when choosing products and, at the same time, let sales managers know how their products were received by customers.
The exchange of information is an innate and natural process that assists in content dispersal. Social networking sites have emerged to enrich their users by providing facilities for sharing information and social interaction. The extensive adoption of social networking sites has also resulted in user content generation. Researchers have explored diverse research areas to investigate the influence of social media on users and have confirmed that social media sites have a significant impact on markets, politics and social life. Facebook is an extensively used platform for sharing information, thoughts and opinions through posts and comments. The identification of influential users on the social web has become a hot research field because of its vast applications in diverse areas, for instance political campaigns, marketing, e-commerce, and commerce. Prior research studies use either linguistic content or a graph-based representation of the social network to detect influential users. In this article, we incorporate association rule mining algorithms to identify the top influential users through frequent patterns. The association rules have been computed using standard evaluation measures such as support, confidence, lift, and conviction. To verify the results, we also use conventional metrics, for example accuracy, precision, recall and F1-measure, from the association rules perspective. Detailed experiments are carried out using the benchmark College-Msg dataset extracted from Facebook. The obtained results validate the quality and visibility of the proposed approach. The outcome of the proposed model verifies that association rule mining is able to generate rules that identify the temporal influential users on Facebook who are consistently active on a regular basis. The preparation of the rule set helps to create knowledge-based systems, which are efficient and widely used in the recent era for decision making to solve real-world problems.
Association rule learning is a machine learning method used to find underlying associations in large datasets. Whether intentionally or unintentionally present, noise in training instances causes overfitting while building the classifier and negatively impacts classification accuracy. This paper applies instance reduction techniques to the datasets before mining the association rules and building the classifier. Instance reduction techniques were originally developed to reduce memory requirements in instance-based learning. This paper utilizes them to remove noise from the dataset before training the association rules classifier. Extensive experiments were conducted to assess the accuracy of association rules with different instance reduction techniques, namely Decremental Reduction Optimization Procedure (DROP) 3, DROP5, ALL K-Nearest Neighbors (ALLKNN), Edited Nearest Neighbor (ENN), and Repeated Edited Nearest Neighbor (RENN), at different noise ratios. Experiments show that instance reduction techniques substantially improved the average classification accuracy at three different noise levels: 0%, 5%, and 10%. The RENN algorithm achieved the highest levels of accuracy, with a significant improvement on seven of the eight datasets used from the University of California Irvine (UCI) machine learning repository. The improvements were more apparent in the 5% and 10% noise cases. When RENN was applied, the average classification accuracy for the eight datasets in the zero-noise test improved from 70.47% to 76.65% compared with the original test. The average accuracy improved from 66.08% to 77.47% in the 5%-noise case and from 59.89% to 77.59% in the 10%-noise case. Higher confidence was also reported in building the association rules when RENN was used. These results indicate that RENN is a good solution for removing noise and avoiding overfitting during the construction of the association rules classifier, especially in noisy domains.
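For readers unfamiliar with ENN and RENN, the following is a minimal sketch of the textbook versions of the two filters, assuming squared Euclidean distance and k = 3. ENN drops every instance whose label disagrees with the majority of its k nearest neighbours, and RENN repeats ENN until nothing more is removed. The paper's exact configuration is not stated in the abstract, so the parameter choices and the toy data here are assumptions.

```python
from collections import Counter


def _sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def enn(points, labels, k=3):
    """One ENN pass: return indices whose label matches their k-NN majority."""
    keep = []
    for i, p in enumerate(points):
        others = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: _sq_dist(p, points[j]),
        )
        if not others:  # a single instance has no neighbours to contradict it
            keep.append(i)
            continue
        votes = Counter(labels[j] for j in others[:k])
        if votes.most_common(1)[0][0] == labels[i]:
            keep.append(i)
    return keep


def renn(points, labels, k=3):
    """Repeat ENN until a pass removes no instance."""
    pts, lbs = list(points), list(labels)
    while True:
        keep = enn(pts, lbs, k)
        if len(keep) == len(pts):
            return pts, lbs
        pts = [pts[i] for i in keep]
        lbs = [lbs[i] for i in keep]


# Toy data: index 2 is a deliberately mislabelled (noisy) instance.
points = [(0.0,), (0.1,), (0.2,), (0.3,), (5.0,), (5.1,), (5.2,), (5.3,)]
labels = ["a", "a", "b", "a", "b", "b", "b", "b"]
clean_points, clean_labels = renn(points, labels)
print(clean_labels)
```

The cleaned instances would then be passed to the rule miner in place of the original training set.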
Typical association rules consider only items enumerated in transactions. Such rules are referred to as positive association rules. Negative association rules also consider the same items, but in addition consider negated items (i.e., items absent from transactions). Negative association rules are useful in market-basket analysis to identify products that conflict with each other or products that complement each other. They are also very convenient for associative classifiers, that is, classifiers that build their classification model based on association rules. However, mining for such rules necessitates the examination of an exponentially large search space. Despite their usefulness, very few algorithms to mine them have been proposed to date. In this paper, an algorithm based on the FP-tree is presented to discover negative association rules.
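One reason negative rules can be evaluated without scanning for absent items directly is the identity supp(X ∧ ¬Y) = supp(X) − supp(X ∪ Y), where ¬Y means "Y is not fully present". The sketch below illustrates that identity on invented data; it is not the FP-tree algorithm of the paper, and the function names are hypothetical.

```python
def support(itemset, txns):
    """Fraction of transactions that contain every item in `itemset`."""
    return sum(itemset <= t for t in txns) / len(txns)


def negative_rule(x, y, txns):
    """Support and confidence of X -> not Y, derived from positive supports."""
    supp_x = support(x, txns)
    supp_xy = support(x | y, txns)
    supp_neg = supp_x - supp_xy  # transactions with X but without all of Y
    conf_neg = supp_neg / supp_x if supp_x else 0.0
    return supp_neg, conf_neg


# Toy baskets, invented for illustration.
baskets = [
    {"tea", "sugar"},
    {"tea"},
    {"tea"},
    {"coffee", "sugar"},
    {"coffee"},
]
print(negative_rule({"tea"}, {"sugar"}, baskets))
```

Because only positive supports are needed, any frequent-itemset structure that stores those counts can be reused to score candidate negative rules.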
At present, most association rule algorithms are based on Boolean attributes and single-level association rule mining. However, real-world data has various types, and multi-level and quantitative attributes are receiving more and more attention. The most important step is mining the frequent sets. In this paper, we propose an algorithm called fuzzy multiple-level association (FMA) rules to mine frequent sets. It is based on an improved Eclat algorithm, unlike the algorithms proposed by many researchers, which use the Apriori algorithm. We analyze the frequent sets of quantitative data using fuzzy theory, dividing the concept hierarchy and softening the boundaries of attribute values and frequency. We describe the proposed method using vertical-format data and the improved Eclat algorithm, and use the algorithm to analyze Beijing logistics route data. Experiments show that the algorithm performs well, with better effectiveness and high efficiency.
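The vertical data layout mentioned above can be illustrated with a sketch of classic (crisp, single-level) Eclat: each item maps to the set of transaction ids containing it, and the support of a larger itemset is the size of the intersection of those id sets. This is the textbook baseline only; the fuzzy membership and concept-hierarchy extensions of FMA are not reproduced, and the data is invented.

```python
def eclat(txns, min_count):
    """Mine frequent itemsets by intersecting transaction-id sets."""
    # Vertical layout: item -> set of ids of transactions containing it.
    vertical = {}
    for tid, t in enumerate(txns):
        for item in t:
            vertical.setdefault(item, set()).add(tid)

    frequent = {}

    def extend(prefix, candidates):
        for i, (item, tids) in enumerate(candidates):
            if len(tids) < min_count:
                continue
            itemset = prefix + (item,)
            frequent[itemset] = len(tids)
            # Extend only with later items to avoid duplicate itemsets.
            suffix = [(o, tids & o_tids) for o, o_tids in candidates[i + 1:]]
            extend(itemset, suffix)

    extend((), sorted(vertical.items(), key=lambda kv: kv[0]))
    return frequent


routes = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
print(eclat(routes, 3))
```

Compared with Apriori, no repeated database scan is needed once the id sets are built, which is the usual motivation for starting from Eclat.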
We discuss basic intrusion detection techniques and focus on how to apply association rules to intrusion detection. Beginning with an analysis of the close relations among users' behaviors, we discuss the association rule mining algorithm and apply it to anomaly detection in an IDS. Moreover, according to the characteristics of intrusion detection, we optimize the association rule mining algorithm and use fuzzy logic to improve system performance.
With the growth of web-based documents, the need for automatic document clustering and text summarization has increased. Document summarization, that is, extracting the essential content with appropriate information, removing unnecessary data, and presenting the data in a cohesive and coherent manner, is a most challenging task. In this research, a novel intelligent model for document clustering is designed with a graph model and fuzzy-based association rule generation (gFAR). Initially, the graph model is used to map the relationships among the (multi-source) data, followed by document clustering with association rules generated using the fuzzy concept. This method helps eliminate redundancy by mapping the relevant documents using the graph model, and it reduces time consumption and improves accuracy by using fuzzy association rule generation. The framework provides document clustering in an interpretable way. It iteratively reduces the error rate during relationship mapping among the data (clusters) with the assistance of weighted document content. The model also represents the significance of data features with class discrimination, and it helps in measuring the significance of the features during the data clustering process. The simulation is done in the MATLAB 2016b environment and evaluated with empirical standards such as Relative Risk Patterns (RRP), ROUGE score, and Discrimination Information Measure (DMI). The DailyMail and DUC 2004 datasets are used to obtain the empirical results. The proposed gFAR model gives a better trade-off compared with various prevailing approaches.
Association rule mining is an important issue in data mining. This paper proposes a binary-system-based method to efficiently generate candidate frequent itemsets and their corresponding support counts, which needs only operations such as "and", "or" and "xor". Applying this idea to the existing distributed association rule mining algorithm FDM, the improved algorithm BFDM is proposed. Theoretical analysis and experiments show that BFDM is effective and efficient.
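A binary encoding of this kind can be sketched as follows: each transaction becomes an integer bitmask, a candidate itemset is formed by OR-ing item masks, and a transaction contains the candidate exactly when `(txn AND mask) == mask`. The abstract does not specify BFDM's actual encoding, so this is an assumption about the general idea rather than the paper's method; the names `encode` and `support_count` are hypothetical.

```python
def encode(txns, items):
    """Encode each transaction as an integer bitmask over a fixed item order."""
    index = {item: i for i, item in enumerate(items)}
    return [sum(1 << index[i] for i in t) for t in txns]


def support_count(mask, encoded):
    """Count transactions that contain every item whose bit is set in mask."""
    return sum((t & mask) == mask for t in encoded)


items = ["a", "b", "c"]
txns = [{"a", "b"}, {"a", "c"}, {"a", "b", "c"}, {"b"}]
encoded = encode(txns, items)

mask_a, mask_b = 1 << 0, 1 << 1
mask_ab = mask_a | mask_b  # candidate {a, b} built with a single OR
print(encoded, support_count(mask_ab, encoded))
```

The containment test costs one AND and one comparison per transaction regardless of itemset size, which is where the efficiency gain of such encodings typically comes from.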
A new algorithm for mining quantitative association rules with standard SQL is presented. The association rules are evaluated with the sufficiency factor LS of subjective Bayesian reasoning. The algorithm is shown to be quick and effective through its application to the Lujiang insects and pests database.
Objective: To analyze the rules of prescribing traditional Chinese medicine for treating pneumoconiosis, so as to provide a reference for the differential diagnosis and treatment of pneumoconiosis as well as for the development of new drugs for treating the disease. Methods: We searched China National Knowledge Infrastructure, Wanfang Database and VIP Chinese Publication Database to retrieve relevant literature, which was then screened according to the enrollment criteria to establish a database of traditional Chinese medicine prescriptions for the treatment of pneumoconiosis. The inheritance calculation platform of traditional Chinese medicine was used to analyze the prescribing rules of traditional Chinese medicine in the treatment of pneumoconiosis based on association rules, the k-means clustering algorithm and regression model analysis. Results: A total of 131 related publications were preliminarily selected, from which 97 prescriptions of traditional Chinese medicine with a total of 195 herbs were included. The most frequently prescribed herbs included Radix astragali, Platycodon grandiflorum, Pinellia ternata, licorice, Codonopsis pilosula, Salvia miltiorrhiza, bitter almond, etc. A total of 14 association rules and 13 high-frequency herb pairs were found, and 5 groups of formulas were revealed by cluster analysis. Conclusion: The prescriptions for the treatment of pneumoconiosis are mainly composed of herbs for tonifying deficiency, resolving phlegm, relieving cough and asthma, activating blood circulation and removing blood stasis, supplemented with herbs for clearing heat, relieving appearance, regulating qi, promoting water and permeating dampness, etc. The prescribing rules reflect the basic pathological characteristics of lung deficiency and collateral arthralgia in pneumoconiosis, which provides some ideas for the clinical differentiation and treatment of pneumoconiosis in traditional Chinese medicine. It also provides a reference for the research and development of new treatment methods.
文摘BACKGROUND It is increasingly common to find patients affected by a combination of type 2 diabetes mellitus(T2DM)and coronary artery disease(CAD),and studies are able to correlate their relationships with available biological and clinical evidence.The aim of the current study was to apply association rule mining(ARM)to discover whether there are consistent patterns of clinical features relevant to these diseases.ARM leverages clinical and laboratory data to the meaningful patterns for diabetic CAD by harnessing the power help of data-driven algorithms to optimise the decision-making in patient care.AIM To reinforce the evidence of the T2DM-CAD interplay and demonstrate the ability of ARM to provide new insights into multivariate pattern discovery.METHODS This cross-sectional study was conducted at the Department of Biochemistry in a specialized tertiary care centre in Delhi,involving a total of 300 consented subjects categorized into three groups:CAD with diabetes,CAD without diabetes,and healthy controls,with 100 subjects in each group.The participants were enrolled from the Cardiology IPD&OPD for the sample collection.The study employed ARM technique to extract the meaningful patterns and relationships from the clinical data with its original value.RESULTS The clinical dataset comprised 35 attributes from enrolled subjects.The analysis produced rules with a maximum branching factor of 4 and a rule length of 5,necessitating a 1%probability increase for enhancement.Prominent patterns emerged,highlighting strong links between health indicators and diabetes likelihood,particularly elevated HbA1C and random blood sugar levels.The ARM technique identified individuals with a random blood sugar level>175 and HbA1C>6.6 are likely in the“CAD-with-diabetes”group,offering valuable insights into health indicators and influencing factors on disease outcomes.CONCLUSION The application of this method holds promise for healthcare practitioners to offer valuable insights for 
enhancing patient treatment targeting specific subtypes of CAD with diabetes.Implying artificial intelligence techniques with medical data,we have shown the potential for personalized healthcare and the development of user-friendly applications aimed at improving cardiovascular health outcomes for this high-risk population to optimise the decision-making in patient care.
文摘Maximum frequent pattern generation from a large database of transactions and items for association rule mining is an important research topic in data mining. Association rule mining aims to discover interesting correlations, frequent patterns, associations, or causal structures between items hidden in a large database. By exploiting quantum computing, we propose an efficient quantum search algorithm design to discover the maximum frequent patterns. We modified Grover’s search algorithm so that a subspace of arbitrary symmetric states is used instead of the whole search space. We presented a novel quantum oracle design that employs a quantum counter to count the maximum frequent items and a quantum comparator to check with a minimum support threshold. The proposed derived algorithm increases the rate of the correct solutions since the search is only in a subspace. Furthermore, our algorithm significantly scales and optimizes the required number of qubits in design, which directly reflected positively on the performance. Our proposed design can accommodate more transactions and items and still have a good performance with a small number of qubits.
文摘The market trends rapidly changed over the last two decades.The primary reason is the newly created opportunities and the increased number of competitors competing to grasp market share using business analysis techniques.Market Basket Analysis has a tangible effect in facilitating current change in the market.Market Basket Analysis is one of the famous fields that deal with Big Data and Data Mining applications.MBA initially uses Association Rule Learning(ARL)as a mean for realization.ARL has a beneficial effect in providing a plenty benefit in analyzing the market data and understanding customers’behavior.An important motive of using such techniques is maximizing the business profit as well as matching the exact customer needs as closely as possible.In this survey paper,we discussed several applications and methods of MBA based on ARL.Also,we reviewed some association rule learning measurements including trust,lift,leverage,and others.Furthermore,we discuss some open issues and future topics in the area of market basket analysis and association rule learning.
Abstract: To discover the main causes of elevator group accidents in an edge computing environment, a multi-dimensional data model of elevator accident data is established using data cube technology. A method that combines the classical Apriori algorithm with this model is proposed and implemented to mine frequent items from elevator accident data and to explore the main reasons for elevator accidents. In addition, a collaborative edge model of elevator accidents is set up to achieve data sharing, making it possible to check the details of each cause and confirm the causes of elevator accidents. Lastly, association rules are applied to find the patterns underlying elevator accidents.
Abstract: This paper aims to develop an algorithm for extracting association rules, called the Context-Based Association Rule Mining algorithm (CARM), which can be regarded as an extension of the Context-Based Positive and Negative Association Rule Mining algorithm (CBPNARM). CBPNARM was developed to extract positive and negative association rules from spatiotemporal (space-time) data only, while the proposed algorithm can be applied to both spatial and non-spatial data. The proposed algorithm is applied to an energy dataset to classify a country's energy development by uncovering the interesting interdependencies between the variables and obtaining positive and negative associations. The proposed algorithm extracts many association rules related to sustainable energy development, which need to be pruned by some pruning technique. The context, in this paper, serves as a pruning measure to extract pertinent association rules from non-spatial data. The Conditional Probability Increment Ratio (CPIR), which was not used in CBPNARM, is also added to the proposed algorithm. The inclusion of the context variable and CPIR resulted in fewer rules and improved robustness and ease of use. Also, the extraction of a common negative frequent itemset in CARM differs from that in CBPNARM. The rules created by the proposed algorithm are more meaningful, significant, relevant and insightful. The accuracy of the proposed algorithm is compared with that of the Apriori, PNARM and CBPNARM algorithms. The results demonstrate enhanced accuracy, relevance and timeliness.
Funding: Supported by the National Natural Science Foundation of China (No. J07240003, No. 60773084, No. 60603023) and the National Research Fund for the Doctoral Program of Higher Education of China (No. 20070151009)
Abstract: The typical model, which involves the measures support, confidence, and interest, is often adopted for mining association rules. In this model, the related parameters are usually chosen by experience; consequently, the number of useful rules is hard to estimate. If the number is too large, we cannot effectively extract the meaningful rules. This paper analyzes the meanings of the parameters and uses regression to design a variety of equations relating the number of rules to the parameters. Finally, we experimentally obtain a preferable regression equation. This paper uses multiple correlation coefficients to test the fitting effects of the equations and uses a significance test to verify whether the coefficients of the parameters differ significantly from zero. The regression equation with the larger multiple correlation coefficient is chosen as the optimally fitted equation. With the selected optimal equation, we can predict the number of rules under the given parameters, further optimize the choice of the three parameters, and determine their ranges of values.
Funding: The National Natural Science Foundation of China (No. 49678049)
Abstract: Data mining, i.e., mining knowledge from large amounts of data, is a demanding field since huge amounts of data have been collected in various applications. The collected data far exceed people's ability to analyze them. Thus, new and efficient methods are needed to discover knowledge from large databases. Association rule discovery is an important problem in knowledge discovery and data mining. The association mining task consists of identifying the frequent item sets and then forming conditional implication rules among them. In this paper, we describe and summarize recent work on association rule discovery, offer a new method for association rule mining, and point out that association rule discovery can be applied in spatial data mining. It is useful for discovering knowledge from remote sensing and geographical information systems.
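The two-step task described above (find frequent item sets, then form implication rules among them) can be sketched as follows. This is a minimal Apriori-style illustration under assumed toy data, not the new method the paper offers; the names `apriori` and `derive_rules` are hypothetical.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Step 1: return {frequent itemset: support} by level-wise search."""
    n = len(transactions)
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while candidates:
        # Keep the candidates of size k that meet the support threshold.
        level = {}
        for c in candidates:
            supp = sum(1 for t in transactions if c <= t) / n
            if supp >= min_support:
                level[c] = supp
        frequent.update(level)
        k += 1
        # Join frequent (k-1)-itemsets, then prune any candidate that has
        # an infrequent (k-1)-subset (downward closure property).
        joined = {a | b for a in level for b in level if len(a | b) == k}
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


def derive_rules(frequent, min_confidence):
    """Step 2: form rules antecedent -> consequent from the frequent itemsets."""
    rules = []
    for itemset, supp in frequent.items():
        for r in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), r):
                antecedent = frozenset(antecedent)
                confidence = supp / frequent[antecedent]
                if confidence >= min_confidence:
                    rules.append((antecedent, itemset - antecedent, confidence))
    return rules


# Hypothetical toy transactions, for illustration only.
transactions = [{"A", "B", "C"}, {"A", "B"}, {"A", "C"}, {"B"}]
frequent = apriori(transactions, min_support=0.5)
strong_rules = derive_rules(frequent, min_confidence=0.9)
```

With these thresholds the only rule that survives is C -> A, since every transaction containing C also contains A.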
Abstract: As data mining is more and more widely applied in computer systems, quality assurance testing of data mining software is receiving more and more attention. However, because of the 'oracle' problem, traditional test methods do not easily fit application programs in the field of data mining. In this paper, a software testing method based on metamorphic testing is proposed for the field of data mining. An association rules algorithm is taken as the specific case, and metamorphic relations are constructed on the algorithm. Experiments show that the method achieves the testing target and can feasibly be applied to other domains.
Funding: This study was supported by the Local Standard Construction Project of Jilin Provincial Market Supervision and Administration Department (DBXM097-2020); the Standardization Construction Project of Jilin Provincial Administration of Traditional Chinese Medicine (zybz-zc-2020-004) of China; and Jilin Operation specification of acupuncture on upper limb motor dysfunction after ischemic stroke (zybz-2021-005).
Abstract: Constipation is a common complication of stroke, and its incidence is increasing year by year, which makes it worthy of attention. Massage has been recognized by doctors at home and abroad as an effective treatment for post-ischemic stroke constipation. However, in the known research reports, massage prescriptions are complicated; a simple and effective massage prescription is therefore urgently needed to guide and promote clinical practice. In this study, we used association rule and entropy clustering analysis methods to mine the clinical literature on post-ischemic stroke constipation in 7 databases, and combined data analysis, traditional Chinese massage theory and clinical practice to summarize a new core prescription. The new core massage prescription for treating post-ischemic stroke constipation takes tonifying the spleen, nourishing Qi and generating Body Fluid, promoting Qi, invigorating blood circulation and eliminating phlegm as its principles of treatment. This accords with the pathogenesis of the disease, can better guide clinical practice, and facilitates the popularization and application of massage therapy.
Abstract: This research aims to use more new knowledge to develop a knowledge system with dynamic accordance. Using the fuzzy language field and the fuzzy language values structure as the description framework, a generalized cellular automaton that can jointly process fuzzy indeterminacy and random indeterminacy, together with a generalized inductive logic causal model, is put forward. On this basis, a new method for discovering causal association rules is provided. According to the causal information of the standard sample space and the common sample space, a state (abnormality) relation matrix is constructed, and causal association rules are obtained by using an inductive reasoning mechanism. An estimate of the algorithm's complexity is given, and its validity is demonstrated through a case.
Abstract: The rapid growth in the use of social media opens up new challenges and opportunities for analyzing various aspects and patterns of communication. In text mining, several techniques are available, such as information clustering, extraction, summarization, and classification. In this study, a text mining framework is presented that consists of 4 phases: retrieving, processing, indexing, and mining association rules. It is applied using the association rule mining technique to find the terms associated with the Huawei P30 Pro phone. Customer reviews are extracted from many websites and Facebook groups, such as review.cnet.com, CNET, Facebook and amazon.com technology, where customers from all over the world posted their notes on cell phones. In this analysis, a total of 192 reviews of the Huawei P30 Pro were collected and evaluated with text mining techniques. The findings indicate that the Huawei P30 Pro has strong points such as the best safety, a high-quality camera, a battery that lasts more than 24 hours, and a very fast processor. This paper aims to show that text mining reduces human effort by recognizing significant documents. This helps customers make better-informed product choices and, at the same time, lets sales managers learn how their products were received by customers.
Funding: The authors extend their appreciation to the Deanship of Scientific Research at Imam Mohammad Ibn Saud Islamic University for funding this work through Research Group No. RG-21-51-01.
Abstract: The exchange of information is an innate and natural process that assists in content dispersal. Social networking sites have emerged to enrich their users by providing facilities for sharing information and social interaction. The extensive adoption of social networking sites has also resulted in user content generation. Researchers have explored diverse research areas to investigate the influence of social media on users and have confirmed that social media sites have a significant impact on markets, politics and social life. Facebook is an extensively used platform for sharing information, thoughts and opinions through posts and comments. The identification of influential users on the social web has become a hot research field because of its many applications in diverse areas, for instance political campaigns, marketing, e-commerce, and commerce. Prior research studies use either linguistic content or a graph-based representation of the social network to detect influential users. In this article, we incorporate association rule mining algorithms to identify the top influential users through frequent patterns. The association rules have been computed using standard evaluation measures such as support, confidence, lift, and conviction. To verify the results, we also use conventional metrics, for example accuracy, precision, recall and F1-measure, from the association rules perspective. Detailed experiments are carried out using the benchmark College-Msg dataset extracted from Facebook. The obtained results validate the quality and visibility of the proposed approach. The outcome of the proposed model verifies that association rule mining is able to generate rules that identify temporally influential users on Facebook who are consistently active. The resulting rule set helps to create knowledge-based systems, which are efficient and widely used for decision making to solve real-world problems.
Funding: The APC was funded by the Deanship of Scientific Research, Saudi Electronic University.
Abstract: Association rule learning is a machine learning method used to find underlying associations in large datasets. Whether intentionally or unintentionally present, noise in training instances causes overfitting while building the classifier and negatively impacts classification accuracy. This paper applies instance reduction techniques to the datasets before mining the association rules and building the classifier. Instance reduction techniques were originally developed to reduce memory requirements in instance-based learning. This paper utilizes them to remove noise from the dataset before training the association rules classifier. Extensive experiments were conducted to assess the accuracy of association rules with different instance reduction techniques, namely Decremental Reduction Optimization Procedure (DROP) 3, DROP5, ALL K-Nearest Neighbors (ALLKNN), Edited Nearest Neighbor (ENN), and Repeated Edited Nearest Neighbor (RENN), at different noise ratios. Experiments show that instance reduction techniques substantially improved the average classification accuracy at three different noise levels: 0%, 5%, and 10%. The RENN algorithm achieved the highest levels of accuracy, with a significant improvement on seven of the eight datasets used from the University of California Irvine (UCI) machine learning repository. The improvements were more apparent in the 5% and 10% noise cases. When RENN was applied, the average classification accuracy for the eight datasets in the zero-noise test rose from 70.47% to 76.65% compared to the original test. The average accuracy improved from 66.08% to 77.47% in the 5%-noise case and from 59.89% to 77.59% in the 10%-noise case. Higher confidence was also reported in building the association rules when RENN was used. These results indicate that RENN is a good solution for removing noise and avoiding overfitting during the construction of the association rules classifier, especially in noisy domains.
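The noise-filtering step described above can be sketched with the usual definitions of ENN (drop an instance whose label disagrees with the majority of its k nearest neighbors) and RENN (repeat ENN until nothing more is removed). This is a rough illustration that assumes Euclidean distance and k = 3; it is not the paper's implementation, and the function names and data are hypothetical.

```python
import math
from collections import Counter


def enn(points, labels, k=3):
    """Edited Nearest Neighbor: return the indices of the instances to keep."""
    keep = []
    for i, p in enumerate(points):
        others = [j for j in range(len(points)) if j != i]
        neighbours = sorted(others, key=lambda j: math.dist(p, points[j]))[:k]
        if not neighbours:
            keep.append(i)  # a lone instance has no neighbours to disagree with
            continue
        majority = Counter(labels[j] for j in neighbours).most_common(1)[0][0]
        if majority == labels[i]:
            keep.append(i)
    return keep


def renn(points, labels, k=3):
    """Repeated ENN: apply ENN until a pass removes no instance."""
    idx = list(range(len(points)))
    while True:
        kept = enn([points[i] for i in idx], [labels[i] for i in idx], k)
        if len(kept) == len(idx):
            return idx
        idx = [idx[i] for i in kept]


# Hypothetical data: two clean clusters plus one mislabelled point at (0.5, 0.5).
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10), (11, 11),
          (0.5, 0.5)]
labels = [0, 0, 0, 0, 1, 1, 1, 1, 1]
clean_indices = renn(points, labels, k=3)
```

The mislabelled point is filtered out while both clusters are kept; the cleaned instances would then be passed on to rule mining and classifier construction.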
Funding: Supported by the National Natural Science Foundation of China (70371015) and the Science Foundation of Jiangsu University (04KJD001)
Abstract: Typical association rules consider only items enumerated in transactions. Such rules are referred to as positive association rules. Negative association rules also consider the same items, but in addition consider negated items (i.e., items absent from transactions). Negative association rules are useful in market-basket analysis to identify products that conflict with each other or products that complement each other. They are also very convenient for associative classifiers, which build their classification model based on association rules. However, mining for such rules necessitates the examination of an exponentially large search space. Despite their usefulness, very few algorithms to mine them have been proposed to date. In this paper, an algorithm based on the FP-tree is presented to discover negative association rules.
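To make the notion of a negated item concrete, the sketch below evaluates a rule of the form A -> not-b by direct counting. It is a brute-force illustration only: it does not use an FP-tree and is not the paper's algorithm, and the function names and transactions are hypothetical.

```python
def support(transactions, present, absent_item=None):
    """Fraction of transactions containing all of `present` and, if given,
    lacking `absent_item`."""
    present = frozenset(present)
    hits = sum(
        1 for t in transactions
        if present <= t and (absent_item is None or absent_item not in t)
    )
    return hits / len(transactions)


def negative_rule_confidence(transactions, antecedent, absent_item):
    """Confidence of antecedent -> NOT absent_item."""
    s_a = support(transactions, antecedent)
    return support(transactions, antecedent, absent_item) / s_a if s_a else 0.0


# Hypothetical toy transactions, for illustration only.
transactions = [
    frozenset({"tea", "coffee"}),
    frozenset({"tea"}),
    frozenset({"tea"}),
    frozenset({"coffee"}),
    frozenset({"coffee", "milk"}),
]
conf_tea_not_coffee = negative_rule_confidence(transactions, {"tea"}, "coffee")
```

For a single negated item, supp(A and not-b) = supp(A) - supp(A with b), so the confidence of A -> not-b equals 1 minus the confidence of A -> b. A miner can therefore derive negative supports from positive counts without scanning for absences separately.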
Funding: Supported by the Fundamental Research Funds for the Central Universities under Grants No. ZYGX2014J051 and No. ZYGX2014J066; Science and Technology Projects in Sichuan Province under Grants No. 2015JY0178, No. 2016FZ0002, No. 2014GZ0109, No. 2015KZ002 and No. 2015JY0030; and the China Postdoctoral Science Foundation under Grant No. 2015M572464
Abstract: At present, most association rule algorithms are based on Boolean attributes and single-level association rule mining. However, real-world data come in various types, and multi-level and quantitative attributes are receiving more and more attention. The most important step is mining frequent sets. In this paper, we propose an algorithm called fuzzy multiple-level association (FMA) rules to mine frequent sets. It is based on an improved Eclat algorithm, unlike many previously proposed algorithms that use the Apriori algorithm. We analyze the frequent sets of quantitative data by using fuzzy theory, dividing the concept hierarchy and softening the boundaries of attribute values and frequencies. We use vertical-format data and the improved Eclat algorithm to describe the proposed method, and we use the algorithm to analyze Beijing logistics route data. Experiments show that the algorithm performs well, with better effectiveness and high efficiency.
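The vertical data layout that Eclat relies on can be sketched in a few lines: each item maps to the set of transaction ids that contain it, and the support of a larger itemset is the size of an intersection of these sets. This is plain Eclat on crisp (non-fuzzy), single-level data, shown only to illustrate the idea. It is not the FMA algorithm or the paper's improved Eclat, and the names and data are hypothetical.

```python
def eclat(transactions, min_count):
    """Return {itemset tuple: support count} using tid-set intersections."""
    # Build the vertical layout: item -> set of transaction ids.
    tidsets = {}
    for tid, transaction in enumerate(transactions):
        for item in transaction:
            tidsets.setdefault(item, set()).add(tid)

    frequent = {}

    def extend(prefix, candidates):
        # Depth-first search: grow `prefix` by each remaining item in turn.
        for i, (item, tids) in enumerate(candidates):
            if len(tids) < min_count:
                continue
            itemset = prefix + (item,)
            frequent[itemset] = len(tids)
            # Intersect tid-sets to get supports of the longer itemsets.
            suffix = [(other, tids & other_tids)
                      for other, other_tids in candidates[i + 1:]]
            extend(itemset, suffix)

    extend((), sorted(tidsets.items(), key=lambda kv: kv[0]))
    return frequent


# Hypothetical toy transactions, for illustration only.
transactions = [{"A", "B", "C"}, {"A", "B"}, {"A", "C"}, {"B"}]
frequent_itemsets = eclat(transactions, min_count=2)
```

Because supports come from set intersections, the database is scanned only once to build the tid-sets, which is the main practical difference from Apriori's repeated scans.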
Abstract: We discuss the basic intrusion detection techniques and focus on how to apply association rules to intrusion detection. Beginning with an analysis of some close relations between users' behaviors, we discuss the mining algorithm for association rules and apply it to anomaly detection in an intrusion detection system (IDS). Moreover, according to the characteristics of intrusion detection, we optimize the mining algorithm and use fuzzy logic to improve system performance.
Abstract: With the growth of web-based documents, the need for automatic document clustering and text summarization has increased. Document summarization, which means extracting the essential content with appropriate information, removing unnecessary data and presenting the data in a cohesive and coherent manner, is a highly challenging task. In this research, a novel intelligent model for document clustering is designed with a graph model and fuzzy-based association rule generation (gFAR). Initially, the graph model is used to map the relationships among the (multi-source) data; document clustering is then established by generating association rules using the fuzzy concept. This method eliminates redundancy by mapping the relevant documents with the graph model, and it reduces time consumption and improves accuracy through fuzzy association rule generation. The framework performs document clustering in an interpretable way. It iteratively reduces the error rate during relationship mapping among the data (clusters) with the assistance of weighted document content. The model also represents the significance of data features for class discrimination and helps measure the significance of the features during the data clustering process. The simulation is done in the MATLAB 2016b environment and evaluated with empirical standards such as Relative Risk Patterns (RRP), ROUGE score, and Discrimination Information Measure (DMI). The DailyMail and DUC 2004 datasets are used to obtain the empirical results. The proposed gFAR model gives a better trade-off than various prevailing approaches.
Funding: Supported by the National Natural Science Foundation of China (70371015)
Abstract: Association rule mining is an important issue in data mining. The paper proposes a binary-system-based method to generate candidate frequent itemsets and their corresponding support counts efficiently, which needs only operations such as "and", "or" and "xor". Applying this idea to the existing distributed association rule mining algorithm FDM, the improved algorithm BFDM is proposed. Theoretical analysis and experiments show that BFDM is effective and efficient.
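One plausible reading of a binary encoding for support counting is sketched below: each transaction and each candidate itemset becomes an integer bitmask, containment is tested with a bitwise "and", and candidates are joined with a bitwise "or". The encoding, the item index and the function names are assumptions made for illustration; the abstract does not specify how BFDM actually represents itemsets.

```python
def encode(itemset, index):
    """Encode an itemset as an integer with one bit per item."""
    mask = 0
    for item in itemset:
        mask |= 1 << index[item]
    return mask


def support_count(transaction_masks, candidate_mask):
    """Count the transactions whose bitmask contains every candidate bit."""
    return sum(
        1 for t in transaction_masks
        if (t & candidate_mask) == candidate_mask
    )


# Hypothetical item index and transactions, for illustration only.
index = {"A": 0, "B": 1, "C": 2}
transaction_masks = [
    encode(t, index)
    for t in [{"A", "B", "C"}, {"A", "B"}, {"A", "C"}, {"B"}]
]

# Joining two candidates is a bitwise "or" of their masks.
candidate_ab = encode({"A"}, index) | encode({"B"}, index)
count_ab = support_count(transaction_masks, candidate_ab)
```

With this representation a containment test costs one machine-level operation per word instead of a set comparison, which is where such a scheme could save time.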
Abstract: A new algorithm for mining quantitative association rules with standard SQL is presented. The association rules are evaluated with the sufficiency factor LS of subjective Bayesian reasoning. The algorithm is shown to be fast and effective through its application to the Lujiang insects and pests database.
Funding: General Project of the National Natural Science Foundation of China (No. 81573970); Beijing Municipal Natural Science Foundation (No. 7202118).
Abstract: Objective: To analyze the prescribing rules of traditional Chinese medicine for treating pneumoconiosis, so as to provide a reference for the differential diagnosis and treatment of pneumoconiosis as well as for the development of new drugs for treating the disease. Methods: We searched China National Knowledge Infrastructure, Wanfang Database and VIP Chinese Publication Database to retrieve the relevant literature, which was then screened according to the enrollment criteria to establish a prescription database of traditional Chinese medicine for the treatment of pneumoconiosis. The inheritance calculation platform of traditional Chinese medicine was used to analyze the prescribing rules of traditional Chinese medicine in the treatment of pneumoconiosis based on association rules, the k-means clustering algorithm and regression model analysis. Results: A total of 131 related articles were preliminarily selected, from which 97 prescriptions of traditional Chinese medicine with a total of 195 herbs were included. The most frequently prescribed herbs included Radix astragali, Platycodon grandiflorum, Pinellia ternata, licorice, Codonopsis pilosula, Salvia miltiorrhiza, bitter almond, etc. A total of 14 association rules and 13 high-frequency herb pairs were found, and 5 groups of formulas were revealed by cluster analysis. Conclusion: The prescriptions for the treatment of pneumoconiosis are mainly composed of herbs for tonifying deficiency, resolving phlegm, relieving cough and asthma, activating blood circulation and removing blood stasis, supplemented with herbs for clearing heat, relieving appearance, regulating qi, promoting water and permeating dampness, etc. The prescribing rules reflect the basic pathological characteristics of lung deficiency and collateral arthralgia in pneumoconiosis, which provides some ideas for the clinical differentiation and treatment of pneumoconiosis in traditional Chinese medicine. It also provides a reference for the research and development of new treatment methods.