BACKGROUND Patients are increasingly found to be affected by both type 2 diabetes mellitus (T2DM) and coronary artery disease (CAD), and studies have correlated the two conditions using the available biological and clinical evidence. The aim of the current study was to apply association rule mining (ARM) to discover whether there are consistent patterns of clinical features relevant to these diseases. ARM uses data-driven algorithms to extract meaningful patterns for diabetic CAD from clinical and laboratory data, with the goal of optimising decision-making in patient care. AIM To reinforce the evidence of the T2DM-CAD interplay and demonstrate the ability of ARM to provide new insights into multivariate pattern discovery. METHODS This cross-sectional study was conducted at the Department of Biochemistry in a specialized tertiary care centre in Delhi and involved 300 consenting subjects categorized into three groups of 100 subjects each: CAD with diabetes, CAD without diabetes, and healthy controls. Participants were enrolled from the Cardiology IPD and OPD for sample collection. The study employed ARM to extract meaningful patterns and relationships from the clinical data, using the original values. RESULTS The clinical dataset comprised 35 attributes from the enrolled subjects. The analysis produced rules with a maximum branching factor of 4 and a rule length of 5, requiring a 1% probability increase for enhancement. Prominent patterns emerged, highlighting strong links between health indicators and the likelihood of diabetes, particularly elevated HbA1C and random blood sugar levels. ARM identified that individuals with a random blood sugar level > 175 and HbA1C > 6.6 are likely to be in the “CAD-with-diabetes” group, offering insight into health indicators and the factors influencing disease outcomes. CONCLUSION This method holds promise for giving healthcare practitioners insights for improving treatment targeted at specific subtypes of CAD with diabetes. By applying artificial intelligence techniques to medical data, we have shown the potential for personalized healthcare and for user-friendly applications aimed at improving cardiovascular health outcomes and optimising decision-making in patient care for this high-risk population.
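To make the reported rule concrete, the following is a minimal sketch of how support, confidence and lift could be computed for a rule of the form "random blood sugar > 175 and HbA1C > 6.6 implies CAD-with-diabetes". The `Subject` class, the `rule_metrics` helper and the six records are hypothetical and synthetic; they are not the study's data or its software, and the thresholds are simply the ones quoted in the abstract.

```python
from dataclasses import dataclass


@dataclass
class Subject:
    rbs: float    # random blood sugar (units not stated in the abstract)
    hba1c: float  # HbA1C value
    group: str    # study group label


def rule_metrics(subjects, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent => consequent."""
    n = len(subjects)
    n_a = sum(1 for s in subjects if antecedent(s))
    n_c = sum(1 for s in subjects if consequent(s))
    n_ac = sum(1 for s in subjects if antecedent(s) and consequent(s))
    support = n_ac / n
    confidence = n_ac / n_a if n_a else 0.0
    lift = confidence / (n_c / n) if n_c else 0.0
    return support, confidence, lift


# Synthetic records for illustration only.
subjects = [
    Subject(210, 8.1, "CAD-with-diabetes"),
    Subject(190, 7.2, "CAD-with-diabetes"),
    Subject(180, 6.9, "CAD-without-diabetes"),
    Subject(120, 5.6, "CAD-without-diabetes"),
    Subject(160, 7.0, "CAD-with-diabetes"),
    Subject(95, 5.2, "healthy"),
]

support, confidence, lift = rule_metrics(
    subjects,
    antecedent=lambda s: s.rbs > 175 and s.hba1c > 6.6,
    consequent=lambda s: s.group == "CAD-with-diabetes",
)
print(f"support={support:.3f} confidence={confidence:.3f} lift={lift:.3f}")
```

A lift above 1 indicates that the antecedent occurs together with the consequent more often than would be expected if the two were independent.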
Association rule learning (ARL) is a widely used technique for discovering relationships within datasets. However, it often generates an excessive number of irrelevant or ambiguous rules. Post-processing is therefore crucial, not only for removing irrelevant or redundant rules but also for uncovering hidden associations that affect other factors. Several post-processing methods have recently been proposed, each with its own strengths and weaknesses. In this paper, we propose THAPE (Tunable Hybrid Associative Predictive Engine), which combines descriptive and predictive techniques. By leveraging both, we aim to improve the quality of the analysis of generated rules. This includes removing irrelevant or redundant rules, uncovering interesting and useful rules, exploring hidden association rules that may affect other factors, and providing backtracking ability for a given product. The proposed approach offers a method that retailers can tailor to their specific goals, enabling them to better understand customer behavior based on factual transactions in the target market. We applied THAPE to a real dataset as a case study to demonstrate its effectiveness and mined a concise set of highly interesting and useful association rules. Out of the 11,265 rules generated, we identified 125 rules that are particularly relevant to the business context. These rules significantly improve the interpretability and usefulness of association rules for decision-making.
This study explores the factors influencing metro passengers’ arrival volume in Wuhan, China, and Lagos, Nigeria, by examining weather, time of day, waiting time, travel behavior, arrival patterns, and metro satisfaction. It addresses a significant research gap in understanding metro passengers’ dynamics across cultural and geographical contexts. It employs questionnaires, field observations, and advanced data analysis techniques such as association rule mining and neural network modeling. Key findings include a correlation between rainy weather, shorter waiting times, and higher arrival volumes. Neural network models showed high predictive accuracy, with waiting time, metro satisfaction, and weather being significant factors for the Lagos Light Rail Blue Line Metro. In contrast, arrival patterns, weather, and time of day were more influential for Wuhan Metro Line 5. The results suggest that improving metro satisfaction and reducing waiting times could increase arrival volumes in the Lagos Metro, while adjusting schedules for weather and peak times could optimize flow in the Wuhan Metro. These insights are valuable for transportation planning, passenger arrival volume management, and enhancing user experiences, potentially benefiting urban transportation sustainability and development goals.
Association rules and C4.5 rules can overcome the shortcomings of traditional land evaluation methods and improve the intelligibility and efficiency of land evaluation knowledge. To compare these two kinds of classification rules in application, two fuzzy classifiers were built by combining each rule set with a fuzzy decision algorithm, based on the Second General Soil Survey of Guangdong Province. The experimental results showed that the fuzzy classifier based on association rules obtains a higher accuracy rate, but with a more complex calculation process and more computational overhead; the fuzzy classifier based on C4.5 rules obtains a slightly lower accuracy, but with faster and simpler computation.
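With the digital and intelligent development of power systems, predicting heavy overload of distribution transformers has become one of the key technologies for intelligent condition-based maintenance. In real-world scenarios, the spatiotemporal factors of distribution transformer overload usually follow a skewed distribution. Some of these are high risk and rare (HRR) factors which, once they occur, cause irreversible damage to the transformer. This paper therefore proposes a distribution transformer overload prediction method based on an improved association rules-criticality importance (IAR-CI) model. First, taking internal and external factors into account, multiple data sources are collected and a database of distribution transformer operating states is built, and ICA is used to identify the rare high-risk periods and HRR factors that are strongly associated with heavy overload. Second, based on the criticality importance (CI) measure, a factor weighting method is designed to accurately quantify the risk weight of each factor. Finally, the TBFP-Growth algorithm is applied to improve the running efficiency of the model. A case study is simulated with power grid data from a region in southern China. The results show that the method improves the prediction performance for heavy overload of distribution transformers, supports the rational coordination and planning of subsequent inspection and testing strategies, and can reduce the operation and maintenance costs of power equipment while improving the reliability of power supply.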
Recent advances in science and technology, coupled with the proliferation of data, have pushed laboratory medicine to integrate with artificial intelligence (AI) and machine learning (ML). In current evidence-based medicine, analysing disease patterns in laboratory tests through association rule mining (ARM) has emerged as a modern tool for risk assessment and disease stratification, with the potential to reduce cardiovascular disease (CVD) mortality. CVDs are well recognised as the leading global cause of mortality, with higher fatality rates in the Indian population due to associated factors such as hypertension, diabetes, and lifestyle choices. AI-driven algorithms have offered deep insights in this field while addressing challenges such as physician shortages in healthcare systems. Personalized medicine, driven by big data, requires the integration of ML techniques and high-quality electronic health records to produce meaningful outcomes. These technological advancements enhance computational analyses for both research and clinical practice. ARM plays a pivotal role by uncovering meaningful relationships within databases, aiding patient survival prediction and risk factor identification. The potential of AI in laboratory medicine is vast, but it must be integrated cautiously, with attention to ethical, legal, and privacy concerns. An AI ethics framework is therefore essential to guide its responsible use. Aligning AI algorithms with existing laboratory practices, promoting education among healthcare professionals, and fostering careful integration into clinical settings are imperative for harnessing the benefits of this transformative technology.
The advent of the big data era has provided many types of transportation datasets, such as metro smart card data, for studying residents’ mobility and understanding how that mobility has been shaped by and is shaping urban space. In this paper, we use metro smart card data from two Chinese metropolises, Shanghai and Shenzhen. Five metro mobility indicators are introduced, and association rules are established to explore the mobility patterns. The proportion of people entering and exiting each station is used to measure the jobs-housing balance. It is found that the average travel distance and duration of Shanghai passengers are higher than those of Shenzhen passengers, and that the proportion of metro commuters in Shanghai is higher than in Shenzhen. The jobs-housing spatial relationship based on metro travel is more balanced in Shenzhen than in Shanghai. The fundamental reason for the differences between the two cities is the difference in urban morphology. Compared with the monocentric structure of Shanghai, the polycentric structure of Shenzhen results in more scattered travel hotspots and more diverse travel routes, which helps Shenzhen achieve a better jobs-housing balance. This paper fills a gap in comparative research among Chinese cities based on transportation big data analysis. The results provide support for planning metro routes, adjusting urban structure and land use to form a more reasonable metro network, and balancing the jobs-housing spatial relationship.
[Objectives] This study analyzed the medication rules of clinical prescriptions of traditional Chinese medicine (TCM) decoction pieces for the treatment of novel coronavirus pneumonia (COVID-19) during the epidemic in multiple regions using data mining technology, so as to provide a reference for the treatment of COVID-19 with TCM. [Methods] The following were collected as the data of this study: the TCM prescriptions used in Hubei Province during the fight against the epidemic from February 25, 2020 to February 14, 2022; the TCM prescriptions used by the Guizhou TCM expert team aiding Hubei Province; the TCM prescriptions for rehabilitation and conditioning of patients in Ezhou, Hubei Province, after discharge; the TCM prescriptions for the prevention and treatment of COVID-19 in Guizhou Province; and the TCM prescriptions for the treatment of COVID-19 retrieved from the CNKI Chinese database from the end of 2019 to the present. Excel was used to establish a database, which was entered into the TCM inheritance calculation platform V3.5. Association rules and the k-means clustering algorithm were used to analyze the frequency of herbal medicines in the prescriptions, the frequency of the four natures, five flavors and meridian distribution, and the drug combinations. [Results] A total of 1859 COVID-19 patients treated with TCM were included; the proportion of males was higher than that of females, and middle-aged and elderly people were the most common group. A total of 2170 TCM prescriptions were included, involving 383 traditional Chinese medicines. High-frequency medicines included Poria, Radix Bupleuri, Radix Scutellariae, Herba Pogostemonis, Fructus Forsythiae, Flos Lonicerae, etc. The four natures were mainly concentrated in cold, warm and neutral, and the five flavors were mainly concentrated in bitter, pungent and sweet. The herbal medicines were mainly attributed to the lung and stomach meridians, and were mainly of the heat-clearing, exterior syndrome-relieving, and diuresis-promoting and damp-clearing types. A total of 24 high-frequency herbal combinations and 35 association rules were mined, and 3 types of formulas were obtained by cluster analysis. [Conclusions] The analysis results and the medicine combinations obtained in the formulas are consistent with the TCM treatment theory that COVID-19 is caused by wind-heat filth accompanied by damp and toxin.
Typical association rules consider only items enumerated in transactions. Such rules are referred to as positive association rules. Negative association rules consider the same items but also negated items (i.e., items absent from transactions). Negative association rules are useful in market-basket analysis to identify products that conflict with each other or products that complement each other. They are also very convenient for associative classifiers, that is, classifiers that build their classification model from association rules. However, mining such rules requires examining an exponentially large search space, and despite their usefulness very few algorithms for mining them have been proposed to date. In this paper, an algorithm based on the FP-tree is presented to discover negative association rules.
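As a small illustration of what a negative rule measures, the sketch below computes the support and confidence of X ⇒ ¬Y from positive supports only, using the identity supp(X ∪ ¬Y) = supp(X) − supp(X ∪ Y). It is a brute-force illustration on made-up transactions, not the FP-tree algorithm the abstract describes; the function names are hypothetical, and ¬Y is read here as "Y is not fully contained in the transaction".

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = frozenset(items)
    return sum(1 for t in transactions if items <= t) / len(transactions)


def negative_rule(transactions, x, y):
    """Support and confidence of the negative rule X => not Y."""
    supp_x = support(transactions, x)
    supp_xy = support(transactions, set(x) | set(y))
    supp_x_not_y = supp_x - supp_xy          # supp(X and not Y)
    conf = supp_x_not_y / supp_x if supp_x else 0.0
    return supp_x_not_y, conf


# Made-up market-basket data for illustration only.
transactions = [
    frozenset({"tea", "sugar"}),
    frozenset({"tea"}),
    frozenset({"coffee", "sugar"}),
    frozenset({"tea", "milk"}),
    frozenset({"coffee"}),
    frozenset({"tea", "coffee"}),
]

supp_neg, conf_neg = negative_rule(transactions, {"tea"}, {"coffee"})
print(f"supp(tea, not coffee)={supp_neg:.2f} conf(tea => not coffee)={conf_neg:.2f}")
```

Because the negative supports are derived from positive ones, no extra pass over the data is needed for each negated item; the difficulty the abstract points to is the number of candidate rules, not the arithmetic.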
A new algorithm for mining quantitative association rules with standard SQL is presented. The association rules are evaluated with the sufficiency factor LS of subjective Bayesian reasoning. The algorithm is shown to be quick and effective through its application to the Lujiang insect pest database.
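For readers unfamiliar with the measure, the sufficiency factor LS in subjective Bayesian reasoning is the likelihood ratio of the evidence E under the hypothesis H and its negation; it scales the prior odds of H into posterior odds. Reading a rule A ⇒ B with E = A and H = B is an assumption made here for illustration, since the abstract does not spell out how LS is applied.

```latex
LS = \frac{P(E \mid H)}{P(E \mid \lnot H)}, \qquad
O(H \mid E) = LS \cdot O(H), \qquad
O(H) = \frac{P(H)}{1 - P(H)}
```

Under this reading, LS > 1 means that observing the antecedent raises the odds of the consequent, LS = 1 means it is uninformative, and LS < 1 means it lowers them.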
In privacy-preserving association rule mining, sensitivity analysis should be reported after items have been quantified in terms of their occurrence. Traditional methods for preserving the confidentiality of association rules rely on assumptions when safeguarding sensitive information, rather than on identifying the sensitive items themselves. It is therefore time to go one step further and remove such assumptions from the protection of sensitive information, especially in XML association rule mining. We focus on this central and highly researched area, generating XML association rules without debating the disclosure risks involved in the mining process. We describe how to identify sensitive items through a supervised learning technique so that confidential information can be hidden. These sensitive items show a high dependency on other items, measured in terms of statistical significance with a Bayesian network. We propose two methodologies, one based on the probabilistic occurrence of items and one based on the mode of items. All of this is modeled in what we name the PPDM (Privacy Preservation in Data Mining) model for XARs. The PPDM model is helpful for sharing market information among competitors with a lower chance of creating a monopoly. Finally, the PPDM model achieves high accuracy in computing the sensitivity of items and opens new directions for academia toward standardizing such NP-hard problems.
Mining association rules from large databases is very costly. We develop a parallel algorithm for this task on shared-memory multiprocessors (SMP). Most proposed parallel algorithms for association rule mining have to scan the database at least twice. In this article, a parallel algorithm, Scan Once (SO), is proposed for SMP that scans the database only once. The algorithm is fundamentally different from the known parallel algorithm Count Distribution (CD). It uses a bit matrix to store the database information and obtains the support of frequent itemsets through a vector AND operation, which greatly improves the efficiency of generating all frequent itemsets. Empirical evaluation shows that the algorithm outperforms the CD algorithm.
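The bit-matrix idea can be sketched in a few lines: one pass over the database builds a bit column per item (bit i is set when transaction i contains the item), and the support count of any itemset is then the number of set bits in the AND of its columns. This is a sequential, single-threaded illustration with hypothetical function names and toy data; it does not reproduce the parallel SO algorithm or its work partitioning.

```python
def build_bit_columns(transactions):
    """Scan the database once and return ({item: bit column}, row count)."""
    columns = {}
    n = 0
    for row, t in enumerate(transactions):
        n += 1
        for item in t:
            # Set bit `row` in this item's column.
            columns[item] = columns.get(item, 0) | (1 << row)
    return columns, n


def support_count(columns, n, itemset):
    """Count transactions containing all items, using bitwise AND only."""
    vec = (1 << n) - 1  # all rows selected initially
    for item in itemset:
        vec &= columns.get(item, 0)
    return bin(vec).count("1")


# Toy database for illustration only.
transactions = [
    {"a", "b", "c"},
    {"a", "c"},
    {"b", "c"},
    {"a", "b", "c"},
]

columns, n = build_bit_columns(transactions)
print(support_count(columns, n, {"a", "c"}))  # rows 0, 1 and 3
print(support_count(columns, n, {"a", "b"}))  # rows 0 and 3
```

After the single scan, candidate itemsets of any size are counted from the columns alone, which is why the database does not need to be read again.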
Frequent itemset mining plays an important role in association rule mining. A variety of algorithms for finding frequent itemsets in very large transaction databases have been developed. Although many techniques have been proposed for maintaining the discovered rules when new transactions are added, little work has been done on maintaining them when transactions are deleted from the database. Updates are a fundamental aspect of data management. In this paper, a decremental association rule mining algorithm is presented for updating the discovered association rules when some transactions are removed from the original data set. Extensive experiments were conducted to evaluate the performance of the proposed algorithm. The results show that the proposed algorithm is efficient and outperforms other well-known algorithms.
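The core saving in a decremental setting can be illustrated as follows: if the support counts of the previously frequent itemsets are kept, only the deleted transactions need to be scanned to bring those counts up to date. The sketch below is a simplified illustration with a hypothetical `decrement_counts` helper and toy data, not the algorithm evaluated in the paper. In particular, it does not detect itemsets that were infrequent before and become frequent once the database shrinks; a complete method has to handle that case as well.

```python
def decrement_counts(counts, n_total, deleted, min_support):
    """Update stored support counts after `deleted` transactions are removed.

    counts: {frozenset itemset: support count} for previously frequent itemsets.
    Returns (itemsets still frequent with updated counts, remaining row count).
    """
    updated = dict(counts)
    for t in deleted:                      # scan only the deleted rows
        for itemset in updated:
            if itemset <= t:
                updated[itemset] -= 1
    n_remaining = n_total - len(deleted)
    still_frequent = {
        s: c for s, c in updated.items()
        if n_remaining and c / n_remaining >= min_support
    }
    return still_frequent, n_remaining


# Toy state: counts mined earlier from 5 transactions with min_support = 0.4.
counts = {
    frozenset({"a"}): 4,
    frozenset({"b"}): 3,
    frozenset({"c"}): 2,
    frozenset({"a", "b"}): 2,
}
deleted = [frozenset({"a", "b"}), frozenset({"a"})]

still_frequent, n_remaining = decrement_counts(counts, 5, deleted, 0.4)
print(n_remaining, still_frequent)
```

In this toy run the pair {a, b} drops below the threshold after the deletion, while the three single items remain frequent.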
Data mining techniques offer great opportunities for developing ethics lines, whose main aim is to ensure improvements in, and compliance with, the values, conduct and commitments that make up the code of ethics. The aim of this study is to suggest a process for exploiting the data generated and collected from an ethics line by extracting association rules with the Apriori algorithm. This makes it possible to identify anomalies and behaviour patterns that require action to review, correct, promote or expand them, as appropriate.
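For orientation, a compact version of the Apriori procedure is sketched below: frequent itemsets are grown level by level, candidates with an infrequent subset are pruned, and rules are then read off the frequent itemsets. The report categories are invented placeholders for ethics-line records, and the thresholds are arbitrary; none of this comes from the study's data.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset itemset: support} for all frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)
    current = {frozenset([i]) for t in transactions for i in t}
    frequent = {}
    k = 1
    while current:
        # Count each candidate and keep those meeting the threshold.
        level = {}
        for c in current:
            supp = sum(1 for t in transactions if c <= t) / n
            if supp >= min_support:
                level[c] = supp
        frequent.update(level)
        k += 1
        # Join step: merge frequent itemsets into candidates of size k.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset must itself be frequent.
        current = {
            c for c in candidates
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


def rules(frequent, min_confidence):
    """Return (lhs, rhs, confidence) for rules meeting the threshold."""
    out = []
    for itemset, supp in frequent.items():
        for r in range(1, len(itemset)):
            for lhs in combinations(itemset, r):
                lhs = frozenset(lhs)
                conf = supp / frequent[lhs]
                if conf >= min_confidence:
                    out.append((lhs, itemset - lhs, conf))
    return out


# Invented ethics-line reports, each described by a set of categories.
reports = [
    {"harassment", "anonymous"},
    {"harassment", "anonymous", "hr_dept"},
    {"fraud", "finance_dept"},
    {"harassment", "anonymous"},
    {"fraud", "anonymous"},
]

frequent = apriori(reports, min_support=0.4)
found = rules(frequent, min_confidence=0.7)
for lhs, rhs, conf in found:
    print(sorted(lhs), "=>", sorted(rhs), round(conf, 2))
```

Rules extracted this way only flag co-occurrence patterns; deciding which of them call for review or corrective action remains a matter for the people operating the ethics line.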
[Objectives] To explore the compatibility rules of neonatal parenteral nutrition (PN) prescriptions based on association rules and hierarchical cluster analysis, thereby providing a reference for standardizing neonatal parenteral nutrition supportive therapy. [Methods] Data on neonatal PN formulations prepared by the Pharmacy Intravenous Admixture Services (PIVAS) of the Affiliated Hospital of Chengde Medical University from July 2015 to June 2021 were collected. The general information of the prescriptions and the frequency of drug use were analyzed with Excel 2019; the boxplot of drug dosing was drawn using GraphPad 8.0; and SPSS Modeler 18.0 and SPSS Statistics 26.0 were used to perform association rule analysis and hierarchical cluster analysis. [Results] A total of 11488 PN prescriptions were collected from 1421 newborns, involving 18 kinds of drugs, which were divided into 11 types of nutrients. Association rule analysis yielded 84 nutrient combinations. The combination of fat emulsion, water-soluble vitamins, fat-soluble vitamins, glucose and amino acids had the highest confidence (99.95%). Hierarchical cluster analysis divided the nutrients into 5 types. [Conclusions] PN prescriptions for newborns were composed of five types of nutrients: amino acids, fat emulsion, glucose, water-soluble vitamins, and fat-soluble vitamins. Depending on electrolyte and trace element deficiencies, appropriate drugs can be chosen to meet nutritional demands. This study provides a reference for the reasonable selection of drugs for neonatal PN prescriptions and for further standardization of PN supportive therapy in newborns.
In data mining from transaction databases, attention has focused on the relationships between attributes, while the relationships between tuples have not been taken into account. In a spatial database there are relationships between both attributes and tuples, and most associations occur between tuples, such as adjacency, intersection, overlap and other topological relationships. The tasks of spatial association rule mining therefore include mining the relationships between attributes of spatial objects, called vertical-direction data mining, and the relationships between tuples, called horizontal-direction data mining. This paper analyzes the storage models of spatial data, draws on data mining technologies for transaction databases, defines spatial association rules (vertical-direction, horizontal-direction and two-direction association rules), discusses how to measure the interestingness of spatial association rules, and puts forward a workflow for spatial association rule mining. For two-direction spatial association rule mining, an algorithm is proposed to obtain non-spatial itemsets: through spatial analysis, spatial relations are transformed into non-spatial associations, from which the non-spatial itemsets are obtained. Based on these itemsets, the Apriori algorithm or other algorithms can be used to find the frequent itemsets, from which the spatial association rules are derived. The algorithm was validated by mining spatial association rules from a spatial database, and the test results show that it is efficient and can mine interesting spatial rules.
An association rule mining method based on semantic relativity is proposed to address the large number of candidate itemsets and the high time complexity of traditional association rule mining. The method uses the semantic relativity of ontology concepts to describe complicated relationships within a domain. Candidate itemsets with low semantic relativity are filtered out to reduce the number of candidate itemsets. In the semantic relativity computation, an ontology hierarchy is regarded as a directed acyclic graph rather than a hierarchy tree. Not only direct hierarchy relationships but also indirect hierarchy relationships and other typical semantic relationships are taken into account. Experimental results show that the proposed method effectively reduces the number of candidate itemsets and improves the efficiency of association rule mining.
Association rules are useful for determining correlations between items. Applying association rules to an intrusion detection system (IDS) can improve the detection rate, but the false positive rate also increases. In this paper, weighted association rules are used to mine intrusion models, which can increase the detection rate and decrease the false positive rate to some extent. Based on this, the structure of a host-based IDS using weighted association rules is proposed.
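One common way to weight association rules is to scale an itemset's ordinary support by the mean weight of its items, so that itemsets made of security-relevant events rank above frequent but benign ones. The sketch below uses that definition as an assumption; the abstract does not state which weighting scheme the paper adopts, and the event names and weights are invented for illustration.

```python
def weighted_support(transactions, weights, itemset):
    """Mean item weight of `itemset` times its ordinary support."""
    itemset = frozenset(itemset)
    supp = sum(1 for t in transactions if itemset <= t) / len(transactions)
    mean_weight = sum(weights[i] for i in itemset) / len(itemset)
    return mean_weight * supp


# Invented audit events per session, with invented importance weights.
weights = {"root_login": 0.9, "failed_auth": 0.8, "http_get": 0.1}
sessions = [
    frozenset({"root_login", "failed_auth"}),
    frozenset({"http_get"}),
    frozenset({"http_get", "failed_auth"}),
    frozenset({"root_login", "failed_auth"}),
]

ws_attack = weighted_support(sessions, weights, {"root_login", "failed_auth"})
ws_benign = weighted_support(sessions, weights, {"http_get"})
print(round(ws_attack, 3), round(ws_benign, 3))
```

Both itemsets occur in half of the sessions, yet the weighted measure separates them clearly, which is the kind of effect that could help reduce false positives from frequent but harmless patterns.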
Funding (listed with the land evaluation abstract on association rules and C4.5 rules): Supported by the Science and Technology Plan Project of Guangdong Province (2009B010900026, 2009CD058, 2009CD078, 2009CD079, 2009CD080); the Special Funds for the Support Program of Development of Modern Information Service Industry of Guangdong Province (06120840B0370124); and the Funded Fund Project of South China Agricultural University (2007K017).
文摘Association rules and C4.5 rules can overcome the shortage of the traditional land evaluation methods and improve the intelligibility and efficiency of the land evaluation knowledge.In order to compare these two kinds of classification rules in the application,two fuzzy classifiers were established by combining with fuzzy decision algorithm especially based on Second General Soil Survey of Guangdong Province.The results of experiments demonstrated that the fuzzy classifier based on association rules obtain a higher accuracy rate,but with more complex calculation process and more computational overhead;the fuzzy classifier based on C4.5 rules obtain a slightly lower accuracy,but with fast computation and simpler calculation.
文摘随着电力系统的数字化和智能化发展,配变重过载预测成为了实现智能状态检修的关键技术之一。配变过载时空因子在现实场景中通常呈偏置分布。其中,部分高风险罕见(high risk and rare,HRR)因子一旦出现,将对变压器造成无法逆转的伤害。为此,该文提出一种基于提高关联规则关键重要性(improved association rules‐criticality importance,IAR‐CI)模型的配变过载预测方法。首先,考虑内部与外部因素,收集多个数据源并建立配变运行状态数据库,且通过ICA识别与配变重过载强关联的罕见高危时段与HRR;其次,基于关键性重要度(criticality importance,CI)度量计算,设计一种因子权重计算方法,准确衡量因子的风险权重;最后,应用TBFP‐Growth算法,增强模型的运行效率。采用中国南方某地区电网数据进行算例仿真。研究表明,该方法能够提升配变重过载的预测性能,有助于后续巡检、检测策略的合理统筹和科学规划,可在降低电力设备运维检修成本的同时提高供电的可靠性。
Abstract: Recent advancements in science and technology, coupled with the proliferation of data, have urged laboratory medicine to integrate with the era of artificial intelligence (AI) and machine learning (ML). In current evidence-based medicine, analysing disease patterns in laboratory tests through association rule mining (ARM) has emerged as a modern tool for risk assessment and disease stratification, with the potential to reduce cardiovascular disease (CVD) mortality. CVDs are the well-recognised leading global cause of mortality, with higher fatality rates in the Indian population due to associated factors such as hypertension, diabetes, and lifestyle choices. AI-driven algorithms have offered deep insights in this field while addressing challenges such as healthcare systems grappling with physician shortages. Personalized medicine, driven by big data, necessitates the integration of ML techniques and high-quality electronic health records to direct meaningful outcomes. These technological advancements enhance computational analyses for both research and clinical practice. ARM plays a pivotal role by uncovering meaningful relationships within databases, aiding in patient survival prediction and risk factor identification. The potential of AI in laboratory medicine is vast, and it must be integrated cautiously while considering potential ethical, legal, and privacy concerns. Thus, an AI ethics framework is essential to guide its responsible use. Aligning AI algorithms with existing lab practices, promoting education among healthcare professionals, and fostering careful integration into clinical settings are imperative for harnessing the benefits of this transformative technology.
Funding: National Key R&D Program of China (No. 2019YFB2103102); Hong Kong Polytechnic University (No. CD06, P0042540).
Abstract: The advent of the big data era has provided many types of transportation datasets, such as metro smart card data, for studying residents' mobility and understanding how their mobility has been shaped and is shaping the urban space. In this paper, we use metro smart card data from two Chinese metropolises, Shanghai and Shenzhen. Five metro mobility indicators are introduced, and association rules are established to explore the mobility patterns. The proportion of people entering and exiting the station is used to measure the jobs-housing balance. It is found that the average travel distance and duration of Shanghai passengers are higher than those of Shenzhen, and the proportion of metro commuters in Shanghai is higher than that of Shenzhen. The jobs-housing spatial relationship in Shenzhen based on metro travel is more balanced than that in Shanghai. The fundamental reason for the differences between the two cities is the difference in urban morphology. Compared with the monocentric structure of Shanghai, the polycentric structure of Shenzhen results in more scattered travel hotspots and more diverse travel routes, which helps Shenzhen to have a better jobs-housing balance. This paper fills a gap in comparative research among Chinese cities based on transportation big data analysis. The results provide support for planning metro routes, adjusting urban structure and land use to form a more reasonable metro network, and balancing the jobs-housing spatial relationship.
Funding: Supported by the Public Health and Epidemic Prevention and Control Project of Guiyang Bureau of Science and Technology ([2022]-4-4-5) and the Guizhou Provincial Key Discipline of Traditional Chinese Medicine and Ethnic Medicine: Clinical Traditional Chinese Medicine (QZYYZDXK(JS)-2023-04).
Abstract: [Objectives] This study was conducted to analyze the medication rules of clinical prescriptions of traditional Chinese medicine decoction pieces for the treatment of novel coronavirus pneumonia (COVID-19) during the epidemic in multiple regions based on data mining technology, so as to provide a reference for the treatment of COVID-19 with traditional Chinese medicine. [Methods] The following were collected as the data of this study: the traditional Chinese medicine prescriptions used since the outbreak of COVID-19 in Hubei Province during the fight against the epidemic from February 25, 2020 to February 14, 2022; the prescriptions used by the Guizhou traditional Chinese medicine expert team aiding Hubei Province; the prescriptions for rehabilitation and conditioning of patients in Ezhou, Hubei Province, after discharge; the prescriptions for the prevention and treatment of COVID-19 in Guizhou Province; and the prescriptions for the treatment of COVID-19 collected from the CNKI Chinese database from the end of 2019 to the present. Excel was used to establish a database, which was entered into the TCM inheritance calculation platform V3.5, and association rules and the k-means clustering algorithm were used to analyze the frequency of herbal medicines in prescriptions during the treatment of COVID-19, the frequency of the four natures, five flavors, and meridian distribution, and the drug combinations. [Results] A total of 1859 COVID-19 patients treated with traditional Chinese medicine were included; the proportion of males was higher than that of females, and middle-aged and elderly people were the most common group. A total of 2170 prescriptions of traditional Chinese medicine were included, involving a total of 383 traditional Chinese medicines. High-frequency medicines included poria, Radix Bupleuri, Radix Scutellariae, Herba Pogostemonis, Fructus Forsythiae, Flos Lonicerae, etc. The four natures were mainly concentrated in cold, warm, and neutral, and the five flavors were mainly concentrated in bitter, pungent, and sweet. The herbal medicines were mainly attributed to the lung and stomach meridians, and were mainly of the heat-clearing, exterior syndrome-relieving, and diuresis-promoting and damp-clearing types. A total of 24 high-frequency herbal combinations and 35 association rules were mined, and 3 types of formulas were obtained by cluster analysis. [Conclusions] The analysis results and medicine combinations obtained in the formulas are consistent with the traditional Chinese medicine treatment theory of COVID-19 caused by wind-heat filth accompanied with damp and toxin.
Funding: Supported by the National Natural Science Foundation of China (70371015) and the Science Foundation of Jiangsu University (04KJD001).
Abstract: Typical association rules consider only items enumerated in transactions. Such rules are referred to as positive association rules. Negative association rules also consider the same items, but in addition consider negated items (i.e., items absent from transactions). Negative association rules are useful in market-basket analysis to identify products that conflict with each other or products that complement each other. They are also very convenient for associative classifiers, that is, classifiers that build their classification model based on association rules. However, mining for such rules necessitates the examination of an exponentially large search space. Despite their usefulness, very few algorithms to mine them have been proposed to date. In this paper, an algorithm based on the FP-tree is presented to discover negative association rules.
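As an illustration of what a negated item means in practice, the following minimal Python sketch computes the confidence of a rule of the form "antecedent implies NOT consequent" by brute-force counting. It is not the FP-tree algorithm of the paper; the function names and the basket data are hypothetical and chosen only for the example.

```python
from typing import FrozenSet, List

Transaction = FrozenSet[str]


def support(transactions: List[Transaction],
            present: FrozenSet[str],
            absent: FrozenSet[str] = frozenset()) -> float:
    """Fraction of transactions that contain every item in `present`
    and none of the items in `absent` (the negated items)."""
    hits = sum(1 for t in transactions if present <= t and not (absent & t))
    return hits / len(transactions)


def negative_rule_confidence(transactions: List[Transaction],
                             antecedent: FrozenSet[str],
                             negated: FrozenSet[str]) -> float:
    """Confidence of the negative rule: antecedent -> NOT negated."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0
    return support(transactions, antecedent, negated) / base


# Hypothetical market baskets, for illustration only.
baskets = [
    frozenset({"coffee", "milk"}),
    frozenset({"coffee", "sugar"}),
    frozenset({"tea", "milk"}),
    frozenset({"coffee"}),
]

conf_coffee_not_tea = negative_rule_confidence(
    baskets, frozenset({"coffee"}), frozenset({"tea"}))
milk_without_coffee = support(
    baskets, frozenset({"milk"}), frozenset({"coffee"}))
```

In this toy data no basket holds both coffee and tea, so the rule "coffee implies NOT tea" gets full confidence, which is the kind of conflicting-product signal the abstract describes. Because every item can appear in positive or negated form, the candidate space grows quickly, which is why a dedicated algorithm is needed at scale.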
Abstract: A new algorithm for mining quantitative association rules with standard SQL is presented. The association rules are evaluated with the sufficiency factor LS of subjective Bayesian reasoning. The algorithm proved to be quick and effective when applied to the Lujiang insects and pests database.
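The abstract does not define LS. Assuming it refers to the sufficiency measure commonly used in subjective Bayesian reasoning, LS = P(E|H) / P(E|not H), a rule "E implies H" could be scored from raw counts as in the hedged Python sketch below. The function name and the counts are hypothetical, and the paper's SQL-based implementation may differ.

```python
def sufficiency_ls(n_e_and_h: int, n_h: int,
                   n_e_and_not_h: int, n_not_h: int) -> float:
    """Sufficiency measure LS = P(E|H) / P(E|not H), estimated from counts.

    n_e_and_h     : records where evidence E and hypothesis H both hold
    n_h           : records where H holds
    n_e_and_not_h : records where E holds but H does not
    n_not_h       : records where H does not hold
    """
    if n_h <= 0 or n_not_h <= 0:
        raise ValueError("both H and not-H must occur at least once")
    p_e_given_h = n_e_and_h / n_h
    p_e_given_not_h = n_e_and_not_h / n_not_h
    if p_e_given_not_h == 0:
        # E never occurs without H: E is maximally sufficient for H.
        return float("inf")
    return p_e_given_h / p_e_given_not_h


# Hypothetical counts: E occurs in 40 of 50 H-records
# and in 10 of 50 non-H records.
ls_value = sufficiency_ls(40, 50, 10, 50)
```

Under this definition, values well above 1 indicate that observing E strongly supports H, while values near 1 indicate that E carries little information about H.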
Abstract: In the privacy preservation of association rules, sensitivity analysis should be reported after items have been quantified in terms of their occurrence. The traditional methodologies used for preserving the confidentiality of association rules rely on assumptions when safeguarding sensitive information rather than on recognizing the sensitive items themselves. It is therefore time to go one step further and remove such assumptions in the protection of sensitive information, especially in XML association rule mining. We focus on this central and highly researched area in terms of generating XML association rules (XARs), without arguing about the disclosure risks involved in such a mining process. We describe the identification of sensitive items in order to hide confidential information through a supervised learning technique. These sensitive items show a high dependency on other items, which is measured in terms of statistical significance with a Bayesian network. We propose two methodologies, based on the probabilistic occurrence of items and on the mode of items. All this information is modeled in what we name the PPDM (Privacy Preservation in Data Mining) model for XARs. The PPDM model is helpful for sharing market information among competitors with a lower chance of generating a monopoly. Finally, the PPDM model achieves high accuracy in computing the sensitivity of items and opens new dimensions to academia for the standardization of such NP-hard problems.
Abstract: Mining association rules from large databases is very costly. We develop a parallel algorithm for this task on a shared-memory multiprocessor (SMP). Most proposed parallel algorithms for association rule mining have to scan the database at least twice. In this article, a parallel algorithm, Scan Once (SO), is proposed for SMP; it scans the database only once. The algorithm is fundamentally different from the known parallel algorithm Count Distribution (CD). It adopts a bit matrix to store the database information and obtains the support of the frequent itemsets through a vector AND operation, which greatly improves the efficiency of generating all frequent itemsets. Empirical evaluation shows that the algorithm outperforms the known CD algorithm.
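The bit-matrix idea can be sketched in a few lines of Python. The sketch below is single-threaded and is not the authors' SO algorithm: it only shows how one pass over the data builds a bit vector per item (stored here as a Python integer) and how a bitwise AND then yields the support count of any itemset without rescanning. All names and data are hypothetical.

```python
from functools import reduce
from typing import Dict, Iterable, List, Set


def build_bit_vectors(transactions: List[Set[str]]) -> Dict[str, int]:
    """Single scan: bit `row` of vectors[item] is 1 when
    transaction number `row` contains `item`."""
    vectors: Dict[str, int] = {}
    for row, items in enumerate(transactions):
        for item in items:
            vectors[item] = vectors.get(item, 0) | (1 << row)
    return vectors


def support_count(vectors: Dict[str, int], itemset: Iterable[str]) -> int:
    """AND the item vectors together and count the surviving 1-bits."""
    items = list(itemset)
    if not items:
        raise ValueError("itemset must not be empty")
    combined = reduce(lambda a, b: a & b,
                      (vectors.get(i, 0) for i in items))
    return bin(combined).count("1")


# Hypothetical transaction database, for illustration only.
db = [{"a", "b", "c"}, {"a", "c"}, {"b", "c"}, {"a", "b"}]
bit_vectors = build_bit_vectors(db)
count_ac = support_count(bit_vectors, ["a", "c"])
```

Once the vectors exist, candidate itemsets of any size can be counted from them alone, which is what makes a single database scan sufficient. In a parallel setting the candidate itemsets could be partitioned across processors that share the read-only vectors, although the abstract does not say how SO divides the work.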
Abstract: Frequent itemset mining plays an important role in association rule mining. A variety of algorithms for finding frequent itemsets in very large transaction databases have been developed. Although many techniques have been proposed for maintaining the discovered rules when new transactions are added, little work has been done on maintaining the discovered rules when some transactions are deleted from the database. Updates are a fundamental aspect of data management. In this paper, a decremental association rule mining algorithm is presented for updating the discovered association rules when some transactions are removed from the original data set. Extensive experiments were conducted to evaluate the performance of the proposed algorithm. The results show that the proposed algorithm is efficient and outperforms other well-known algorithms.
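The core bookkeeping behind decremental maintenance can be illustrated with the hedged Python sketch below. It is not the paper's algorithm: it only adjusts the stored counts of already-known itemsets by scanning the deleted transactions, rather than rescanning the whole database. The names and counts are hypothetical.

```python
from typing import Dict, FrozenSet, Iterable, Set, Tuple

Itemset = FrozenSet[str]


def apply_deletions(counts: Dict[Itemset, int],
                    n_transactions: int,
                    deleted: Iterable[Itemset]) -> Tuple[Dict[Itemset, int], int]:
    """Subtract the contribution of each deleted transaction from the
    stored support counts and return the new counts and database size."""
    updated = dict(counts)
    removed = list(deleted)
    for transaction in removed:
        for itemset in updated:
            if itemset <= transaction:
                updated[itemset] -= 1
    return updated, n_transactions - len(removed)


def frequent(counts: Dict[Itemset, int],
             n_transactions: int,
             min_support: float) -> Set[Itemset]:
    """Itemsets whose relative support still meets the threshold."""
    if n_transactions <= 0:
        return set()
    return {s for s, c in counts.items() if c / n_transactions >= min_support}


# Hypothetical stored counts from an earlier mining run over 5 transactions.
A, B, AB = frozenset({"a"}), frozenset({"b"}), frozenset({"a", "b"})
old_counts = {A: 4, B: 3, AB: 3}
new_counts, new_n = apply_deletions(old_counts, 5, [AB, AB])
still_frequent = frequent(new_counts, new_n, 0.5)
```

One limitation of this simplification: deleting transactions shrinks the database, so an itemset that was infrequent before (and therefore has no stored count) can become frequent afterwards. A complete decremental algorithm has to detect such itemsets, which this sketch does not attempt.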
Abstract: Data mining techniques offer great opportunities for developing ethics lines, whose main aim is to ensure improvements in, and compliance with, the values, conduct, and commitments that make up the code of ethics. The aim of this study is to suggest a process for exploiting the data generated and collected by an ethics line, by extracting association rules with the Apriori algorithm. This makes it possible to identify anomalies and behaviour patterns that require action to review, correct, promote, or expand them, as appropriate.
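For readers unfamiliar with Apriori, the following compact Python sketch shows the level-wise search for frequent itemsets on which rule extraction is built. The report attributes are invented for illustration and do not come from the study, and the code makes no attempt at the optimizations a production implementation would use.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Set


def apriori(transactions: List[Set[str]],
            min_support: float) -> Dict[FrozenSet[str], float]:
    """Return every itemset whose relative support is >= min_support."""
    n = len(transactions)
    current = {frozenset([i]) for t in transactions for i in t}
    frequent: Dict[FrozenSet[str], float] = {}
    k = 1
    while current:
        # Count each size-k candidate against the transactions.
        level = {}
        for cand in current:
            sup = sum(1 for t in transactions if cand <= t) / n
            if sup >= min_support:
                level[cand] = sup
        frequent.update(level)
        # Join frequent k-itemsets into (k+1)-candidates, then prune any
        # candidate that has an infrequent k-subset (the Apriori property).
        k += 1
        joined = {a | b for a, b in combinations(list(level), 2)
                  if len(a | b) == k}
        current = {c for c in joined
                   if all(frozenset(s) in level
                          for s in combinations(c, k - 1))}
    return frequent


# Hypothetical ethics-line reports, each described by a set of attributes.
reports = [
    {"harassment", "night_shift"},
    {"harassment", "night_shift", "plant_A"},
    {"fraud", "plant_A"},
    {"harassment", "night_shift"},
]
frequent_sets = apriori(reports, min_support=0.5)
```

A frequent pair such as harassment together with night_shift is the kind of behaviour pattern the abstract refers to. Turning it into a rule only requires dividing its support by the support of the chosen antecedent to obtain the confidence.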
Funding: Supported by the Science and Technology Research and Development Project of Chengde City, Hebei Province (201706A043) and the Young Scholar Program of Hebei Pharmaceutical Association Hospital Pharmaceutical Research Project (2020—Hbsyxhqn0029).
Abstract: [Objectives] To explore the compatibility rules of neonatal parenteral nutrition (PN) prescriptions based on association rules and hierarchical cluster analysis, thereby providing a reference for standardizing neonatal parenteral nutrition supportive therapy. [Methods] Data on neonatal PN formulations prepared by the Pharmacy Intravenous Admixture Services (PIVAS) of the Affiliated Hospital of Chengde Medical University from July 2015 to June 2021 were collected. The general information of the prescriptions and the frequency of drug use were analyzed with Excel 2019; the boxplot of drug dosing was drawn using GraphPad 8.0 software; and SPSS Modeler 18.0 and SPSS Statistics 26.0 were used to perform association rule and hierarchical cluster analysis. [Results] A total of 11488 PN prescriptions were collected from 1421 newborns, involving 18 kinds of drugs, which were divided into 11 types of nutrients. Association rule analysis yielded 84 nutrient combinations. The combination of fat emulsion, water-soluble vitamins, fat-soluble vitamins, glucose, and amino acids had the highest confidence (99.95%). The hierarchical cluster analysis divided the nutrients into 5 types. [Conclusions] The PN prescriptions for newborns were composed of five types of nutrients: amino acids, fat emulsion, glucose, water-soluble vitamins, and fat-soluble vitamins. Depending on the deficiency of electrolytes and trace elements, appropriate drugs can be chosen to meet nutritional demands. This study provides a reference for the reasonable selection of drugs for neonatal PN prescriptions and for further standardization of PN supportive therapy in newborns.
Funding: This work is supported by the Natural Science Foundation of Chongqing (No. CSTC 2005BB2065).
Abstract: In data mining on transaction databases, the focus has been on the relationships between attributes, while the relationships between tuples have not been taken into account. In a spatial database, there are relationships between both the attributes and the tuples, and most of the associations occur between tuples, such as adjacency, intersection, overlap, and other topological relationships. The tasks of spatial association rule mining therefore include mining the relationships between attributes of spatial objects, called vertical-direction data mining, and the relationships between tuples, called horizontal-direction data mining. This paper analyzes the storage models of spatial data; draws on the technologies of data mining in transaction databases; defines the spatial association rule, including the vertical-direction, horizontal-direction, and two-direction association rules; discusses the measurement of the interestingness of spatial association rules; and puts forward the workflow of spatial association rule mining. For two-direction spatial association rule mining, an algorithm is proposed to obtain non-spatial itemsets. By means of spatial analysis, the spatial relations are transformed into non-spatial associations and the non-spatial itemsets are obtained. Based on the non-spatial itemsets, the Apriori algorithm or other algorithms can be used to obtain the frequent itemsets, from which the spatial association rules are derived. Spatial association rules were mined from a spatial database to validate the algorithm, and the test results show that the algorithm is efficient and can mine interesting spatial rules.
Funding: The National Natural Science Foundation of China (No. 50674086); the Specialized Research Fund for the Doctoral Program of Higher Education (No. 20060290508); the Science and Technology Fund of China University of Mining and Technology (No. 2007B016).
Abstract: An association rule mining method based on semantic relativity is proposed to solve the problem that traditional association rule mining generates too many candidate itemsets and has high time complexity. In the method, the semantic relativity of ontology concepts is used to describe complicated relationships within domains. Candidate itemsets with low semantic relativity are filtered out to reduce the number of candidate itemsets in association rule mining. In the semantic relativity computation, an ontology hierarchy relationship is regarded as a directed acyclic graph rather than a hierarchy tree. Not only direct hierarchy relationships, but also non-direct hierarchy relationships and other typical semantic relationships are taken into account. Experimental results show that the proposed method can effectively reduce the number of candidate itemsets and improve the efficiency of association rule mining.
Abstract: Association rules are useful for determining correlations between items. Applying association rules to an intrusion detection system (IDS) can improve the detection rate, but the false positive rate also increases. In this paper, weighted association rules are used to mine intrusion models, which can increase the detection rate and decrease the false positive rate to some extent. Based on this, the structure of a host-based IDS using weighted association rules is proposed.
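The abstract does not state how the weights enter the support measure, and several definitions exist in the literature. The Python sketch below assumes one simple convention, where the weighted support of an itemset is its ordinary support multiplied by the mean weight of its items. The audit events and weights are hypothetical.

```python
from typing import Dict, Iterable, List, Set


def weighted_support(transactions: List[Set[str]],
                     weights: Dict[str, float],
                     itemset: Iterable[str]) -> float:
    """Ordinary support scaled by the mean weight of the itemset's items.

    This is one assumed convention, not necessarily the paper's definition.
    """
    items = frozenset(itemset)
    if not items:
        raise ValueError("itemset must not be empty")
    freq = sum(1 for t in transactions if items <= t) / len(transactions)
    mean_weight = sum(weights[i] for i in items) / len(items)
    return mean_weight * freq


# Hypothetical host audit records and per-event importance weights.
audit_log = [
    {"login_fail", "su_root"},
    {"login_fail"},
    {"file_read"},
    {"login_fail", "su_root"},
]
event_weights = {"login_fail": 0.5, "su_root": 1.0, "file_read": 0.25}

ws_attack = weighted_support(audit_log, event_weights,
                             ["login_fail", "su_root"])
ws_benign = weighted_support(audit_log, event_weights, ["file_read"])
```

With weights of this kind, a rare but security-relevant combination can outrank a frequent but harmless event, which is one plausible way weighting could raise the detection rate while limiting false positives.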