Association rule mining is a major data mining field that leads to the discovery of associations and correlations among items in today's big data environment. Conventional association rule mining focuses mainly on positive frequent itemsets (PFIS) generated from frequently occurring itemsets. However, a significant body of work has focused on infrequent itemsets, using negative association rules to mine interesting negative frequent itemsets (NFIS) from transactions. In this work, we propose an efficient backward-calculating negative frequent itemset algorithm, EBC-NFIS, which computes backward supports and can extract both positive and negative frequent itemsets synchronously from a dataset. EBC-NFIS is based on the popular e-NFIS algorithm, which computes the supports of negative itemsets from the supports of positive itemsets. The proposed algorithm reuses previously computed supports held in memory to minimize computation time. In addition, positive and negative association rules (PNARs) are generated from the frequent itemsets discovered by EBC-NFIS. The efficiency of the proposed algorithm is verified through several experiments that compare its results with those of e-NFIS. The experimental results confirm that the proposed algorithm successfully discovers NFIS and PNARs and runs significantly faster than the conventional e-NFIS algorithm.
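The idea of deriving negative-itemset supports from cached positive supports can be illustrated with a minimal sketch. It is not the EBC-NFIS or e-NFIS procedure itself, whose details are not given in the abstract; it only applies the standard inclusion-exclusion identity, for example supp(A ¬B) = supp(A) − supp(A ∪ B). The toy transactions and the names `support` and `negative_support` are assumptions made for illustration.

```python
from fractions import Fraction
from functools import lru_cache
from itertools import combinations

# Hypothetical toy transactions; not data from the paper.
TRANSACTIONS = (
    frozenset("abc"),
    frozenset("ab"),
    frozenset("ac"),
    frozenset("b"),
    frozenset("abd"),
)


@lru_cache(maxsize=None)
def support(itemset: frozenset) -> Fraction:
    """Exact support of a positive itemset; results are cached for reuse."""
    hits = sum(1 for t in TRANSACTIONS if itemset <= t)
    return Fraction(hits, len(TRANSACTIONS))


def negative_support(positive: frozenset, negated: frozenset) -> Fraction:
    """Support of transactions containing all of `positive` and none of `negated`.

    Inclusion-exclusion over subsets of `negated`: only positive supports
    are ever counted against the data, and repeated ones come from the cache.
    """
    total = Fraction(0)
    for k in range(len(negated) + 1):
        for subset in combinations(sorted(negated), k):
            total += (-1) ** k * support(positive | frozenset(subset))
    return total


if __name__ == "__main__":
    # supp(a and not b) = supp(a) - supp(ab)
    print(negative_support(frozenset("a"), frozenset("b")))
```

Because `support` is memoized, a second negative itemset that shares positive sub-itemsets with the first triggers no new scans of the transactions, which is the kind of reuse the abstract attributes to the proposed algorithm.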
Objective: To explore the medication rules of Traditional Chinese Medicine (TCM) in the treatment of sleep disorder after stroke by using data mining technology. Methods: Electronic databases were searched for clinical literature on the TCM treatment of sleep disorders after stroke published from January 2000 to January 2021. Excel was used to establish the database, and the prescription information was described and analyzed statistically. The Apriori algorithm in IBM SPSS Modeler 18.0 was used for association analysis of the medicines, and IBM SPSS 22.0 was used for systematic cluster analysis of the high-frequency medicines. Results: A total of 67 articles were included, covering 131 traditional Chinese medicines. The medicines with a higher frequency of use include Ziziphi Spinosae Semen (Suanzaoren), Angelicae Sinensis Radix (Danggui), Ligusticum (Chuanxiong), liquorice (Gancao), Poria cocos (Fuling), and so on. In terms of effect, deficiency-tonifying, sedative, and blood-activating and stasis-removing medicines are commonly used. The medicinal properties are mainly cold, mild and warm. The main flavors are sweet and bitter. The medicines mostly belong to the liver, heart and spleen meridians. Thirty-three association rules were obtained for medicine pairs and medicine groups from the association analysis, and the core combinations were "Ziziphi Spinosae Semen (Suanzaoren)-Tuber fleeceflower stem (Yejiaoteng)", "Ziziphi Spinosae Semen (Suanzaoren)-Polygala (Yuanzhi)", "Ziziphi Spinosae Semen (Suanzaoren)-Cortex albiziae (Hehuanpi)" and "Angelicae Sinensis Radix (Danggui)-Radix bupleuri (Chaihu)-Radix Paeoniae Alba (Baishao)", among others. Seven medicine aggregation groups were obtained by cluster analysis. Conclusion: In the TCM treatment of sleep disorder after stroke, the main method is to calm the heart and mind. Depending on the syndrome type, the treatment methods of tonifying the heart and spleen, nourishing the liver and kidney, soothing and softening the liver, clearing heat and resolving phlegm, and nourishing the blood and promoting blood circulation are also selected. These findings provide a reference for clinical treatment.
Objective: To use data mining techniques to explore the rules of Chinese medicine use for airway remodeling. Methods: The literature of the past 20 years on Chinese medicine use for airway remodeling was searched. With the help of WPS Office Excel 11.1, IBM SPSS Statistics 23.0 and SPSS Modeler 18.0, prescriptions were analyzed for frequency of drug use, the four natures, the five flavors and channel tropism, followed by cluster analysis and association analysis of high-frequency drugs. Results: A total of 58 Chinese medicine prescriptions for airway remodeling were found, involving 105 Chinese medicines. The most frequent channel tropisms were spleen, stomach, lung, large intestine, liver and gallbladder; the most frequent flavors were sour, sweet and pungent; and the most frequent natures were cold and hot. Cluster analysis yielded eight drug aggregation groups, and association rule analysis yielded five groups of high-frequency drug pairs. Conclusion: The main TCM treatments for airway remodeling are expelling phlegm, relieving cough, calming asthma, expelling blood stasis and tonifying deficiency. The results of this study can provide ideas for compounding and drug selection in subsequent studies.
Objective: To explore, based on data mining, the medication rules of Chinese medicine for the treatment of restless legs syndrome (RLS). Methods: CNKI, WANFANG, and VIP were taken as data sources, with "restless legs syndrome, RLS" as the key words and "Chinese medicine, Chinese materia medica, traditional Chinese medicine (TCM), traditional Chinese and Western medicine" as sub key words. Data were extracted from journals and literature on the TCM treatment of RLS published from the establishment of each database to 2020, and data mining techniques (frequency analysis, cluster analysis, association rules) were used to analyze the core drugs and the drug pair (group) rules. Results: A total of 87 prescriptions met the requirements of this study, involving 142 Chinese herbal medicines. The five most frequently used Chinese herbal medicines were Radix Paeoniae Alba, Radix Glycyrrhizae, Radix Angelicae Sinensis, Fructus Chaenomelis and Radix Astragali seu Hedysari. The four Qi (气) of the medicines were mainly warm and neutral, and the five flavors were mainly sweet, bitter, and pungent. The main meridians were the liver, spleen and heart meridians. The medication categories were mainly deficiency-tonifying herbs, blood-activating and stasis-removing herbs, and wind- and dampness-eliminating herbs. The association rule analysis yielded 24 Chinese medicine combinations with high support, and the hierarchical cluster analysis yielded a total of 5 clusters. Conclusion: TCM treatment of RLS is based on deficiency-tonifying herbs, especially to replenish Qi and blood throughout the course of the disease, supplemented by herbs for promoting blood circulation and removing blood stasis and herbs for eliminating wind and dampness, and combined with herbs for relieving superficies and herbs for calming the liver to stop the wind.
In conjunction with association rules for data mining, the connections between testing indices and strong and weak association rules were determined, and new derivative rules were obtained by further reasoning. Association rules were used to analyze correlation and check consistency between indices. This study shows that the judgment obtained by weak association rules or non-association rules is more accurate and more credible than that obtained by strong association rules. When the testing grades of two indices in the weak association rules are inconsistent, the testing grades of indices are more likely to be erroneous, and the mistakes are often caused by human factors. Clustering data mining technology was used to analyze the reliability of a diagnosis, or to perform health diagnosis directly. Analysis showed that the clustering results are related to the indices selected, and that if the indices selected are more significant, the characteristics of clustering results are also more significant, and the analysis or diagnosis is more credible. The indices and diagnosis analysis function produced by this study provide a necessary theoretical foundation and new ideas for the development of hydraulic metal structure health diagnosis technology.
In communication alarm correlation analysis, traditional association rules generation (ARG) algorithms usually have low efficiency and a high error rate. This paper proposes an alarm correlation rules generation algorithm based on the confidence covered value. The confidence covered value method can judge whether a rule is redundant. After the rules based on weighted frequent patterns (WFPs) are generated, they are filtered by the confidence covered value in order to delete the redundant rules and keep the rules that carry more information. Experiments show that the alarm correlation rules generation algorithm based on the confidence covered value is more efficient than the traditional method and can effectively remove redundant rules. It is therefore well suited to telecommunication alarm association rules processing.
Despite advances in technology, software repository maintenance requires reusing data to reduce effort and complexity. However, ambiguity, irrelevance, and bugs that arise while extracting similar data during software development inflate the amount of data residing in repositories. Thus, there is a need for a repository mining technique for relevant and bug-free data prediction. This paper proposes a fault prediction approach that uses a data-mining technique to find good predictors for high-quality software. To predict errors in the mined data, the Apriori algorithm was used to discover association rules, with confidence fixed at more than 40% and support at no less than 30%. A pruning strategy based on evaluation measures was adopted. Next, rules were extracted from three projects in different domains; the extracted rules were then combined to obtain the most popular rules according to the evaluation measure values. To evaluate the proposed approach, we conducted an experimental study comparing the proposed rules with existing ones on four different industrial projects. The evaluation showed that the results of our proposal are promising. Practitioners and developers can use these rules for defect prediction during early software development.
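The thresholds quoted above (support of at least 30%, confidence above 40%) can be made concrete with a small, self-contained Apriori sketch. It is a textbook version written for illustration, not the authors' implementation, and the transactions and item labels (`high_complexity`, `large_file`, `defect`) are hypothetical.

```python
from itertools import combinations


def apriori(transactions, min_support=0.30):
    """Return {frequent itemset: support} using level-wise candidate generation."""
    n = len(transactions)

    def supp(itemset):
        return sum(1 for t in transactions if itemset <= t) / n

    items = {item for t in transactions for item in t}
    level = {frozenset([i]) for i in items if supp(frozenset([i])) >= min_support}
    frequent = {}
    k = 1
    while level:
        for itemset in level:
            frequent[itemset] = supp(itemset)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset must itself be frequent.
        level = {
            c for c in candidates
            if all(frozenset(s) in frequent for s in combinations(c, k - 1))
            and supp(c) >= min_support
        }
    return frequent


def rules(frequent, min_confidence=0.40):
    """Return (antecedent, consequent, support, confidence) with confidence > threshold."""
    found = []
    for itemset, s in frequent.items():
        for r in range(1, len(itemset)):
            for antecedent in map(frozenset, combinations(itemset, r)):
                confidence = s / frequent[antecedent]
                if confidence > min_confidence:
                    found.append((antecedent, itemset - antecedent, s, confidence))
    return found


# Hypothetical module-level observations; not data from the paper.
DATA = [
    frozenset({"high_complexity", "defect"}),
    frozenset({"high_complexity", "defect", "large_file"}),
    frozenset({"large_file"}),
    frozenset({"high_complexity"}),
    frozenset({"defect", "large_file"}),
]

if __name__ == "__main__":
    for antecedent, consequent, s, c in rules(apriori(DATA)):
        print(sorted(antecedent), "->", sorted(consequent), round(s, 2), round(c, 2))
```

The strict `>` on confidence and `>=` on support mirror the wording "more than 40%" and "at least 30%"; whether the study treated the boundaries exactly this way is an assumption.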
Objective: This study aimed to examine and propagate the medication experience and group formulas of traditional Chinese medicine (TCM) Master XIONG Jibo in diagnosing and treating arthralgia syndrome (AS) through data mining. Methods: Data of outpatient cases of Professor XIONG Jibo were collected from January 1, 2014 to December 31, 2018, along with cases recorded in A Real Famous Traditional Chinese Medicine Doctor: XIONG Jibo's Clinical Medical Record 1, which was published in December 2019. The five variables collected from the patients' data were TCM diagnostic information, TCM and western medicine diagnoses, syndrome, treatment, and prescription. A database was established for the collected data with Excel. In the Python environment, a customized, modified natural language processing (NLP) model for the diagnosis and treatment of AS by Professor XIONG Jibo was established to preprocess the data and to analyze the word cloud. Frequency analysis, association rule analysis, cluster analysis, and visual analysis of AS cases were performed on the Traditional Chinese Medicine Inheritance Computing Platform (V3.0) and RStudio (V4.0.3). Results: A total of 610 medical records of Professor XIONG Jibo were collected from the case database. After screening, 103 medical records were included, comprising 187 uses (45 kinds) of prescriptions and 1506 uses (125 kinds) of Chinese herbs. The main related meridians were the liver, spleen, and kidney meridians. The properties of the herbs used most were mainly warm, flat, and cold, while the flavors were mainly bitter, pungent, and sweet. The main patterns of AS included damp heat, phlegm stasis, and neck arthralgia. The most commonly used herbs for AS were Chuanniuxi (Cyathulae Radix), Huangbo (Phellodendri Chinensis Cortex), Cangzhu (Atractylodis Rhizoma), Qinjiao (Gentianae Macrophyllae Radix), Gancao (Glycyrrhizae Radix et Rhizoma), Huangqi (Astragali Radix), and Chuanxiong (Chuanxiong Rhizoma). The most common effect of the herbs was "promoting blood circulation and removing blood stasis", followed by "supplementing deficiency (Qi supplementing, blood supplementing, and Yang supplementing)" and "dispelling wind and dampness". The data were analyzed with support ≥ 15% and confidence = 100%; after de-duplication, five second-order, 39 third-order, 39 fourth-order, and two fifth-order association rules were identified. The top-ranking association rules of each order were "Cangzhu (Atractylodis Rhizoma) → Huangbo (Phellodendri Chinensis Cortex)", "Cangzhu (Atractylodis Rhizoma) + Chuanniuxi (Cyathulae Radix) → Huangbo (Phellodendri Chinensis Cortex)", "Chuanniuxi (Cyathulae Radix) + Danggui (Angelicae Sinensis Radix) + Gancao (Glycyrrhizae Radix et Rhizoma) → Qinjiao (Gentianae Macrophyllae Radix)" and "Chuanniuxi (Cyathulae Radix) + Danggui (Angelicae Sinensis Radix) + Gancao (Glycyrrhizae Radix et Rhizoma) + Huangbo (Phellodendri Chinensis Cortex) → Qinjiao (Gentianae Macrophyllae Radix)", respectively. Five clusters were obtained by cluster analysis of the top 30 herbs. The herbs mainly acted by drying dampness, supplementing Qi, and promoting blood circulation. The main prescriptions for AS were Ermiao San (二妙散), Gegen Jianghuang San (葛根姜黄散), and Huangqi Chongteng Yin (黄芪虫藤饮). The herbs of the core prescription included Cangzhu (Atractylodis Rhizoma), Chuanniuxi (Cyathulae Radix), Gancao (Glycyrrhizae Radix et Rhizoma), Huangbo (Phellodendri Chinensis Cortex), Mugua (Chaenomelis Fructus), Qinjiao (Gentianae Macrophyllae Radix), Danggui (Angelicae Sinensis Radix), and Yiyiren (Coicis Semen). Conclusion: Clearing heat and dampness, relieving collaterals and pain, and invigorating Qi and blood are the therapies most commonly used by Professor XIONG Jibo for the treatment of AS. Additionally, a customized NLP model could improve the efficiency of data mining in TCM.
Trauma is the most common cause of death among young people, and many of these deaths are preventable [1]. Predicting the outcome of trauma patients remains a difficult problem. In this study, prediction models are built and their ability to accurately predict mortality is assessed. The analysis includes a comparison of data mining techniques using classification, clustering and association algorithms. Data were collected by the Hellenic Trauma and Emergency Surgery Society from 30 Greek hospitals. The dataset contains records of 8544 patients suffering from severe injuries, collected from 2005 to 2006. Factors include patients' demographic elements and several other variables registered from the time and place of the accident until hospital treatment and final outcome. The results obtained are compared in terms of sensitivity, specificity, positive predictive value and negative predictive value, and the ROC curve depicts the performance of these methods.
Intrusion detection is regarded as a classification problem in the data mining field. However, instead of mining classification rules directly, class association rules are mined from audit logs and then used to construct a classifier. Some attributes in audit logs are important for detecting intrusion, but their values have a skewed distribution. A relative support concept is proposed to deal with this situation. To mine class association rules effectively, an algorithm based on the FP-tree is used. Experimental results show that this method has better performance.
To make effective use of the large amount of graduate data that colleges and universities accumulate through teaching management, this paper studies data mining on a higher vocational graduates database. The original data are processed with a variety of data preprocessing methods, and a mining algorithm based on the commonly used Apriori association rule algorithm is put forward. An association rule mining system is then designed and implemented according to actual needs, and the mining results have benefited teaching management decisions and employment guidance for graduates.
Maximum frequent pattern generation from a large database of transactions and items for association rule mining is an important research topic in data mining. Association rule mining aims to discover interesting correlations, frequent patterns, associations, or causal structures between items hidden in a large database. By exploiting quantum computing, we propose an efficient quantum search algorithm design to discover the maximum frequent patterns. We modified Grover's search algorithm so that a subspace of arbitrary symmetric states is used instead of the whole search space. We present a novel quantum oracle design that employs a quantum counter to count the maximum frequent items and a quantum comparator to check the count against a minimum support threshold. The proposed algorithm increases the rate of correct solutions because the search is confined to a subspace. Furthermore, our algorithm scales well and reduces the number of qubits required by the design, which directly improves performance. The proposed design can accommodate more transactions and items and still perform well with a small number of qubits.
The conventional complete association rule set was replaced by the least association rule set in the data warehouse association rule mining process. The least association rule set should comply with two requirements: 1) it should be the minimal and simplest association rule set; 2) its predictive power should in no way be weaker than that of the complete association rule set, so that the precision of the association rule set analysis can be guaranteed. By adopting the least association rule set, weak rules can be pruned effectively, which greatly reduces the number of frequent itemsets and therefore improves mining efficiency. Finally, based on the classical Apriori algorithm, the upward closure property of weak rules is used to develop a corresponding efficient algorithm.
Based on rough set theory, a powerful tool for dealing with vagueness and uncertainty, an algorithm for mining association rules in incomplete information systems was presented, and support and confidence were redefined. The algorithm can mine association rules with decision attributes directly, without processing missing values. Using the incomplete Mushroom dataset from the UCI machine learning repository, the new algorithm was compared with the classical Apriori-based association rule mining algorithm in terms of the number of rules extracted, testing accuracy and execution time. The experimental results show that the new algorithm has the advantages of short execution time and high accuracy.
This paper aims to develop an algorithm for extracting association rules, called the Context-Based Association Rule Mining algorithm (CARM), which can be regarded as an extension of the Context-Based Positive and Negative Association Rule Mining algorithm (CBPNARM). CBPNARM was developed to extract positive and negative association rules from spatiotemporal (space-time) data only, while the proposed algorithm can be applied to both spatial and non-spatial data. The proposed algorithm is applied to an energy dataset to classify a country's energy development by uncovering the interdependencies among a set of variables and obtaining positive and negative associations. The proposed algorithm extracts many association rules related to sustainable energy development, which need to be pruned. In this paper, the context serves as a pruning measure to extract pertinent association rules from non-spatial data. The Conditional Probability Increment Ratio (CPIR), which was not used in CBPNARM, is also added to the proposed algorithm. The inclusion of the context variable and CPIR resulted in fewer rules and improved robustness and ease of use. In addition, the extraction of common negative frequent itemsets in CARM differs from that in CBPNARM. The rules created by the proposed algorithm are more meaningful, significant, relevant and insightful. The accuracy of the proposed algorithm is compared with that of the Apriori, PNARM and CBPNARM algorithms. The results demonstrated enhanced accuracy, relevance and timeliness.
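For readers unfamiliar with CPIR, the sketch below uses the definition commonly given in the positive and negative association rule literature: the increase of P(Y|X) over P(Y), normalized by the room left to increase (or decrease). The abstract does not state how CARM applies the measure, so the function name `cpir` and the example probabilities are assumptions for illustration only.

```python
def cpir(p_y_given_x: float, p_y: float) -> float:
    """Conditional Probability Increment Ratio of Y given X.

    Positive values suggest X raises the likelihood of Y (candidate positive
    rule X -> Y); negative values suggest X lowers it (candidate negative rule).
    """
    if not 0.0 < p_y < 1.0:
        raise ValueError("p_y must lie strictly between 0 and 1")
    if p_y_given_x >= p_y:
        # Increment relative to the largest possible increment, 1 - P(Y).
        return (p_y_given_x - p_y) / (1.0 - p_y)
    # Decrement relative to the largest possible decrement, P(Y).
    return (p_y_given_x - p_y) / p_y


if __name__ == "__main__":
    print(cpir(0.9, 0.5))  # strong positive dependence
    print(cpir(0.2, 0.5))  # negative dependence
```

A rule is then typically kept only when the absolute value of its CPIR exceeds a user-chosen threshold, which is one way such a measure can reduce the number of rules.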
Data-mining techniques have been developed to turn data into useful task-oriented knowledge. Most algorithms for mining association rules identify relationships among transactions using binary values and find rules at a single concept level. Extracting multilevel association rules from transaction databases is one of the most common tasks in data mining. This paper proposes a multilevel fuzzy association rule mining model for extracting implicit knowledge that is stored as quantitative values in transactions. To this end, it uses a different support value at each level as well as a different membership function for each item. By integrating fuzzy-set concepts, data-mining technologies and a multiple-level taxonomy, our method finds fuzzy association rules in transaction data sets. The approach derives large itemsets in a top-down, progressively deepening manner and uses fuzzy boundaries instead of sharp boundary intervals. A comparison with previous methods in simulation shows that the proposed method maintains higher precision, that the mined rules are closer to reality, and that it can mine association rules at different levels according to the user's preference.
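The contrast between fuzzy and sharp boundaries can be sketched with a triangular membership function and a fuzzy support count. The abstract does not specify which membership functions the model uses, so the triangular shape, the breakpoints and the names `triangular` and `fuzzy_support` are illustrative assumptions.

```python
def triangular(x: float, a: float, b: float, c: float) -> float:
    """Membership degree of x in a triangular fuzzy set with a < b < c.

    The degree rises linearly from 0 at a to 1 at b, then falls to 0 at c.
    """
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def fuzzy_support(quantities, a, b, c):
    """Average membership degree over all transactions.

    A crisp interval would count each transaction as 0 or 1; here every
    transaction contributes its degree of membership instead.
    """
    return sum(triangular(q, a, b, c) for q in quantities) / len(quantities)


if __name__ == "__main__":
    # Hypothetical purchased quantities of one item across four transactions.
    purchased = [2.5, 5.0, 7.5, 10.0]
    print(fuzzy_support(purchased, a=0.0, b=5.0, c=10.0))
```

With a per-level minimum support, an item-region pair would be kept at a taxonomy level only if its fuzzy support reaches that level's threshold.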
Hotspots (active fires) indicate the spatial distribution of fires. Determining the factors that influence hotspot occurrence is essential so that fire events can be predicted from the characteristics of a given area. This study identifies possible influence factors for fire events using the Apriori association rule algorithm in the study area of Rokan Hilir, Riau Province, Indonesia. The Apriori algorithm was applied to a forest fire dataset containing data on the physical environment (land cover, river, road and city center), socio-economic conditions (income source, population, and number of schools), weather (precipitation, wind speed, and screen temperature), and peatlands. The experiments yielded 324 multidimensional association rules indicating relationships between hotspot occurrence and other factors. The associations between hotspot occurrence and other geographical objects were discovered with a minimum support of 10% and a minimum confidence of 80%. The results show strong relations between hotspot occurrence and the influence factors at a support of about 12.42%, a confidence of 1, and a lift of 2.26. These factors are precipitation greater than or equal to 3 mm/day, wind speed in [1 m/s, 2 m/s), non-peatland area, screen temperature in [297 K, 298 K), number of schools per km² less than or equal to 0.1, and distance from each hotspot to the nearest road less than or equal to 2.5 km.
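The three reported measures are linked by simple arithmetic, which the following sketch works through. It assumes the standard definitions, confidence = supp(X ∪ Y) / supp(X) and lift = confidence / supp(Y); the abstract does not restate them, and the variable names are chosen for illustration.

```python
def lift(confidence: float, consequent_support: float) -> float:
    """Standard lift: how much more often Y occurs with X than on its own."""
    return confidence / consequent_support


# Values reported in the abstract.
rule_support = 0.1242   # supp(X and Y)
confidence = 1.0        # supp(X and Y) / supp(X)
reported_lift = 2.26

# With confidence 1, every record matching the antecedent also matches
# the consequent, so supp(X) equals the rule support.
antecedent_support = rule_support / confidence

# Rearranging lift = confidence / supp(Y) gives the consequent's support.
implied_consequent_support = confidence / reported_lift

if __name__ == "__main__":
    print(round(antecedent_support, 4))
    print(round(implied_consequent_support, 4))
```

Under these assumptions the consequent would hold in roughly 44% of all records, so a lift of 2.26 means the listed conditions make it a little more than twice as likely as its base rate.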
In this paper, we propose an efficient algorithm, called FFP-Growth (short for fast FP-Growth), to mine frequent itemsets. Like FP-Growth, FFP-Growth searches the FP-tree in bottom-up order, but it need not construct conditional pattern bases and sub-FP-trees, which saves a substantial amount of time and space. The FP-tree it creates is also much smaller than that created by TD-FP-Growth, which improves efficiency. At the same time, FFP-Growth can easily be extended to reduce the search space, as TD-FP-Growth (M) and TD-FP-Growth (C) do. Experimental results show that the algorithm is effective and efficient.
Funding: Beijing Science and Technology Program (No. Z191100006619065); National Key R&D Program (No. 2017YFC1700101).
文摘Objective:To explore the medication rule of Traditional Chinese Medicine(TCM)in the treatment of sleep disorder after stroke by using data mining technology.Methods:A computer search was used to search the electronic database of clinical literature on the treatment of sleep disorders after stroke by TCM from January 2000 to January 2021.Excel was used to establish the database,and the prescription information was described and analyzed statistically.Using IBM SPSS Modeler 18.0 software,Apriori algorithm was used for TCM association analysis,and IBM SPSS 22.0 software was used for systematic cluster analysis of high-frequency TCM.Results:A total of 67 literatures were included,covering 131 traditional Chinese medicines.The medecines with a higher frequency of sodium use include Ziziphi Spinosae Semen(Suanzaoren),Angelicae Sinensis Radix(Danggui),Ligusticum(Chuanxiong),liquorice(Gancao),Poria cocos(Fuling),and so on.From the effect point of view,deficiency-tonifying medicine,sedative medicine and blood-activating and stasis-removing medicine are commonly used.The medicinal properties are mainly cold,mild and warm.The main medicine flavor are sweet and bitter.The medicines mostly belong to the liver,heart and spleen Meridian.Thirty-three association rules were obtained for medicine pairs and medicine groups from the correlation analysis,and the core combinations were"Ziziphi Spinosae Semen(Suanzaoren)-Tuber fleeceflower stem(Yejiaoteng)","Ziziphi Spinosae Semen(Suanzaoren)-Polygala(Yuanzhi)","Ziziphi Spinosae Semen(Suanzaoren)-Cortex albiziae(Hehuanpi)"and"Angelicae Sinensis Radix(Danggui)-Radix bupleuri(Chaihu)-Radix Paeoniae Alba(Baishao)"and so on.Seven medicine aggregation groups were obtained by medicine cluster analysis.Conclusion:In the treatment of sleep disorder after stroke by TCM,the main method is to calm the heart and mind.Meanwhile,according to different syndrome types,the treatment methods of tonifying the heart and spleen,nourishing the liver and 
kidney,soothing the liver and softening the liver,clearing heat and resolving phlegm,nourishing the blood and promoting blood circulation are selected,which provide certain reference for clinical treatment.
文摘Objective:Use data mining techniques to explore the rule of Chinese medicine used for airway remodeling.Methods:Search the literature on Chinese medicine use for airway remodeling in the past 20 years.With the help of WPS Office Excel 11.1,IBM SPSS Statistics 23.0 and SPSS Modeler 18.0 software,prescriptions were analyzed for the frequency of drug use,the four natures,the five flavours and the channel tropism,cluster analysis and association analysis of high-frequency drugs.Results:There were 58 Chinese medicine prescriptions for airway remodeling be found,involving 105 Chinese medicines,the most frequent channel tropism were spleen,stomach,lung,large intestine,liver and gallbladder,the most frequent use of the five flavors was sour,sweet and pungent,the highest frequency of the four natures was cold and hot,cluster analysis yielded eight drug aggregation groups,and association rule analysis yielded five groups of high-frequency drug pairs.Conclusion:The main TCM treatments for airway remodeling are expelling phlegm,relieving cough,asthma calming,expelling blood stasis and deficiency tonifying.The results of this study can provide ideas for compounding and drug selection for subsequent studies.
Abstract: Objective: To explore, based on data mining, the medication rules of Chinese medicine for the treatment of restless legs syndrome (RLS). Methods: CNKI, WANFANG, and VIP were taken as data sources, with "restless legs syndrome, RLS" as the key words and "Chinese medicine, Chinese materia medica, traditional Chinese medicine (TCM), traditional Chinese and Western medicine" as sub key words. Data were extracted from journals and literature related to the treatment of RLS by TCM from the establishment of each database to 2020, and data mining techniques (frequency analysis, cluster analysis, association rules) were used to analyze the core drugs and drug pair (group) rules. Results: A total of 87 prescriptions met the requirements of this study, involving 142 Chinese herbal medicines. The top 5 Chinese herbal medicines by frequency of use were Radix Paeoniae Alba, Radix Glycyrrhizae, Radix Angelicae Sinensis, Fructus Chaenomelis and Radix Astragali seu Hedysari. The four Qi (气) of the medicines were mainly warm and neutral, and the five flavors were mainly sweet, bitter, and pungent. The main meridians were the liver, spleen and heart meridians. The medication categories were mainly deficiency-tonifying herbs, blood-activating and stasis-removing herbs, and wind- and dampness-eliminating herbs. The association rule analysis yielded 24 Chinese medicine combinations with high support, and the hierarchical cluster analysis yielded a total of 5 clusters. Conclusion: TCM treatment of RLS is based on deficiency-tonifying herbs, especially to replenish Qi and blood throughout the course of the disease, supplemented by herbs for promoting blood circulation and removing blood stasis and herbs for eliminating wind and dampness, and combined with herbs for relieving superficies and herbs for calming the liver to stop the wind.
Funding: Supported by the Key Program of the National Natural Science Foundation of China (Grant No. 50539010) and the Special Fund for Public Welfare Industry of the Ministry of Water Resources of China (Grant No. 200801019).
Abstract: In conjunction with association rules for data mining, the connections between testing indices and strong and weak association rules were determined, and new derivative rules were obtained by further reasoning. Association rules were used to analyze correlation and check consistency between indices. This study shows that the judgment obtained by weak association rules or non-association rules is more accurate and more credible than that obtained by strong association rules. When the testing grades of two indices in the weak association rules are inconsistent, the testing grades of the indices are more likely to be erroneous, and the mistakes are often caused by human factors. Clustering data mining technology was used to analyze the reliability of a diagnosis, or to perform health diagnosis directly. Analysis showed that the clustering results are related to the indices selected: the more significant the selected indices, the more significant the characteristics of the clustering results and the more credible the analysis or diagnosis. The indices and diagnosis analysis function produced by this study provide a necessary theoretical foundation and new ideas for the development of hydraulic metal structure health diagnosis technology.
Funding: Project of Sichuan Provincial Department of Education, China (No. 13Z215); the Foundation of Scientific Research of Chengdu University of Information Technology, China (No. J201405); the Project of Sichuan Provincial Department of Science and Technology, China (No. 2015JY0047); the Open Research Subject of Key Laboratory of Signal and Information Processing, China (No. szjj 2015-070).
Abstract: In communication alarm correlation analysis, the traditional association rules generation (ARG) algorithm usually has low efficiency and a high error rate. This paper proposes an alarm correlation rules generation algorithm based on the confidence covered value. The confidence covered value method can judge scientifically whether a rule is redundant. After the rules based on weighted frequent patterns (WFPs) are generated, they are pruned by the confidence covered value so as to delete redundant rules and keep the rules that carry more information. Experiments show that the alarm correlation rules generation algorithm based on the confidence covered value has higher efficiency than the traditional method and can effectively remove redundant rules. Thus it is well suited to telecommunication alarm association rules processing.
Funding: This research was financially supported in part by the Ministry of Trade, Industry and Energy (MOTIE) and Korea Institute for Advancement of Technology (KIAT) through the International Cooperative R&D program (Project No. P0016038), and in part by the MSIT (Ministry of Science and ICT), Korea, under the ITRC (Information Technology Research Center) support program (IITP-2021-2016-0-00312) supervised by the IITP (Institute for Information & communications Technology Planning & Evaluation).
Abstract: Despite advances in technological complexity and efforts, software repository maintenance requires reusing data to reduce effort and complexity. However, increasing ambiguity, irrelevance, and bugs while extracting similar data during software development generate a large amount of data from the data that reside in repositories. Thus, there is a need for a repository mining technique for relevant and bug-free data prediction. This paper proposes a fault prediction approach using a data-mining technique to find good predictors for high-quality software. To predict errors in mining data, the Apriori algorithm was used to discover association rules with confidence fixed at more than 40% and support at no less than 30%. A pruning strategy was adopted based on evaluation measures. Next, rules were extracted from three projects of different domains; the extracted rules were then combined to obtain the most popular rules based on the evaluation measure values. To evaluate the proposed approach, we conducted an experimental study comparing the proposed rules with existing ones using four different industrial projects. The evaluation showed that the results of our proposal are promising. Practitioners and developers can utilize these rules for defect prediction during early software development.
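The thresholds quoted in this abstract (support of at least 30%, confidence above 40%) can be illustrated with a minimal Apriori sketch. The transactions and item names below (`high_loc`, `high_complexity`, `defect`, `low_loc`) are hypothetical and are not taken from the paper; the code shows the generic level-wise algorithm, not the authors' implementation or their pruning strategy.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset with support >= min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)
    # Level 1 candidates: every distinct single item.
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while candidates:
        # Count the support of each candidate and keep the frequent ones.
        level = {}
        for cand in candidates:
            support = sum(1 for t in transactions if cand <= t) / n
            if support >= min_support:
                level[cand] = support
        frequent.update(level)
        # Join frequent k-itemsets into (k+1)-candidates; prune any candidate
        # that has an infrequent k-subset (downward closure).
        k += 1
        candidates = set()
        for a, b in combinations(list(level), 2):
            union = a | b
            if len(union) == k and all(
                frozenset(sub) in level for sub in combinations(union, k - 1)
            ):
                candidates.add(union)
    return frequent


def association_rules(frequent, min_confidence):
    """Return (lhs, rhs, support, confidence) for rules with confidence > min_confidence."""
    rules = []
    for itemset, support in frequent.items():
        for r in range(1, len(itemset)):
            for lhs in combinations(sorted(itemset), r):
                lhs = frozenset(lhs)
                confidence = support / frequent[lhs]
                if confidence > min_confidence:
                    rules.append((lhs, itemset - lhs, support, confidence))
    return rules


# Hypothetical module-level observations, one set of flags per module.
transactions = [
    {"high_loc", "high_complexity", "defect"},
    {"high_loc", "defect"},
    {"high_complexity", "defect"},
    {"high_loc", "high_complexity"},
    {"low_loc"},
]
frequent = apriori(transactions, min_support=0.30)
rules = association_rules(frequent, min_confidence=0.40)
for lhs, rhs, support, confidence in rules:
    print(sorted(lhs), "->", sorted(rhs), round(support, 2), round(confidence, 2))
```

On this toy data the three-item set is rejected because it appears in only one of five transactions, while every pair survives the 30% support threshold.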
Funding: Project of State Administration of Traditional Chinese Medicine (GZY-YZS-2019-45); the Horizontal Project of Hunan Medical College (HYH-2021Y-KJ-6-33); Scientific Research Project of Hunan Provincial Department of Education in 2021 (21C0223); Natural Science Foundation of Hunan Province in 2022 (1524).
Abstract: Objective: This study aimed to examine and propagate the medication experience and group formulas of traditional Chinese medicine (TCM) Master XIONG Jibo in diagnosing and treating arthralgia syndrome (AS) through data mining. Methods: Data of outpatient cases of Professor XIONG Jibo were collected from January 1, 2014 to December 31, 2018, along with cases recorded in A Real Famous Traditional Chinese Medicine Doctor: XIONG Jibo's Clinical Medical Record 1, which was published in December 2019. The five variables collected from the patients' data were TCM diagnostic information, TCM and western medicine diagnoses, syndrome, treatment, and prescription. A database was established for the collected data with Excel. In the Python environment, a customized, modified natural language processing (NLP) model for the diagnosis and treatment of AS by Professor XIONG Jibo was established to preprocess the data and to analyze the word cloud. Frequency analysis, association rule analysis, cluster analysis, and visual analysis of AS cases were performed based on the Traditional Chinese Medicine Inheritance Computing Platform (V3.0) and RStudio (V4.0.3). Results: A total of 610 medical records of Professor XIONG Jibo were collected from the case database. After applying the data screening criteria, 103 medical records were included, which comprised 187 prescription uses (45 kinds) and 1506 Chinese herb uses (125 kinds). The main related meridians were the liver, spleen, and kidney meridians. The properties of the Chinese herbs used most were mainly warm, flat, and cold, while the flavors were mainly bitter, pungent, and sweet. The main patterns of AS included damp heat, phlegm stasis, and neck arthralgia. The most commonly used herbs for AS were Chuanniuxi (Cyathulae Radix), Huangbo (Phellodendri Chinensis Cortex), Cangzhu (Atractylodis Rhizoma), Qinjiao (Gentianae Macrophyllae Radix), Gancao (Glycyrrhizae Radix et Rhizoma), Huangqi (Astragali Radix), and Chuanxiong (Chuanxiong Rhizoma). The most common effect of the herbs was "promoting blood circulation and removing blood stasis", followed by "supplementing deficiency (Qi supplementing, blood supplementing, and Yang supplementing)" and "dispelling wind and dampness". The data were analyzed with support ≥ 15% and confidence = 100%, and after de-duplication, five second-order association rules, 39 third-order association rules, 39 fourth-order association rules, and two fifth-order association rules were identified. The top-ranking association rules of each order were "Cangzhu (Atractylodis Rhizoma) → Huangbo (Phellodendri Chinensis Cortex)", "Cangzhu (Atractylodis Rhizoma) + Chuanniuxi (Cyathulae Radix) → Huangbo (Phellodendri Chinensis Cortex)", "Chuanniuxi (Cyathulae Radix) + Danggui (Angelicae Sinensis Radix) + Gancao (Glycyrrhizae Radix et Rhizoma) → Qinjiao (Gentianae Macrophyllae Radix)" and "Chuanniuxi (Cyathulae Radix) + Danggui (Angelicae Sinensis Radix) + Gancao (Glycyrrhizae Radix et Rhizoma) + Huangbo (Phellodendri Chinensis Cortex) → Qinjiao (Gentianae Macrophyllae Radix)", respectively. Five clusters were obtained using cluster analysis of the top 30 herbs. The herbs mainly served to dry dampness, supplement Qi, and promote blood circulation. The main prescriptions for AS were Ermiao San (二妙散), Gegen Jianghuang San (葛根姜黄散), and Huangqi Chongteng Yin (黄芪虫藤饮). The herbs of the core prescription included Cangzhu (Atractylodis Rhizoma), Chuanniuxi (Cyathulae Radix), Gancao (Glycyrrhizae Radix et Rhizoma), Huangbo (Phellodendri Chinensis Cortex), Mugua (Chaenomelis Fructus), Qinjiao (Gentianae Macrophyllae Radix), Danggui (Angelicae Sinensis Radix), and Yiyiren (Coicis Semen). Conclusion: Clearing heat and dampness, relieving collaterals and pain, and invigorating Qi and blood are the therapies most commonly used by Professor XIONG Jibo for the treatment of AS. Additionally, a customized NLP model could improve the efficiency of data mining in TCM.
Abstract: Trauma is the most common cause of death among young people, and many of these deaths are preventable [1]. Predicting the outcome of trauma patients has remained a difficult problem to date. In this study, prediction models are built and their ability to predict mortality accurately is assessed. The analysis includes a comparison of data mining techniques using classification, clustering and association algorithms. Data were collected by the Hellenic Trauma and Emergency Surgery Society from 30 Greek hospitals. The dataset contains records of 8544 patients suffering from severe injuries, collected from 2005 to 2006. Factors include patients' demographic elements and several other variables registered from the time and place of the accident until hospital treatment and final outcome. The results obtained are compared in terms of sensitivity, specificity, positive predictive value and negative predictive value, and the ROC curve depicts the performance of these methods.
Funding: The work is supported by the Chinese NSF (Project No. 60073034).
Abstract: Intrusion detection is regarded as a classification problem in the data mining field. However, instead of mining classification rules directly, class association rules are mined from audit logs and then used to construct a classifier. Some attributes in audit logs are important for detecting intrusion, but their values are skewed in distribution. A relative support concept is proposed to deal with this situation. To mine class association rules effectively, an algorithm based on the FP-tree is exploited. Experimental results show that this method has better performance.
Abstract: To make effective use of the large amount of graduate data that colleges and universities accumulate through teaching management work, this paper applies data mining technology to a database of higher vocational graduates. The original data are prepared with a variety of preprocessing methods, and a mining algorithm based on the commonly used Apriori association rule algorithm is put forward. An association rule mining system is then designed and implemented according to actual needs, and the mining results benefit employment guidance, teaching management decisions, and graduates.
Abstract: Maximum frequent pattern generation from a large database of transactions and items for association rule mining is an important research topic in data mining. Association rule mining aims to discover interesting correlations, frequent patterns, associations, or causal structures between items hidden in a large database. By exploiting quantum computing, we propose an efficient quantum search algorithm design to discover the maximum frequent patterns. We modified Grover's search algorithm so that a subspace of arbitrary symmetric states is used instead of the whole search space. We presented a novel quantum oracle design that employs a quantum counter to count the maximum frequent items and a quantum comparator to check against a minimum support threshold. The proposed algorithm increases the rate of correct solutions since the search is confined to a subspace. Furthermore, our algorithm scales well and reduces the number of qubits required in the design, which directly benefits performance. Our proposed design can accommodate more transactions and items and still perform well with a small number of qubits.
Abstract: The conventional complete association rule set was replaced by the least association rule set in the data warehouse association rule mining process. The least association rule set should comply with two requirements: 1) it should be the minimal and simplest association rule set; 2) its predictive power should in no way be weaker than that of the complete association rule set, so that the precision of the association rule set analysis can be guaranteed. By adopting the least association rule set, the pruning of weak rules can be carried out effectively so as to greatly reduce the number of frequent itemsets and therefore improve the mining efficiency. Finally, based on the classical Apriori algorithm, the upward closure property of weak rules is utilized to develop a corresponding efficient algorithm.
Funding: Projects (10871031, 60474070) supported by the National Natural Science Foundation of China; Project (07A001) supported by the Scientific Research Fund of Hunan Provincial Education Department, China.
Abstract: Based on rough set theory, which is a powerful tool for dealing with vagueness and uncertainty, an algorithm to mine association rules in incomplete information systems was presented, and support and confidence were redefined. The algorithm can mine association rules with decision attributes directly, without processing missing values. Using the incomplete dataset Mushroom from the UCI machine learning repository, the new algorithm was compared with the classical Apriori-based association rule mining algorithm in terms of the number of rules extracted, testing accuracy and execution time. The experimental results show that the new algorithm has the advantages of short execution time and high accuracy.
Abstract: This paper aims to develop an algorithm for extracting association rules, called the Context-Based Association Rule Mining algorithm (CARM), which can be regarded as an extension of the Context-Based Positive and Negative Association Rule Mining algorithm (CBPNARM). CBPNARM was developed to extract positive and negative association rules from spatiotemporal (space-time) data only, while the proposed algorithm can be applied to both spatial and non-spatial data. The proposed algorithm is applied to an energy dataset to classify a country's energy development by uncovering the interdependencies between the set of variables to obtain positive and negative associations. The proposed algorithm extracts many association rules related to sustainable energy development, which need to be pruned by some pruning technique. The context in this paper serves as a pruning measure to extract pertinent association rules from non-spatial data. The Conditional Probability Increment Ratio (CPIR), which was not used in CBPNARM, is also added to the proposed algorithm. The inclusion of the context variable and CPIR resulted in fewer rules and improved robustness and ease of use. Also, the extraction of a common negative frequent itemset in CARM differs from that of CBPNARM. The rules created by the proposed algorithm are more meaningful, significant, relevant and insightful. The accuracy of the proposed algorithm is compared with the Apriori, PNARM and CBPNARM algorithms. The results demonstrated enhanced accuracy, relevance and timeliness.
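For readers unfamiliar with CPIR, the form below is how the measure is commonly written in the positive and negative association rule literature. The abstract does not state the exact variant used by CARM, so this is an assumption for orientation only; the symbols $X$ (antecedent) and $Y$ (consequent) are introduced here for illustration.

```latex
\mathrm{CPIR}(Y \mid X) \;=\; \frac{P(Y \mid X) - P(Y)}{1 - P(Y)},
\qquad P(Y \mid X) \ge P(Y),\quad P(Y) \ne 1 .
```

Under this form, a value of 0 means that $X$ does not raise the probability of $Y$, and a value of 1 means that $Y$ always occurs when $X$ does.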
Abstract: Data-mining techniques have been developed to turn data into useful task-oriented knowledge. Most algorithms for mining association rules identify relationships among transactions using binary values and find rules at a single concept level. Extracting multilevel association rules in transaction databases is a common data mining task. This paper proposes a multilevel fuzzy association rule mining model for extracting implicit knowledge that is stored as quantitative values in transactions. To this end, it uses a different support value at each level as well as a different membership function for each item. By integrating fuzzy-set concepts, data-mining technologies and a multiple-level taxonomy, our method finds fuzzy association rules from transaction data sets. It adopts a top-down, progressively deepening approach to derive large itemsets and incorporates fuzzy boundaries instead of sharp boundary intervals. A comparison with previous methods in simulation shows that the proposed method maintains higher precision, that the mined rules are closer to reality, and that it can mine association rules at different levels according to the user's preference.
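How quantitative values become fuzzy supports can be sketched as follows. The abstract does not give the membership functions, so the triangular shape, its breakpoints, the `min` operator for combining items, and all quantities below are assumptions chosen for illustration; this is not the paper's model.

```python
def triangular(x, a, b, c):
    """Triangular membership: 0 at a and c, rising linearly to 1 at the peak b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def fuzzy_support(rows):
    """Average, over transactions, of the minimum membership degree in each row.

    rows: one tuple of membership degrees per transaction, one degree per
    fuzzy item in the itemset (min is a common fuzzy AND, assumed here).
    """
    return sum(min(row) for row in rows) / len(rows)


# Hypothetical purchased quantities of one item in five transactions.
milk_quantities = [1, 3, 5, 7, 9]
# Degree to which each quantity belongs to the assumed region "milk.Medium".
milk_medium = [triangular(q, 2, 5, 8) for q in milk_quantities]

# Hypothetical, precomputed degrees for a second fuzzy item "bread.High".
bread_high = [0.5, 0.2, 1.0, 0.0, 0.9]

single_support = fuzzy_support([(m,) for m in milk_medium])
pair_support = fuzzy_support(list(zip(milk_medium, bread_high)))
print(round(single_support, 3), round(pair_support, 3))
```

Because a quantity of 3 belongs to "milk.Medium" only to degree 1/3, it contributes a third of a count to the support instead of being forced into or out of a sharp interval.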
Abstract: Hotspots (active fires) indicate the spatial distribution of fires. A study on determining the influence factors for hotspot occurrence is essential so that fire events can be predicted based on the characteristics of a certain area. This study discovers the possible influence factors on the occurrence of fire events using the Apriori association rule algorithm in the study area of Rokan Hilir, Riau Province, Indonesia. The Apriori algorithm was applied to a forest fire dataset which contained data on the physical environment (land cover, river, road and city center), socio-economic factors (income source, population, and number of schools), weather (precipitation, wind speed, and screen temperature), and peatlands. The experimental results revealed 324 multidimensional association rules indicating relationships between hotspot occurrence and other factors. The association between hotspot occurrence and other geographical objects was discovered for a minimum support of 10% and a minimum confidence of 80%. The results show that strong relations between hotspot occurrence and influence factors are found at a support of about 12.42%, a confidence of 1, and a lift of 2.26. These factors are precipitation greater than or equal to 3 mm/day, wind speed in [1 m/s, 2 m/s), non-peatland area, screen temperature in [297 K, 298 K), a number of schools per 1 km² less than or equal to 0.1, and a distance from each hotspot to the nearest road less than or equal to 2.5 km.
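The three measures reported in this abstract are related by the standard definitions, sketched below. The counts in the example are hypothetical and are not the study's data; they are chosen only so that the resulting figures are of the same order as those reported. Under these definitions, a rule with confidence 1 and lift 2.26 implies that the consequent occurs in roughly 1/2.26 ≈ 44% of all records.

```python
def rule_metrics(n_total, n_antecedent, n_consequent, n_both):
    """Support, confidence and lift of a rule A -> C from raw record counts.

    n_total      : number of records
    n_antecedent : records matching A
    n_consequent : records matching C
    n_both       : records matching both A and C
    """
    support = n_both / n_total
    confidence = n_both / n_antecedent
    lift = confidence / (n_consequent / n_total)
    return support, confidence, lift


# Hypothetical counts: every record matching the antecedent is also a hotspot.
support, confidence, lift = rule_metrics(
    n_total=1000, n_antecedent=124, n_consequent=442, n_both=124
)
print(round(support, 4), round(confidence, 2), round(lift, 2))
```

A lift above 1 indicates that the antecedent conditions make the consequent more likely than its baseline rate.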
Abstract: In this paper, we propose an efficient algorithm, called FFP-Growth (short for fast FP-Growth), to mine frequent itemsets. Similar to FP-Growth, FFP-Growth searches the FP-tree in bottom-up order, but it need not construct conditional pattern bases and sub-FP-trees, thus saving a substantial amount of time and space, and the FP-tree it creates is much smaller than that created by TD-FP-Growth, hence improving efficiency. At the same time, FFP-Growth can be easily extended to reduce the search space in the manner of TD-FP-Growth (M) and TD-FP-Growth (C). Experimental results show that the proposed algorithm is effective and efficient.
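The FP-tree that FP-Growth and its variants traverse can be sketched as below. This shows only the generic tree construction with a header table; it is not FFP-Growth itself, whose traversal is not described in the abstract. The class name `Node`, the function `build_fp_tree`, and the toy transactions are hypothetical.

```python
from collections import Counter


class Node:
    """One FP-tree node: an item, its count, a parent link and child links."""

    def __init__(self, item, parent=None):
        self.item = item
        self.count = 0
        self.parent = parent
        self.children = {}


def build_fp_tree(transactions, min_count):
    """Build an FP-tree; return (root, header) where header maps item -> nodes."""
    # First pass: count items and keep those that meet the minimum count.
    counts = Counter(item for t in transactions for item in set(t))
    frequent = {item: c for item, c in counts.items() if c >= min_count}
    root = Node(None)
    header = {item: [] for item in frequent}
    # Second pass: insert each transaction with items in descending global
    # frequency (ties broken by name) so that common prefixes share nodes.
    for t in transactions:
        items = sorted(
            (i for i in set(t) if i in frequent),
            key=lambda i: (-frequent[i], i),
        )
        node = root
        for item in items:
            child = node.children.get(item)
            if child is None:
                child = Node(item, node)
                node.children[item] = child
                header[item].append(child)
            child.count += 1
            node = child
    return root, header


transactions = [
    ["a", "b"],
    ["b", "c", "d"],
    ["a", "c", "d", "e"],
    ["a", "d", "e"],
    ["a", "b", "c"],
]
root, header = build_fp_tree(transactions, min_count=2)
# The support count of an item is the sum over its nodes in the header table.
print({item: sum(n.count for n in nodes) for item, nodes in header.items()})
```

Following the parent links upward from the nodes listed in the header table is what a bottom-up search of the tree relies on.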