An association rules mining method based on semantic relativity is proposed to solve the problem that traditional association rules mining generates many candidate item sets and has high time complexity. In the method, the semantic relativity of ontology concepts is used to describe complicated domain relationships. Candidate item sets with low semantic relativity are filtered out to reduce the number of candidate item sets. In the semantic relativity computation, the ontology hierarchy is regarded as a directed acyclic graph rather than a hierarchy tree. Not only direct hierarchy relationships but also indirect hierarchy relationships and other typical semantic relationships are taken into account. Experimental results show that the proposed method effectively reduces the number of candidate item sets and improves the efficiency of association rules mining.
The traditional generalization-based knowledge discovery method is introduced, and a new multilevel spatial association rule mining method based on the cloud model is presented. The cloud model integrates the vagueness and randomness of linguistic terms in a unified way. With these models, spatial and nonspatial attribute values are generalized at multiple levels, allowing the discovery of strong spatial association rules. Combining the cloud-model-based method with the Apriori algorithm for mining association rules from a spatial database proves effective and flexible.
Association rule mining methods, an important set of data mining tools, can be used to mine spatial association rules from spatial data. However, their application is limited because the mining results contain a large number of redundant rules. In this paper, a new method named Geo-Filtered Association Rules Mining (GFARM) is proposed to effectively eliminate the redundant rules. GFARM is applied in a case study in which association rules are discovered between building land distribution and potential driving factors in Wuhan, China, from 1995 to 2015. Ten sets of regular sampling grids of different sizes are used to detect the influence of scale on GFARM. Results show that the proposed method can filter out 50%–70% of redundant rules. GFARM is also successful in discovering spatial association patterns between building land distribution and driving factors.
Typical association rules consider only items enumerated in transactions. Such rules are referred to as positive association rules. Negative association rules consider the same items but also negated items (i.e., items absent from transactions). Negative association rules are useful in market-basket analysis to identify products that conflict with each other or products that complement each other. They are also very convenient for associative classifiers, which build their classification model from association rules. However, mining such rules requires examining an exponentially large search space, and despite their usefulness, very few algorithms to mine them have been proposed to date. In this paper, an algorithm based on the FP-tree is presented to discover negative association rules.
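To make the idea of a negated item concrete, the following is a minimal sketch of how the support and confidence of a rule of the form A → ¬b could be computed by direct counting. It is not the FP-tree-based algorithm the abstract refers to; the function name `negative_rule_metrics` and the toy baskets are illustrative assumptions.

```python
from typing import Iterable, List, Set, Tuple


def negative_rule_metrics(
    transactions: List[Set[str]],
    antecedent: Iterable[str],
    absent_item: str,
) -> Tuple[float, float]:
    """Return (support, confidence) of the rule: antecedent -> NOT absent_item.

    Hypothetical helper for illustration; it scans the transactions directly
    instead of using an FP-tree.
    """
    antecedent = frozenset(antecedent)
    n = len(transactions)
    # Transactions that contain every item of the antecedent.
    with_a = [t for t in transactions if antecedent <= set(t)]
    # Of those, the ones in which the negated item is absent.
    with_a_without_b = [t for t in with_a if absent_item not in t]
    support = len(with_a_without_b) / n if n else 0.0
    confidence = len(with_a_without_b) / len(with_a) if with_a else 0.0
    return support, confidence


# Toy market baskets (made-up data).
baskets = [
    {"tea", "milk"},
    {"tea"},
    {"tea"},
    {"coffee", "milk"},
    {"coffee"},
]
tea_without_coffee = negative_rule_metrics(baskets, {"tea"}, "coffee")
```

For a single negated item, the counted support equals supp(A) minus supp(A ∪ {b}), so negative rules can in principle be derived from positive itemset supports without a second pass over the data.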
In this letter, a support function for updating the Frequent Pattern (FP) tree is introduced, and an Incremental FP (IFP) algorithm for mining association rules is then proposed. The IFP algorithm considers not only adding new data to the database but also removing old data from it. Furthermore, it simplifies five cases to three. The proposed algorithm avoids generating large numbers of candidate items and is highly efficient.
Data-mining techniques have been developed to turn data into useful task-oriented knowledge. Most algorithms for mining association rules identify relationships among transactions using binary values and find rules at a single concept level. Extracting multilevel association rules from transaction databases is a common data mining task. This paper proposes a multilevel fuzzy association rule mining model for extracting implicit knowledge that is stored as quantitative values in transactions. To this end, it uses a different support value at each level as well as a different membership function for each item. By integrating fuzzy-set concepts, data-mining technologies and a multiple-level taxonomy, the method finds fuzzy association rules in transaction data sets. It adopts a top-down, progressively deepening approach to derive large itemsets and incorporates fuzzy boundaries instead of sharp boundary intervals. A simulation comparing the method with previous ones shows that it maintains higher precision, that the mined rules are closer to reality, and that it can mine association rules at different levels according to the user's preference.
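The role of membership functions and fuzzy boundaries can be illustrated with a small sketch. It assumes a triangular membership function per item and uses the minimum membership across items as the transaction's contribution to support, which is a common convention in fuzzy association rule mining but is not confirmed as the paper's exact formulation. The names `triangular` and `fuzzy_support`, and the region bounds, are illustrative assumptions.

```python
from typing import Dict, List, Tuple


def triangular(x: float, left: float, peak: float, right: float) -> float:
    """Triangular membership: 0 at or outside [left, right], 1 at peak."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def fuzzy_support(
    transactions: List[Dict[str, float]],
    regions: Dict[str, Tuple[float, float, float]],
) -> float:
    """Fuzzy support of an itemset given one fuzzy region per item.

    `regions` maps each item to its (left, peak, right) bounds and must be
    non-empty. A transaction contributes the minimum membership over all
    items; an item missing from a transaction is treated as quantity 0.
    """
    if not transactions:
        return 0.0
    total = 0.0
    for t in transactions:
        total += min(
            triangular(t.get(item, 0.0), *bounds)
            for item, bounds in regions.items()
        )
    return total / len(transactions)


# Made-up purchase quantities; both items use a "medium" region of (0, 5, 10).
purchases = [
    {"milk": 5, "bread": 5},
    {"milk": 2.5, "bread": 5},
    {"bread": 5},
]
medium = {"milk": (0, 5, 10), "bread": (0, 5, 10)}
support_medium = fuzzy_support(purchases, medium)
```

In a multilevel setting, the same computation would be repeated at each taxonomy level with that level's own minimum support threshold, as the abstract describes.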
Data mining techniques offer great opportunities for developing ethics lines, whose main aim is to ensure improvements in and compliance with the values, conduct and commitments that make up the code of ethics. The aim of this study is to suggest a process for exploiting the data generated and collected from an ethics line by extracting association rules with the Apriori algorithm. This makes it possible to identify anomalies and behaviour patterns that require action to review, correct, promote or expand them, as appropriate.
Objective: To explore, based on data mining, the medication rules of Chinese medicine for the treatment of restless legs syndrome (RLS). Methods: CNKI, WANFANG, and VIP were taken as data sources, with "restless legs syndrome, RLS" as the key words and "Chinese medicine, Chinese materia medica, traditional Chinese medicine (TCM), traditional Chinese and Western medicine" as sub key words. Data were extracted from journals and literature related to the treatment of RLS by TCM from the establishment of each database to 2020, and data mining techniques (frequency analysis, cluster analysis, association rules) were used to analyze the core drugs and drug pair (group) rules. Results: A total of 87 prescriptions met the requirements of this study, involving 142 Chinese herbal medicines. The 5 most frequently used Chinese herbal medicines were Radix Paeoniae Alba, Radix Glycyrrhizae, Radix Angelicae Sinensis, Fructus Chaenomelis and Radix Astragali seu Hedysari. The four Qi (气) of the medicines were mainly warm and neutral, and the five flavors were mainly sweet, bitter, and pungent. The main meridians were the liver meridian, spleen meridian and heart meridian. The medication categories were mainly tonifying deficiency herbs, blood activating and removing blood stasis herbs, and eliminating wind and dampness herbs. The association rule analysis yielded 24 Chinese medicine combinations with high support, and the hierarchical cluster analysis yielded a total of 5 clusters. Conclusion: TCM treatment of RLS is based on tonifying deficiency herbs, especially to replenish Qi and blood throughout the course of the disease, supplemented by herbs for promoting blood circulation and removing blood stasis and herbs for eliminating wind and dampness, and combined with herbs for relieving superficies and herbs for calming the liver to stop the wind.
Hotspots (active fires) indicate the spatial distribution of fires. Determining the factors that influence hotspot occurrence is essential so that fire events can be predicted from the characteristics of an area. This study discovers possible influence factors on the occurrence of fire events using the Apriori association rule algorithm in the study area of Rokan Hilir, Riau Province, Indonesia. The Apriori algorithm was applied to a forest fire dataset containing data on the physical environment (land cover, river, road and city center), socio-economic conditions (income source, population, and number of schools), weather (precipitation, wind speed, and screen temperature), and peatlands. The experiment revealed 324 multidimensional association rules indicating relationships between hotspot occurrence and other factors. Associations between hotspot occurrence and other geographical objects were discovered for a minimum support of 10% and a minimum confidence of 80%. The results show strong relations between hotspot occurrence and influence factors for a support of about 12.42%, a confidence of 1, and a lift of 2.26. These factors are precipitation greater than or equal to 3 mm/day, wind speed in [1 m/s, 2 m/s), non-peatland area, screen temperature in [297 K, 298 K), number of schools per 1 km² less than or equal to 0.1, and distance from each hotspot to the nearest road less than or equal to 2.5 km.
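The support, confidence, and lift values quoted above follow the standard rule-quality measures. The sketch below shows how they are computed for a rule A → C over a list of transactions; the function names and the toy records are illustrative assumptions and do not reproduce the study's dataset.

```python
from typing import Dict, FrozenSet, Iterable, List


def support(transactions: List[FrozenSet[str]], itemset: Iterable[str]) -> float:
    """Fraction of transactions that contain every item of `itemset`."""
    itemset = frozenset(itemset)
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def rule_metrics(
    transactions: List[FrozenSet[str]],
    antecedent: Iterable[str],
    consequent: Iterable[str],
) -> Dict[str, float]:
    """Support, confidence, and lift of the rule antecedent -> consequent."""
    a = frozenset(antecedent)
    c = frozenset(consequent)
    s_rule = support(transactions, a | c)
    s_a = support(transactions, a)
    s_c = support(transactions, c)
    confidence = s_rule / s_a if s_a else 0.0
    # Lift compares the confidence with the baseline frequency of the consequent.
    lift = confidence / s_c if s_c else 0.0
    return {"support": s_rule, "confidence": confidence, "lift": lift}


# Made-up records: each set lists the discretized attributes of one location.
records = [
    frozenset({"dry", "near_road", "hotspot"}),
    frozenset({"dry", "near_road", "hotspot"}),
    frozenset({"dry", "far_road"}),
    frozenset({"wet", "near_road"}),
    frozenset({"wet", "far_road"}),
]
metrics = rule_metrics(records, {"dry", "near_road"}, {"hotspot"})
```

Under this standard definition, a rule with confidence 1 and lift 2.26 implies that the consequent alone appears in roughly 1 / 2.26 ≈ 44% of the records, assuming the study uses the same definition of lift.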
Objective: To explore the medication rules of Traditional Chinese Medicine (TCM) in the treatment of sleep disorder after stroke by using data mining technology. Methods: Electronic databases were searched by computer for clinical literature on the treatment of sleep disorders after stroke by TCM published from January 2000 to January 2021. Excel was used to establish the database, and the prescription information was described and analyzed statistically. The Apriori algorithm in IBM SPSS Modeler 18.0 was used for TCM association analysis, and IBM SPSS 22.0 was used for systematic cluster analysis of high-frequency TCM. Results: A total of 67 articles were included, covering 131 traditional Chinese medicines. The most frequently used medicines include Ziziphi Spinosae Semen (Suanzaoren), Angelicae Sinensis Radix (Danggui), Ligusticum (Chuanxiong), liquorice (Gancao), Poria cocos (Fuling), and so on. In terms of effect, deficiency-tonifying medicines, sedative medicines and blood-activating and stasis-removing medicines are commonly used. The medicinal properties are mainly cold, mild and warm. The main flavors are sweet and bitter. The medicines mostly belong to the liver, heart and spleen meridians. Thirty-three association rules were obtained for medicine pairs and medicine groups from the association analysis, and the core combinations were "Ziziphi Spinosae Semen (Suanzaoren)-Tuber fleeceflower stem (Yejiaoteng)", "Ziziphi Spinosae Semen (Suanzaoren)-Polygala (Yuanzhi)", "Ziziphi Spinosae Semen (Suanzaoren)-Cortex albiziae (Hehuanpi)" and "Angelicae Sinensis Radix (Danggui)-Radix bupleuri (Chaihu)-Radix Paeoniae Alba (Baishao)", among others. Seven medicine aggregation groups were obtained by medicine cluster analysis. Conclusion: In the treatment of sleep disorder after stroke by TCM, the main method is to calm the heart and mind. Meanwhile, according to the syndrome type, the treatment methods of tonifying the heart and spleen, nourishing the liver and kidney, soothing and softening the liver, clearing heat and resolving phlegm, and nourishing the blood and promoting blood circulation are selected, which provides a reference for clinical treatment.
Mining association rules from large databases is very costly. We develop a parallel algorithm for this task on a shared-memory multiprocessor (SMP). Most proposed parallel algorithms for association rules mining have to scan the database at least twice. In this article, a parallel algorithm, Scan Once (SO), is proposed for SMP; it scans the database only once and is fundamentally different from the known parallel algorithm Count Distribution (CD). It uses a bit matrix to store the database information and obtains the support of frequent itemsets with a Vector-And-Operation, which greatly improves the efficiency of generating all frequent itemsets. Empirical evaluation shows that the algorithm outperforms the known CD algorithm.
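The bit matrix and vector AND idea can be sketched in a few lines. The example below is a single-threaded illustration only, not the parallel SO algorithm itself: it builds one bit vector per item in a single pass over the data (here a Python integer whose bit `i` is set when transaction `i` contains the item) and obtains the support count of an itemset by AND-ing the vectors. The function names are illustrative assumptions.

```python
from typing import Dict, Iterable, List, Set


def build_bit_columns(transactions: List[Set[str]]) -> Dict[str, int]:
    """One pass over the data: map each item to a bit vector of transaction ids."""
    columns: Dict[str, int] = {}
    for row, items in enumerate(transactions):
        for item in items:
            columns[item] = columns.get(item, 0) | (1 << row)
    return columns


def support_count(columns: Dict[str, int], itemset: Iterable[str]) -> int:
    """Support count of `itemset`: AND the item vectors and count the set bits."""
    items = list(itemset)
    if not items:
        raise ValueError("itemset must contain at least one item")
    acc = columns.get(items[0], 0)
    for item in items[1:]:
        acc &= columns.get(item, 0)
    return bin(acc).count("1")


# Made-up transactions.
db = [{"a", "b"}, {"a", "c"}, {"a", "b", "c"}, {"b"}]
cols = build_bit_columns(db)
count_ab = support_count(cols, ["a", "b"])
```

Once the columns are built, the original database no longer needs to be read, which is what makes a single scan sufficient in this kind of design; in a parallel setting, different candidate itemsets could be counted by different processors over the shared columns.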
Objective: To study the prescription and medication rules of Professor Hua Baojin in the treatment of colorectal cancer through data mining, so as to provide a reference for the clinical treatment of colorectal cancer. Methods: The outpatient medical records of Professor Hua Baojin from June 2015 to October 2020 at Guang'anmen Hospital, China Academy of Chinese Medical Sciences, were collected. The TCM Inheritance Support Platform (V2.5) was used to analyze high-frequency drugs, drug properties, flavors, channel tropism, common drug combinations, core combinations and new prescriptions. Results: A total of 500 prescriptions were included, involving 222 traditional Chinese medicines and 38 high-frequency (≥100) medicines, including Atractylodes, Poria cocos, ginger, etc. The most common medicinal properties were warm, cold and mild; the flavors were sweet, bitter and pungent; and channel tropism included the spleen, stomach, liver and lung meridians. Data mining yielded 36 groups of common drug combinations and 20 association rules, from which 18 core combinations and 9 new prescriptions were derived. Conclusion: Professor Hua Baojin takes recuperating the spleen and stomach as the core in the treatment of colorectal cancer, attaching importance to regulating the rise and fall of qi, adding to and subtracting from Xiangsha Liujunzi Decoction, flexibly selecting drugs for dispelling blood stasis, resolving phlegm, detoxification and dissipating nodules, using both cold and warm drugs in the prescription, and tonifying and reducing at the same time.
Objective: To analyze the clinical research literature on TCM syndromes and treatment of Immunoglobulin A Nephropathy (IgAN) using data-mining technology, and to explore the rules of TCM syndromes and Chinese medication. Methods: Clinical literature on TCM syndromes and treatment of IgAN published in the China Biomedical Literature Database, CNKI Database, Wanfang Database, and Chongqing Weipu Database from January 2000 to September 2019 was searched. Documents were included strictly according to the inclusion and exclusion criteria, and databases were established. Statistical software was used for frequency analysis of TCM syndromes and Chinese herbal medicines, and the commonly used Chinese herbal medicines were analyzed by factor analysis and cluster analysis. Results: 292 articles were finally included, involving 28 syndromes. A total of 479 prescriptions and 254 Chinese herbs were used. The rules of syndromes in IgA nephropathy are as follows: the syndromes are mainly Qi and Yin deficiency syndrome, followed by spleen and kidney Qi deficiency syndrome, liver and kidney Yin deficiency syndrome, and spleen and kidney Yang deficiency syndrome. Blood-stasis syndrome, damp-heat syndrome and wind-heat disturbance syndrome accompany other syndromes. The traditional Chinese medicines used for IgA nephropathy are mainly deficiency-tonifying medicines, supplemented by blood-activating and stasis-eliminating medicines, heat-clearing medicines, medicines for inducing diuresis and excreting dampness, and astringent medicines, combined into formulas. Conclusion: The data-mining method systematically summarized the rules of TCM syndromes and Chinese medication in treating IgAN and provided scientific theoretical guidance for the treatment of IgAN.
Background: To summarize the concerted application and prescription rules of traditional Chinese medicine in the treatment of prediabetes. Methods: Microsoft Excel 2010 was used to summarize the categories, nature, flavour and channel tropism of drugs. The cluster analysis of high-frequency drugs was carried out with SPSS 22.0, and the association rules of high-frequency drugs were analyzed with the Apriori algorithm provided by SPSS Modeler 14.0. Results: One hundred and forty-six references were included, covering 153 prescriptions and 131 drugs. Their frequency of use is listed in the following order. The top 3 categories of drugs were "Tonifying", "Heat-Clearing", and diuresis and "Diffusing Dampness" drugs. The top 5 drugs were Huangqi (Astragali radix), Fuling (Poria), Huanglian (Coptidis rhizoma), Shanyao (Dioscoreae rhizoma), and Gegen (Puerariae lobatae radix). The top 3 channel tropisms of drugs were spleen, stomach and lung. The top 3 natures of drugs were cold, warm and calm. The top 3 flavours of drugs were sweet, bitter and pungent. The cluster analysis of high-frequency drugs showed that they could be classified into 4 categories: "Benefiting Qi" for promoting production of fluid, "Clearing Heat" and "Eliminating Dampness", "Nourishing Yin" and "Clearing Heat", and "Invigorating Spleen" for "Diffusing Dampness". The association rule analysis showed that the combination with the highest confidence and support was Poria-Chenpi (Citri reticulatae pericarpium)-Banxia (Pinelliae rhizoma)-Baizhu (Atractylodis macrocephalae rhizoma), and the combination with the highest frequency was Astragali radix-Puerariae lobatae radix. Conclusion: Prediabetes is due to deficiency. The disease location is the spleen and stomach and the pathological factor is phlegm-damp, which is why benefiting qi and invigorating the spleen is regarded as the key link of clinical treatment.
Objective: To explore, using data mining technology, the rules of traditional Chinese medicine (TCM) in the treatment of threatened abortion in the early stage of pregnancy with sub-chorionic haematoma (SCH). Methods: Literature on TCM in the treatment of threatened abortion in the early stage of pregnancy with SCH was retrieved from CNKI, VIP, WANFANG, PubMed and EMBASE. A literature information database was established and used for descriptive analysis, association rule analysis and cluster analysis of the relevant data. Results: A total of 100 articles were included, involving 114 Chinese herbs. By efficacy, the herbs were mainly tonic drugs, hemostatic drugs, heat-clearing drugs, and blood-stasis-dissolving hemostatic drugs. The medicinal properties were mostly mild and warm, and the flavors were mainly sweet, bitter and pungent. The liver meridian, spleen meridian and kidney meridian were the most frequent. The commonly used drug pair was "Xu duan (Radix dipsaci, 续断)-Tusizi (Semen Cuscutae, 菟丝子)", and the core combination was "Tusizi-Xu duan-Ejiao (Donkeyhide gelatin, 阿胶)-Baizhu (Atractylodes macrocephala, 白术)-Dangshen (Codonopsis pilosula, 党参)". Commonly used drugs for removing blood stasis and hemostasis were Sanqi (Panax notoginseng, 三七), Puhuang (cattail pollen, 蒲黄), and Qiancao (Radix Rubiae, 茜草). Conclusion: Data mining identified the drug efficacies, flavors, meridians, drug pairs, core combination, and stasis-removing hemostatic drugs commonly used clinically in TCM treatment of threatened abortion in the early stage of pregnancy with SCH, which has important reference significance for the treatment of this condition.
BACKGROUND It is increasingly common to find patients affected by a combination of type 2 diabetes mellitus (T2DM) and coronary artery disease (CAD), and studies are able to correlate their relationships with available biological and clinical evidence. The aim of the current study was to apply association rule mining (ARM) to discover whether there are consistent patterns of clinical features relevant to these diseases. ARM leverages clinical and laboratory data to find meaningful patterns for diabetic CAD, harnessing data-driven algorithms to optimise decision-making in patient care. AIM To reinforce the evidence of the T2DM-CAD interplay and demonstrate the ability of ARM to provide new insights into multivariate pattern discovery. METHODS This cross-sectional study was conducted at the Department of Biochemistry in a specialized tertiary care centre in Delhi, involving a total of 300 consenting subjects categorized into three groups: CAD with diabetes, CAD without diabetes, and healthy controls, with 100 subjects in each group. The participants were enrolled from the Cardiology IPD and OPD for sample collection. The study employed the ARM technique to extract meaningful patterns and relationships from the clinical data with its original values. RESULTS The clinical dataset comprised 35 attributes from enrolled subjects. The analysis produced rules with a maximum branching factor of 4 and a rule length of 5, requiring a 1% probability increase for enhancement. Prominent patterns emerged, highlighting strong links between health indicators and diabetes likelihood, particularly elevated HbA1C and random blood sugar levels. The ARM technique identified that individuals with a random blood sugar level > 175 and HbA1C > 6.6 are likely to be in the "CAD-with-diabetes" group, offering valuable insights into health indicators and factors influencing disease outcomes. CONCLUSION The application of this method holds promise for healthcare practitioners, offering valuable insights for enhancing patient treatment that targets specific subtypes of CAD with diabetes. By applying artificial intelligence techniques to medical data, we have shown the potential for personalized healthcare and for the development of user-friendly applications aimed at improving cardiovascular health outcomes for this high-risk population and optimising decision-making in patient care.
The Apriori algorithm is a classical method of association rules mining. Based on an analysis of the algorithm, the paper presents an improved Apriori algorithm. The proposed algorithm combines a hash table technique with reduction of candidate item sets to improve the efficiency of resource usage as well as the individualized service of the data library.
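For reference, the classical level-wise Apriori procedure that such improvements start from can be sketched as follows. This is the textbook algorithm with the usual join and subset-pruning steps, using ordinary Python dictionaries (which are hash tables) for counting; it is not the paper's improved variant, whose details are not given in the abstract. The function name `apriori` and the toy data are illustrative assumptions.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Iterable, List


def apriori(
    transactions: Iterable[Iterable[str]], min_count: int
) -> Dict[FrozenSet[str], int]:
    """Return every itemset whose support count is at least `min_count`."""
    data: List[FrozenSet[str]] = [frozenset(t) for t in transactions]

    # Level 1: count single items.
    counts: Dict[FrozenSet[str], int] = {}
    for t in data:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_count}
    result = dict(frequent)

    k = 2
    while frequent:
        prev = set(frequent)
        candidates = set()
        # Join step: merge pairs of frequent (k-1)-itemsets into k-itemsets.
        for a in prev:
            for b in prev:
                union = a | b
                if len(union) != k:
                    continue
                # Prune step: every (k-1)-subset must itself be frequent.
                if all(frozenset(sub) in prev for sub in combinations(union, k - 1)):
                    candidates.add(union)
        # Count the surviving candidates against the data.
        counts = {c: 0 for c in candidates}
        for t in data:
            for c in candidates:
                if c <= t:
                    counts[c] += 1
        frequent = {s: n for s, n in counts.items() if n >= min_count}
        result.update(frequent)
        k += 1
    return result


# Made-up transactions; itemsets must appear in at least 3 of them.
toy = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
frequent_itemsets = apriori(toy, min_count=3)
```

The prune step is where candidate reduction happens in the classical algorithm: a k-itemset is never counted if any of its (k-1)-subsets was already found infrequent.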
In data mining from transaction databases, the relationships between attributes have been the focus, while the relationships between tuples have not been taken into account. In a spatial database, there are relationships between both attributes and tuples, and most associations occur between tuples, such as adjacency, intersection, overlap and other topological relationships. The tasks of spatial association rules mining therefore include mining the relationships between attributes of spatial objects, called vertical-direction data mining, and the relationships between tuples, called horizontal-direction data mining. This paper analyzes the storage models of spatial data, draws on data mining technologies for transaction databases, defines the spatial association rule (including the vertical-direction, horizontal-direction and two-direction association rules), discusses the measurement of spatial association rule interestingness, and puts forward workflows for spatial association rule mining. For two-direction spatial association rules mining, an algorithm is proposed to obtain non-spatial itemsets. By means of spatial analysis, the spatial relations are converted into non-spatial associations and the non-spatial itemsets are obtained. Based on the non-spatial itemsets, the Apriori algorithm or other algorithms can be used to obtain the frequent itemsets, from which the spatial association rules are derived. Spatial association rules were mined from a spatial database to validate the algorithm, and the test results show that the algorithm is efficient and can mine interesting spatial rules.
Funding (semantic relativity-based association rules mining method): The National Natural Science Foundation of China (No. 50674086); the Specialized Research Fund for the Doctoral Program of Higher Education (No. 20060290508); the Science and Technology Fund of China University of Mining and Technology (No. 2007B016).
Funding (GFARM study): Under the auspices of the Special Fund of the Ministry of Land and Resources of China in Public Interest (No. 201511001).
Funding (FP-tree-based negative association rules study): Supported by the National Natural Science Foundation of China (70371015) and the Science Foundation of Jiangsu University (04KJD001).
Funding (Incremental FP algorithm study): Supported in part by the National Natural Science Foundation of China (No. 60073012) and the Natural Science Foundation of Jiangsu (BK2001004).
文摘In this letter, on the basis of Frequent Pattern(FP) tree, the support function to update FP-tree is introduced, then an Incremental FP (IFP) algorithm for mining association rules is proposed. IFP algorithm considers not only adding new data into the database but also reducing old data from the database. Furthermore, it can predigest five cases to three cases.The algorithm proposed in this letter can avoid generating lots of candidate items, and it is high efficient.
文摘Data-mining techniques have been developed to turn data into useful task-oriented knowledge. Most algorithms for mining association rules identify relationships among transactions using binary values and find rules at a single-concept level. Extracting multilevel association rules in transaction databases is most commonly used in data mining. This paper proposes a multilevel fuzzy association rule mining model for extraction of implicit knowledge which stored as quantitative values in transactions. For this reason it uses different support value at each level as well as different membership function for each item. By integrating fuzzy-set concepts, data-mining technologies and multiple-level taxonomy, our method finds fuzzy association rules from transaction data sets. This approach adopts a top-down progressively deepening approach to derive large itemsets and also incorporates fuzzy boundaries instead of sharp boundary intervals. Comparing our method with previous ones in simulation shows that the proposed method maintains higher precision, the mined rules are closer to reality, and it gives ability to mine association rules at different levels based on the user’s tendency as well.
Abstract: Data mining techniques offer great opportunities for developing ethics lines, whose main aim is to ensure improvements in and compliance with the values, conduct and commitments that make up the code of ethics. The aim of this study is to suggest a process for exploiting the data generated and collected from an ethics line by extracting association rules with the Apriori algorithm. This makes it possible to identify anomalies and behaviour patterns requiring action to review, correct, promote or expand them, as appropriate.
Abstract: Objective: To explore, based on data mining, the medication rules of Chinese medicine for the treatment of restless legs syndrome (RLS). Methods: CNKI, WANFANG, and VIP were taken as data sources, with "restless legs syndrome, RLS" as the key words and "Chinese medicine, Chinese materia medica, traditional Chinese medicine (TCM), traditional Chinese and Western medicine" as sub key words. Data were extracted from the journals and literature related to the treatment of RLS by TCM from the establishment of each database to 2020, and data mining techniques (frequency analysis, cluster analysis, association rules) were used to analyze the core drugs and drug pair (group) rules. Results: A total of 87 prescriptions met the requirements of this study, involving 142 Chinese herbal medicines. The top 5 Chinese herbal medicines by frequency of use were Radix Paeoniae Alba, Radix Glycyrrhizae, Radix Angelicae Sinensis, Fructus Chaenomelis and Radix Astragali seu Hedysari. The four Qi (气) of the medicines were mainly warm and neutral, and the five flavors were mainly sweet, bitter, and pungent. The main meridians were the liver meridian, spleen meridian and heart meridian. The medication categories were mainly tonifying deficiency herbs, blood activating and removing blood stasis herbs, and eliminating wind and dampness herbs. The association rule analysis yielded 24 Chinese medicine combinations with high support, and the hierarchical cluster analysis yielded a total of 5 clusters. Conclusion: TCM treatment of RLS is based on tonifying deficiency herbs, especially to replenish Qi and blood throughout the course of the disease, supplemented by herbs for promoting blood circulation and removing blood stasis and herbs for eliminating wind and dampness, and combined with herbs for relieving superficies and herbs for calming the liver to stop the wind.
Abstract: Hotspots (active fires) indicate the spatial distribution of fires. Determining the factors that influence hotspot occurrence is essential so that fire events can be predicted from the characteristics of a given area. This study discovers possible influence factors on the occurrence of fire events using the Apriori association rule algorithm in the study area of Rokan Hilir, Riau Province, Indonesia. The Apriori algorithm was applied to a forest fire dataset that contained data on physical environment (land cover, river, road and city center), socio-economic factors (income source, population, and number of schools), weather (precipitation, wind speed, and screen temperature), and peatlands. The experiment revealed 324 multidimensional association rules indicating relationships between hotspot occurrence and other factors. The associations between hotspot occurrence and other geographical objects were discovered for a minimum support of 10% and a minimum confidence of 80%. The results show that strong relations between hotspot occurrence and influence factors are found for a support of about 12.42%, a confidence of 1, and a lift of 2.26. These factors are precipitation greater than or equal to 3 mm/day, wind speed in [1 m/s, 2 m/s), non-peatland area, screen temperature in [297 K, 298 K), number of schools per km² less than or equal to 0.1, and distance from each hotspot to the nearest road less than or equal to 2.5 km.
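The support, confidence and lift figures quoted in the abstract above follow the standard rule-quality definitions. A minimal sketch of how they are computed is given below; the function `rule_metrics` and the toy records are hypothetical and are not the study's dataset.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent.

    support    = P(antecedent and consequent)
    confidence = P(consequent | antecedent)
    lift       = confidence / P(consequent)
    """
    antecedent, consequent = set(antecedent), set(consequent)
    n = len(transactions)
    n_a = sum(1 for t in transactions if antecedent <= set(t))
    n_c = sum(1 for t in transactions if consequent <= set(t))
    n_both = sum(1 for t in transactions if (antecedent | consequent) <= set(t))
    support = n_both / n
    confidence = n_both / n_a if n_a else 0.0
    lift = confidence / (n_c / n) if n_c else 0.0
    return support, confidence, lift


# Hypothetical toy records (labels invented for illustration only).
records = [
    {"dry", "hotspot"},
    {"dry", "hotspot"},
    {"wet", "hotspot"},
    {"wet"}, {"wet"}, {"wet"}, {"wet"}, {"wet"},
]
supp, conf, lift = rule_metrics(records, {"dry"}, {"hotspot"})
```

Under these definitions, a confidence of 1 combined with a lift of 2.26 would imply that the consequent holds in roughly 1/2.26, or about 44%, of all records, assuming the study uses the same standard definition of lift.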
Funding: Beijing Science and Technology Program (No. Z191100006619065); National Key R&D Program (No. 2017YFC1700101).
Abstract: Objective: To explore the medication rules of Traditional Chinese Medicine (TCM) in the treatment of sleep disorder after stroke using data mining technology. Methods: Electronic databases were searched by computer for clinical literature on the treatment of sleep disorders after stroke by TCM from January 2000 to January 2021. Excel was used to establish the database, and the prescription information was described and analyzed statistically. The Apriori algorithm in IBM SPSS Modeler 18.0 was used for TCM association analysis, and IBM SPSS 22.0 was used for systematic cluster analysis of high-frequency TCM. Results: A total of 67 articles were included, covering 131 traditional Chinese medicines. The medicines with a higher frequency of use include Ziziphi Spinosae Semen (Suanzaoren), Angelicae Sinensis Radix (Danggui), Ligusticum (Chuanxiong), liquorice (Gancao), Poria cocos (Fuling), and so on. In terms of effect, deficiency-tonifying medicines, sedative medicines and blood-activating and stasis-removing medicines are commonly used. The medicinal properties are mainly cold, mild and warm. The main flavors are sweet and bitter. The medicines mostly belong to the liver, heart and spleen meridians. Thirty-three association rules were obtained for medicine pairs and medicine groups from the association analysis, and the core combinations included "Ziziphi Spinosae Semen (Suanzaoren)-Tuber fleeceflower stem (Yejiaoteng)", "Ziziphi Spinosae Semen (Suanzaoren)-Polygala (Yuanzhi)", "Ziziphi Spinosae Semen (Suanzaoren)-Cortex albiziae (Hehuanpi)" and "Angelicae Sinensis Radix (Danggui)-Radix bupleuri (Chaihu)-Radix Paeoniae Alba (Baishao)". Seven medicine clusters were obtained by cluster analysis. Conclusion: In the treatment of sleep disorder after stroke by TCM, the main method is to calm the heart and mind. Meanwhile, according to the syndrome type, the treatment methods of tonifying the heart and spleen, nourishing the liver and kidney, soothing and softening the liver, clearing heat and resolving phlegm, and nourishing the blood and promoting blood circulation are selected, which provides a reference for clinical treatment.
Abstract: Mining association rules from a large database is very costly. We develop a parallel algorithm for this task on a shared-memory multiprocessor (SMP). Most proposed parallel algorithms for association rule mining have to scan the database at least twice. In this article, a parallel algorithm, Scan Once (SO), is proposed for SMP that scans the database only once. This algorithm is fundamentally different from the known parallel algorithm Count Distribution (CD). It uses a bit matrix to store the database information and obtains the support of the frequent itemsets with a vector AND operation, which greatly improves the efficiency of generating all frequent itemsets. Empirical evaluation shows that the algorithm outperforms the CD algorithm.
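The bit-matrix idea in the abstract above can be illustrated with a small sequential sketch. It assumes each item is stored as one bit vector (here a Python integer) with bit t set when transaction t contains the item, so that the support count of an itemset is the number of bits left after ANDing its item vectors. The function names and toy data are hypothetical, and the sketch omits the parallel SMP scheduling that the SO algorithm is about.

```python
def build_bit_vectors(transactions):
    """Single scan of the data: map each item to an int whose bit t is set
    when transaction t contains that item."""
    vectors = {}
    for t, items in enumerate(transactions):
        for item in items:
            vectors[item] = vectors.get(item, 0) | (1 << t)
    return vectors


def support_count(itemset, vectors, n_transactions):
    """AND the item vectors together and count the surviving bits."""
    acc = (1 << n_transactions) - 1  # all transactions selected initially
    for item in itemset:
        acc &= vectors.get(item, 0)  # unknown items match no transaction
    return bin(acc).count("1")


# Hypothetical toy transactions.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "milk"},
    {"bread", "butter"},
]
vectors = build_bit_vectors(transactions)
n = len(transactions)
bread_milk = support_count({"bread", "milk"}, vectors, n)
```

Once the vectors are built, every later support query touches only the in-memory vectors rather than the database, which is the property that allows a single scan.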
Funding: National Natural Science Foundation of China (No. 81673961, 81774294); Beijing Natural Science Foundation (No. 7172186).
Abstract: Objective: To study the prescription and medication rules of Professor Hua Baojin in the treatment of colorectal cancer through data mining, so as to provide a reference for clinical treatment of colorectal cancer. Methods: The outpatient medical records of Professor Hua Baojin from June 2015 to October 2020 at Guang'anmen Hospital, China Academy of Chinese Medical Sciences were collected. The TCM Inheritance Support Platform (V2.5) was used to analyze high-frequency drugs, drug properties, flavors, channel tropism, common drug combinations, core combinations and new prescriptions. Results: A total of 500 prescriptions were included, involving 222 traditional Chinese medicines and 38 high-frequency (≥100) medicines, including Atractylodes, Poria cocos, ginger, etc. The most common medicinal properties were warm, cold and mild; the most common flavors were sweet, bitter and pungent; and channel tropism included the spleen, stomach, liver and lung meridians. Thirty-six groups of common drug combinations and 20 association rules were obtained by data mining, and 18 core combinations and 9 new prescriptions were derived. Conclusion: Professor Hua Baojin takes recuperating the spleen and stomach as the core in the treatment of colorectal cancer, attaching importance to regulating the rise and fall of qi, adding and subtracting on the basis of Xiangsha Liujunzi Decoction, flexibly selecting drugs for dispelling blood stasis, resolving phlegm, detoxification and loosening knots, and using both cold and warm drugs in the prescription, tonifying and reducing at the same time.
Funding: National Natural Science Foundation of China (No. 81760807); Guangxi University of Traditional Chinese Medicine 2019 Graduate Education Innovation Program Project (No. YCSY20190066).
Abstract: Objective: To analyze the clinical research literature on TCM syndromes and treatment of Immunoglobulin A Nephropathy (IgAN) using data-mining technology, and to explore the rules of TCM syndromes and Chinese medication. Methods: Clinical literature on TCM syndromes and treatment of IgAN published from January 2000 to September 2019 was searched in the China Biomedical Literature Database, CNKI Database, Wanfang Database, and Chongqing Weipu Database. Documents were included strictly according to the inclusion and exclusion criteria, and a database was established. Statistical software was applied for frequency analysis of TCM syndromes and Chinese herbal medicines, and the commonly used Chinese herbal medicines were analyzed by factor analysis and cluster analysis. Results: A total of 292 articles were finally included, involving 28 syndromes. A total of 479 prescriptions and 254 Chinese herbs were used. The rules of syndromes in IgA nephropathy are as follows: the syndromes are mainly Qi and Yin deficiency syndrome, followed by spleen and kidney Qi deficiency syndrome, liver and kidney Yin deficiency syndrome, and spleen and kidney Yang deficiency syndrome. Blood-stasis syndrome, damp-heat syndrome and wind-heat disturbance syndrome accompany other syndromes. The traditional Chinese medicines used for IgA nephropathy are mainly deficiency-tonifying medicines, supplemented by blood-activating and stasis-eliminating medicines, heat-clearing medicines, medicines for inducing diuresis and excreting dampness, and astringent medicines, combined into formulas. Conclusion: The data-mining method systematically summarized the rules of TCM syndromes and Chinese medication in treating IgAN and provided scientific theoretical guidance for the treatment of IgAN.
Abstract: Background: To summarize the concerted application and prescription rules of traditional Chinese medicine in the treatment of prediabetes. Methods: Microsoft Excel 2010 was used to summarize the categories, nature, flavour and channel tropism of drugs. The cluster analysis of high-frequency drugs was carried out with SPSS 22.0, and the association rules of high-frequency drugs were analyzed with the Apriori algorithm provided by SPSS Modeler 14.0. Results: One hundred and forty-six references were included, covering 153 prescriptions and 131 drugs. By frequency of use, the top 3 categories of drugs were "Tonifying", "Heat-Clearing", and diuresis and "Diffusing Dampness" drugs. The top 5 drugs were Huangqi (Astragali radix), Fuling (Poria), Huanglian (Coptidis rhizoma), Shanyao (Dioscoreae rhizoma), and Gegen (Puerariae lobatae radix). The top 3 channel tropisms of drugs were spleen, stomach and lung. The top 3 natures of drugs were cold, warm and calm. The top 3 flavours of drugs were sweet, bitter and pungent. The cluster analysis of high-frequency drugs showed that they could be classified into 4 categories: "Benefiting Qi" for promoting production of fluid, "Clearing Heat" and "Eliminating Dampness", "Nourishing Yin" and "Clearing Heat", and "Invigorating Spleen" for "Diffusing Dampness". The association rule analysis showed that the combination with the highest confidence and support was Poria-Chenpi (Citri reticulatae pericarpium)-Banxia (Pinelliae rhizoma)-Baizhu (Atractylodis macrocephalae rhizoma), and the combination with the highest frequency was Astragali radix-Puerariae lobatae radix. Conclusion: Prediabetes is due to deficiency. The disease location is the spleen and stomach, and the pathological factor is phlegm-damp, which is why benefiting qi and invigorating the spleen is regarded as the key link of clinical treatment.
Funding: Clinical observation and metabolomics study of patients with Phlegm-stasis interjunction polycystic ovary syndrome, Guangdong Bureau of Traditional Chinese Medicine (20202066); Shenzhen Baoan District Science and Technology Plan (20200505115910988); Observation on the efficacy of Jiaxiao Dingjing Decoction combined with clomiphene in the treatment of polycystic ovary syndrome (2020JD526).
Abstract: Objective: To use data mining technology to explore the rules of traditional Chinese medicine (TCM) in the treatment of threatened abortion in the early stage of pregnancy with sub-chorionic haematoma (SCH). Methods: Literature on TCM in the treatment of threatened abortion in the early stage of pregnancy with SCH was retrieved from CNKI, VIP, WANFANG, PubMed and EMBASE. A literature information database was established for descriptive analysis, association rule analysis and cluster analysis of the relevant data. Results: A total of 100 articles were included, involving 114 Chinese herbs. By efficacy, the herbs were mainly tonic drugs, hemostatic drugs, heat-clearing drugs, and blood-stasis-dissolving and hemostatic drugs. The medicinal properties were mostly mild and warm, and the flavors were mainly sweet, bitter and pungent. The liver meridian, spleen meridian and kidney meridian were the most frequent. The commonly used drug pair was "Xu duan (Radix dipsaci, 续断)-Tusizi (Semen Cuscutae, 菟丝子)", and the core combination was "Tusizi-Xu duan-Ejiao (Donkeyhide gelatin, 阿胶)-Baizhu (Atractylodes macrocephala, 白术)-Dangshen (Codonopsis pilosula, 党参)". Commonly used drugs for removing blood stasis and hemostasis were Sanqi (Panax notoginseng, 三七), Puhuang (cattail pollen, 蒲黄), and Qiancao (Radix Rubiae, 茜草). Conclusion: The data-mined findings on the commonly used drug efficacy, flavor, meridian, drug pairs, core combination and blood-stasis-removing hemostatic drugs in TCM treatment of threatened abortion in the early stage of pregnancy with SCH have important reference significance for the treatment of this condition.
Abstract: BACKGROUND It is increasingly common to find patients affected by a combination of type 2 diabetes mellitus (T2DM) and coronary artery disease (CAD), and studies are able to correlate their relationships with available biological and clinical evidence. The aim of the current study was to apply association rule mining (ARM) to discover whether there are consistent patterns of clinical features relevant to these diseases. ARM leverages clinical and laboratory data to find meaningful patterns for diabetic CAD, harnessing data-driven algorithms to optimise decision-making in patient care. AIM To reinforce the evidence of the T2DM-CAD interplay and demonstrate the ability of ARM to provide new insights into multivariate pattern discovery. METHODS This cross-sectional study was conducted at the Department of Biochemistry in a specialized tertiary care centre in Delhi, involving a total of 300 consented subjects categorized into three groups: CAD with diabetes, CAD without diabetes, and healthy controls, with 100 subjects in each group. The participants were enrolled from the Cardiology IPD & OPD for sample collection. The study employed the ARM technique to extract meaningful patterns and relationships from the clinical data with its original values. RESULTS The clinical dataset comprised 35 attributes from enrolled subjects. The analysis produced rules with a maximum branching factor of 4 and a rule length of 5, necessitating a 1% probability increase for enhancement. Prominent patterns emerged, highlighting strong links between health indicators and diabetes likelihood, particularly elevated HbA1C and random blood sugar levels. The ARM technique identified that individuals with a random blood sugar level > 175 and HbA1C > 6.6 are likely in the "CAD-with-diabetes" group, offering valuable insights into health indicators and factors influencing disease outcomes. CONCLUSION The application of this method holds promise for healthcare practitioners, offering valuable insights for enhancing patient treatment targeting specific subtypes of CAD with diabetes. By applying artificial intelligence techniques to medical data, we have shown the potential for personalized healthcare and for the development of user-friendly applications aimed at improving cardiovascular health outcomes for this high-risk population and optimising decision-making in patient care.
Abstract: The Apriori algorithm is a classical method of association rule mining. Based on an analysis of this algorithm, the paper presents an improved Apriori algorithm. The proposed algorithm combines a hash-table technique with reduction of candidate item sets to improve the efficiency of resource usage as well as the individualized service of the data library.
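For readers unfamiliar with the baseline, the sketch below shows a plain Apriori loop in which candidate counts are kept in a Python dict (a hash table) and candidates are discarded when any of their (k-1)-subsets is infrequent. It is a generic illustration under those assumptions, not the improved algorithm described in the abstract above, whose details are not given here; the function name `apriori` and the toy data are hypothetical.

```python
from itertools import combinations


def apriori(transactions, min_count):
    """Return {frozenset(itemset): support count} for all frequent itemsets."""
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count single items in a hash table.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_count}
    result = dict(frequent)

    k = 2
    while frequent:
        prev = set(frequent)
        # Join step plus pruning: keep a k-candidate only if every
        # (k-1)-subset is itself frequent (the Apriori property).
        candidates = set()
        for a in prev:
            for b in prev:
                union = a | b
                if len(union) == k and all(
                    frozenset(sub) in prev for sub in combinations(union, k - 1)
                ):
                    candidates.add(union)
        # Count the surviving candidates against the transactions.
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1
        frequent = {s: c for s, c in counts.items() if c >= min_count}
        result.update(frequent)
        k += 1
    return result


# Hypothetical toy transactions.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "milk"},
    {"bread", "butter"},
]
frequent_itemsets = apriori(transactions, min_count=2)
```

The pruning step is where candidate reduction pays off: every candidate removed before counting saves one subset test per transaction.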
Funding: Supported by the Natural Science Foundation of Chongqing (No. CSTC 2005BB2065).
Abstract: Data mining from transaction databases has focused on the relationships between attributes, while the relationships between tuples have not been taken into account. In a spatial database, there are relationships both between attributes and between tuples, and most associations occur between tuples, such as adjacency, intersection, overlap and other topological relationships. The tasks of spatial association rule mining therefore include mining the relationships between attributes of spatial objects, called vertical-direction data mining, and the relationships between tuples, called horizontal-direction data mining. This paper analyzes the storage models of spatial data, draws on data mining technologies for transaction databases, defines the spatial association rule (including the vertical-direction, horizontal-direction and two-direction association rules), discusses the measurement of spatial association rule interestingness, and puts forward a workflow for spatial association rule mining. For two-direction spatial association rule mining, an algorithm is proposed to obtain non-spatial itemsets. By means of spatial analysis, the spatial relations are transformed into non-spatial associations and the non-spatial itemsets are obtained. Based on the non-spatial itemsets, the Apriori algorithm or other algorithms can be used to obtain the frequent itemsets, from which the spatial association rules are derived. The algorithm was validated by mining spatial association rules from a spatial database, and the test results show that it is efficient and can mine interesting spatial rules.