Typical association rules consider only the items enumerated in transactions; such rules are referred to as positive association rules. Negative association rules consider the same items but additionally consider negated items (i.e., items absent from transactions). Negative association rules are useful in market-basket analysis for identifying products that conflict with or complement each other. They are also convenient for associative classifiers, which build their classification model from association rules. However, mining such rules requires examining an exponentially large search space, and despite their usefulness, very few algorithms for mining them have been proposed to date. This paper presents an FP-tree-based algorithm for discovering negative association rules.
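To make the notion of a negated item concrete, the following is a minimal brute-force sketch, not the FP-tree algorithm of the paper. The function names, the toy transactions, and the convention that a rule "X -> NOT Y" is scored by counting transactions that contain X and lack every item of Y are all assumptions made for illustration.

```python
def support(transactions, present=(), absent=()):
    """Fraction of transactions containing every item in `present`
    and none of the items in `absent` (the negated items)."""
    present, absent = set(present), set(absent)
    hits = sum(1 for t in transactions if present <= t and not (absent & t))
    return hits / len(transactions)


def negative_confidence(transactions, lhs_present, rhs_absent):
    """Confidence of the negative rule lhs_present -> NOT rhs_absent."""
    denom = support(transactions, present=lhs_present)
    if denom == 0:
        return 0.0
    return support(transactions, present=lhs_present, absent=rhs_absent) / denom


# Hypothetical market-basket data: coffee and tea never co-occur.
transactions = [
    {"coffee", "milk"},
    {"coffee", "sugar"},
    {"tea", "milk"},
    {"coffee"},
]

print(support(transactions, present={"coffee"}, absent={"tea"}))  # 0.75
print(negative_confidence(transactions, {"coffee"}, {"tea"}))     # 1.0
```

Scanning every transaction for every candidate is exactly what becomes infeasible when negated items enlarge the search space, which is why a compact structure such as an FP-tree is attractive.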
The traditional generalization-based knowledge discovery method is introduced, and a new multilevel spatial association rule mining method based on the cloud model is presented. The cloud model integrates the vagueness and randomness of linguistic terms in a unified way. With these models, spatial and nonspatial attribute values are generalized at multiple levels, allowing strong spatial association rules to be discovered. Combining the cloud-model-based method with the Apriori algorithm for mining association rules from a spatial database proves effective and flexible.
Data-mining techniques have been developed to turn data into useful task-oriented knowledge. Most algorithms for mining association rules identify relationships among transactions using binary values and find rules at a single concept level. Extracting multilevel association rules from transaction databases is a common data mining task. This paper proposes a multilevel fuzzy association rule mining model for extracting implicit knowledge that is stored as quantitative values in transactions. To this end, it uses a different support value at each level as well as a different membership function for each item. By integrating fuzzy-set concepts, data-mining technologies and a multiple-level taxonomy, the method finds fuzzy association rules in transaction data sets. It adopts a top-down, progressively deepening approach to derive large itemsets and incorporates fuzzy boundaries instead of sharp boundary intervals. A comparison with previous methods in simulation shows that the proposed method maintains higher precision, that the mined rules are closer to reality, and that it can mine association rules at different levels according to the user's preference.
Hotspots (active fires) indicate the spatial distribution of fires. Determining the factors that influence hotspot occurrence is essential so that fire events can be predicted from the characteristics of a given area. This study identifies possible influence factors for fire events using the Apriori association rule algorithm in the study area of Rokan Hilir, Riau Province, Indonesia. The Apriori algorithm was applied to a forest fire dataset containing data on the physical environment (land cover, rivers, roads and city centers), socio-economic conditions (income source, population, and number of schools), weather (precipitation, wind speed, and screen temperature), and peatlands. The experiments revealed 324 multidimensional association rules indicating relationships between hotspot occurrence and other factors. The associations between hotspot occurrence and other geographical objects were discovered with a minimum support of 10% and a minimum confidence of 80%. The results show strong relations between hotspot occurrence and influence factors at a support of about 12.42%, a confidence of 1, and a lift of 2.26. These factors are precipitation greater than or equal to 3 mm/day, wind speed in [1 m/s, 2 m/s), non-peatland area, screen temperature in [297 K, 298 K), a number of schools per 1 km² less than or equal to 0.1, and a distance from each hotspot to the nearest road less than or equal to 2.5 km.
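The three measures quoted above (support, confidence, lift) can be computed as in the following sketch. It assumes the standard definitions: support is the share of records matching the whole rule, confidence is that count divided by the count of records matching the antecedent, and lift is confidence divided by the support of the consequent. The function name and the eight toy records are hypothetical and are not taken from the study's dataset.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence, lift) of antecedent -> consequent."""
    antecedent, consequent = frozenset(antecedent), frozenset(consequent)
    n = len(transactions)
    n_a = sum(1 for t in transactions if antecedent <= t)
    n_c = sum(1 for t in transactions if consequent <= t)
    n_ac = sum(1 for t in transactions if (antecedent | consequent) <= t)
    support = n_ac / n
    confidence = n_ac / n_a if n_a else 0.0
    lift = confidence / (n_c / n) if n_c else 0.0
    return support, confidence, lift


# Hypothetical records; each set lists the attribute values observed at a location.
records = [
    {"non_peatland", "near_road", "hotspot"},
    {"non_peatland", "near_road", "hotspot"},
    {"non_peatland", "hotspot"},
    {"near_road", "hotspot"},
    {"non_peatland"},
    {"near_road"},
    {"peatland"},
    {"peatland"},
]

print(rule_metrics(records, {"non_peatland", "near_road"}, {"hotspot"}))
# (0.25, 1.0, 2.0)
```

Under the same definitions, a rule with confidence 1 and lift 2.26 implies a consequent support of 1/2.26, roughly 0.44.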
In data mining on transaction databases, attention has focused on relationships between attributes, while relationships between tuples have not been taken into account. In a spatial database, relationships exist among both attributes and tuples, and most associations occur between tuples, such as adjacency, intersection, overlap and other topological relationships. The tasks of spatial association rule mining therefore include mining the relationships between attributes of spatial objects, called vertical-direction data mining, and the relationships between tuples, called horizontal-direction data mining. This paper analyzes the storage models of spatial data, draws on data mining techniques for transaction databases, defines spatial association rules (vertical-direction, horizontal-direction and two-direction association rules), discusses how to measure the interestingness of spatial association rules, and puts forward a workflow for spatial association rule mining. For two-direction spatial association rule mining, an algorithm is proposed to obtain non-spatial itemsets: by means of spatial analysis, spatial relations are transformed into non-spatial associations, yielding non-spatial itemsets. From these itemsets, the Apriori algorithm or other algorithms can be used to obtain the frequent itemsets, from which the spatial association rules are derived. The algorithm was validated by mining spatial association rules from a spatial database, and the test results show that it is efficient and can mine interesting spatial rules.
Association rule mining methods, an important set of data mining tools, can be used to mine spatial association rules from spatial data. However, their application is limited because the mining results contain a large number of redundant rules. In this paper, a new method named Geo-Filtered Association Rules Mining (GFARM) is proposed to effectively eliminate redundant rules. GFARM is applied in a case study that discovers association rules between building land distribution and potential driving factors in Wuhan, China from 1995 to 2015. Ten sets of regular sampling grids of different sizes are used to detect the influence of scale on GFARM. Results show that the proposed method can filter out 50%–70% of redundant rules. GFARM also succeeds in discovering spatial association patterns between building land distribution and driving factors.
In this letter, a support function for updating a Frequent Pattern (FP) tree is introduced, and an Incremental FP (IFP) algorithm for mining association rules is proposed on that basis. The IFP algorithm handles not only the addition of new data to the database but also the removal of old data from it. Furthermore, it simplifies five cases to three. The proposed algorithm avoids generating large numbers of candidate items and is highly efficient.
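The idea of keeping a count-annotated prefix tree up to date as transactions arrive and leave can be sketched as follows. This is not the IFP algorithm itself, whose support function and case analysis are not given in the abstract. It is a simplified, FP-tree-like structure without a header table, and it assumes a fixed lexicographic item order (a real FP-tree orders items by frequency, which may change under updates). All class and method names are hypothetical.

```python
class Node:
    def __init__(self):
        self.count = 0       # number of stored transactions passing through this node
        self.children = {}   # item -> Node


class PrefixTree:
    """FP-tree-like prefix tree supporting insertion and deletion of transactions."""

    def __init__(self):
        self.root = Node()

    def add(self, transaction):
        node = self.root
        for item in sorted(set(transaction)):
            node = node.children.setdefault(item, Node())
            node.count += 1

    def remove(self, transaction):
        # Assumption: the caller only removes transactions that were added earlier.
        node = self.root
        for item in sorted(set(transaction)):
            child = node.children[item]
            child.count -= 1
            if child.count == 0:
                del node.children[item]  # prune the branch no transaction uses any more
                return
            node = child

    def item_count(self, item):
        """Number of stored transactions that contain `item`."""
        total, stack = 0, [self.root]
        while stack:
            node = stack.pop()
            for name, child in node.children.items():
                if name == item:
                    total += child.count
                stack.append(child)
        return total


tree = PrefixTree()
for t in (["a", "b"], ["a", "c"], ["b", "c"]):
    tree.add(t)
tree.remove(["a", "c"])          # old data leaves the database
print(tree.item_count("a"), tree.item_count("c"))  # 1 1
```

Because both operations touch only one root-to-leaf path, the tree never has to be rebuilt from the full database, which is the property an incremental method relies on.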
Data mining techniques offer great opportunities for developing ethics lines, whose main aim is to ensure improvement in and compliance with the values, conduct and commitments that make up the code of ethics. The aim of this study is to suggest a process for exploiting the data generated and collected from an ethics line by extracting association rules with the Apriori algorithm. This makes it possible to identify anomalies and behaviour patterns that require action to review, correct, promote or expand them, as appropriate.
To make effective use of the large amount of graduate data that colleges and universities accumulate through teaching management, the paper studies data mining on a database of higher vocational graduates. The original data are processed with a variety of preprocessing methods, and a mining algorithm based on the commonly used Apriori association rule algorithm is put forward. An association rule mining system is then designed and implemented according to actual needs, and the mining results have benefited teaching management decisions and employment guidance for graduates.
Objective: To explore, based on data mining, the medication rules of Chinese medicine in the treatment of restless legs syndrome (RLS). Methods: CNKI, WANFANG, and VIP were taken as data sources, with "restless legs syndrome, RLS" as the key words and "Chinese medicine, Chinese materia medica, traditional Chinese medicine (TCM), traditional Chinese and Western medicine" as sub key words. Data were extracted from journals and literature on the treatment of RLS by TCM from the establishment of each database to 2020, and data mining techniques (frequency analysis, cluster analysis, association rules) were used to analyze the core drugs and the drug pair (group) rules. Results: A total of 87 prescriptions met the requirements of this study, involving 142 Chinese herbal medicines. The five most frequently used Chinese herbal medicines were Radix Paeoniae Alba, Radix Glycyrrhizae, Radix Angelicae Sinensis, Fructus Chaenomelis and Radix Astragali seu Hedysari. The four Qi (气) of the medicines were mainly warm and neutral, and the five flavors were mainly sweet, bitter, and pungent. The medicines mainly entered the liver, spleen and heart meridians. The medication categories were mainly deficiency-tonifying herbs, blood-activating and stasis-removing herbs, and wind- and dampness-eliminating herbs. The association rule analysis yielded 24 Chinese medicine combinations with high support, and the hierarchical cluster analysis yielded a total of 5 clusters. Conclusion: TCM treatment of RLS is based on deficiency-tonifying herbs, especially to replenish Qi and blood throughout the course of the disease, supplemented by herbs for promoting blood circulation and removing blood stasis and herbs for eliminating wind and dampness, and combined with herbs for relieving superficies and herbs for calming the liver to stop the wind.
The Apriori algorithm is a classical method of association rule mining. Based on an analysis of this theory, the paper presents an improved Apriori algorithm. The proposed algorithm combines a hash table technique with reduction of candidate itemsets to improve the efficiency of resource usage as well as the individualized service of the data library.
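One well-known way to combine hashing with candidate reduction is to hash every item pair into a small bucket table during the first pass and to count, in the second pass, only pairs whose bucket reached the minimum count. The sketch below illustrates that general idea for 2-itemsets; whether the paper's improved algorithm works this way is an assumption, and the function names, the use of `zlib.crc32` as a deterministic hash, and the bucket count are all choices made for illustration.

```python
from collections import Counter
from itertools import combinations
import zlib


def bucket_of(pair, n_buckets):
    """Deterministic bucket index for a sorted item pair."""
    return zlib.crc32("|".join(pair).encode()) % n_buckets


def frequent_pairs(transactions, min_count, n_buckets=8):
    item_counts = Counter()
    bucket_counts = [0] * n_buckets

    # Pass 1: count single items and hash every pair into a bucket.
    for t in transactions:
        items = sorted(set(t))
        item_counts.update(items)
        for pair in combinations(items, 2):
            bucket_counts[bucket_of(pair, n_buckets)] += 1

    frequent_items = {i for i, c in item_counts.items() if c >= min_count}

    # Pass 2: count only pairs of frequent items whose bucket is frequent.
    pair_counts = Counter()
    for t in transactions:
        items = sorted(i for i in set(t) if i in frequent_items)
        for pair in combinations(items, 2):
            if bucket_counts[bucket_of(pair, n_buckets)] >= min_count:
                pair_counts[pair] += 1

    return {p: c for p, c in pair_counts.items() if c >= min_count}


baskets = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "d"]]
print(frequent_pairs(baskets, min_count=2))
```

A bucket count is an upper bound on the count of every pair hashed into it, so the filter can discard candidates early but never discards a truly frequent pair; collisions only make the filter less selective.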
Objective: To explore the medication rules of Traditional Chinese Medicine (TCM) in the treatment of sleep disorder after stroke by using data mining technology. Methods: Electronic databases were searched by computer for clinical literature on the treatment of sleep disorders after stroke by TCM from January 2000 to January 2021. Excel was used to establish the database, and the prescription information was described and analyzed statistically. The Apriori algorithm in IBM SPSS Modeler 18.0 was used for TCM association analysis, and IBM SPSS 22.0 was used for systematic cluster analysis of high-frequency TCM. Results: A total of 67 articles were included, covering 131 traditional Chinese medicines. The most frequently used medicines include Ziziphi Spinosae Semen (Suanzaoren), Angelicae Sinensis Radix (Danggui), Ligusticum (Chuanxiong), liquorice (Gancao), Poria cocos (Fuling), and so on. In terms of effect, deficiency-tonifying, sedative, and blood-activating and stasis-removing medicines are commonly used. The medicinal properties are mainly cold, mild and warm, and the main flavors are sweet and bitter. The medicines mostly belong to the liver, heart and spleen meridians. Thirty-three association rules for medicine pairs and medicine groups were obtained from the association analysis; the core combinations included "Ziziphi Spinosae Semen (Suanzaoren)-Tuber fleeceflower stem (Yejiaoteng)", "Ziziphi Spinosae Semen (Suanzaoren)-Polygala (Yuanzhi)", "Ziziphi Spinosae Semen (Suanzaoren)-Cortex albiziae (Hehuanpi)" and "Angelicae Sinensis Radix (Danggui)-Radix bupleuri (Chaihu)-Radix Paeoniae Alba (Baishao)". Seven medicine aggregation groups were obtained by cluster analysis. Conclusion: In the treatment of sleep disorder after stroke by TCM, the main method is to calm the heart and mind. According to the syndrome type, the treatment methods of tonifying the heart and spleen, nourishing the liver and kidney, soothing and softening the liver, clearing heat and resolving phlegm, and nourishing the blood and promoting blood circulation are selected, which provides a reference for clinical treatment.
Using association rules for data mining, the connections between testing indices and strong and weak association rules were determined, and new derivative rules were obtained by further reasoning. Association rules were used to analyze correlation and check consistency between indices. This study shows that judgments obtained from weak association rules or non-association rules are more accurate and more credible than those obtained from strong association rules. When the testing grades of two indices in a weak association rule are inconsistent, the testing grades are more likely to be erroneous, and the mistakes are often caused by human factors. Clustering was used to analyze the reliability of a diagnosis or to perform health diagnosis directly. The analysis showed that the clustering results depend on the indices selected: the more significant the selected indices, the more pronounced the characteristics of the clustering results and the more credible the analysis or diagnosis. The indices and the diagnosis analysis function produced by this study provide a necessary theoretical foundation and new ideas for the development of health diagnosis technology for hydraulic metal structures.
Objective: This study aimed to examine and propagate the medication experience and group formulas of traditional Chinese medicine (TCM) Master XIONG Jibo in diagnosing and treating arthralgia syndrome (AS) through data mining. Methods: Data on outpatient cases of Professor XIONG Jibo were collected from January 1, 2014 to December 31, 2018, along with cases recorded in A Real Famous Traditional Chinese Medicine Doctor: XIONG Jibo's Clinical Medical Record 1, which was published in December 2019. The five variables collected from the patients' data were TCM diagnostic information, TCM and western medicine diagnoses, syndrome, treatment, and prescription. A database was established for the collected data with Excel. In a Python environment, a customized, modified natural language processing (NLP) model for the diagnosis and treatment of AS by Professor XIONG Jibo was established to preprocess the data and to produce word clouds. Frequency analysis, association rule analysis, cluster analysis, and visual analysis of AS cases were performed with the Traditional Chinese Medicine Inheritance Computing Platform (V3.0) and RStudio (V4.0.3). Results: A total of 610 medical records of Professor XIONG Jibo were collected from the case database. After screening, 103 medical records were included, comprising 187 prescription uses (45 kinds) and 1506 herb uses (125 kinds). The main related meridians were the liver, spleen, and kidney meridians. The properties of the herbs used most were mainly warm, neutral, and cold, while the flavors were mainly bitter, pungent, and sweet. The main patterns of AS included damp heat, phlegm stasis, and neck arthralgia. The most commonly used herbs for AS were Chuanniuxi (Cyathulae Radix), Huangbo (Phellodendri Chinensis Cortex), Cangzhu (Atractylodis Rhizoma), Qinjiao (Gentianae Macrophyllae Radix), Gancao (Glycyrrhizae Radix et Rhizoma), Huangqi (Astragali Radix), and Chuanxiong (Chuanxiong Rhizoma). The most common effect of the herbs was "promoting blood circulation and removing blood stasis", followed by "supplementing deficiency (Qi supplementing, blood supplementing, and Yang supplementing)" and "dispelling wind and dampness". The data were analyzed with support ≥ 15% and confidence = 100%; after de-duplication, five second-order, 39 third-order, 39 fourth-order, and two fifth-order association rules were identified. The top-ranking association rules of each order were "Cangzhu (Atractylodis Rhizoma) → Huangbo (Phellodendri Chinensis Cortex)", "Cangzhu (Atractylodis Rhizoma) + Chuanniuxi (Cyathulae Radix) → Huangbo (Phellodendri Chinensis Cortex)", "Chuanniuxi (Cyathulae Radix) + Danggui (Angelicae Sinensis Radix) + Gancao (Glycyrrhizae Radix et Rhizoma) → Qinjiao (Gentianae Macrophyllae Radix)" and "Chuanniuxi (Cyathulae Radix) + Danggui (Angelicae Sinensis Radix) + Gancao (Glycyrrhizae Radix et Rhizoma) + Huangbo (Phellodendri Chinensis Cortex) → Qinjiao (Gentianae Macrophyllae Radix)", respectively. Five clusters were obtained by cluster analysis of the top 30 herbs. The herbs mainly served to dry dampness, supplement Qi, and promote blood circulation. The main prescriptions for AS were Ermiao San (二妙散), Gegen Jianghuang San (葛根姜黄散), and Huangqi Chongteng Yin (黄芪虫藤饮). The herbs of the core prescription included Cangzhu (Atractylodis Rhizoma), Chuanniuxi (Cyathulae Radix), Gancao (Glycyrrhizae Radix et Rhizoma), Huangbo (Phellodendri Chinensis Cortex), Mugua (Chaenomelis Fructus), Qinjiao (Gentianae Macrophyllae Radix), Danggui (Angelicae Sinensis Radix), and Yiyiren (Coicis Semen). Conclusion: Clearing heat and dampness, relieving collaterals and pain, and invigorating Qi and blood are the therapies most commonly used by Professor XIONG Jibo for the treatment of AS. Additionally, a customized NLP model could improve the efficiency of data mining in TCM.
In the data mining field, intrusion detection is regarded as a classification problem. However, instead of mining classification rules directly, class association rules are mined from audit logs and then used to construct a classifier. Some attributes in audit logs are important for detecting intrusions, but their values are highly skewed. A relative support concept is proposed to deal with this situation. To mine class association rules effectively, an algorithm based on the FP-tree is used. Experimental results show that this method performs better.
As data mining is applied more and more widely in computer systems, quality assurance testing of data mining software is receiving increasing attention. However, because of the "oracle" problem, traditional test methods are not well suited to application programs in the data mining field. In this paper, a software testing method for data mining based on metamorphic testing is proposed; an association rule algorithm is taken as the specific case, and metamorphic relations are constructed for the algorithm. Experiments show that the method achieves the testing goal and can feasibly be applied to other domains.
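Metamorphic testing sidesteps the oracle problem by checking relations between the outputs of related runs rather than checking one output against a known answer. The sketch below checks two plausible relations for a frequent-itemset miner: permuting the transaction order, or duplicating the whole dataset while keeping the relative minimum support fixed, should leave the result unchanged. These relations, the function names, and the brute-force miner that stands in for the program under test are assumptions; the abstract does not state which relations the paper constructs.

```python
from itertools import combinations
import random


def frequent_itemsets(transactions, min_support):
    """Brute-force stand-in for the program under test."""
    n = len(transactions)
    items = sorted({i for t in transactions for i in t})
    result = {}
    for k in range(1, len(items) + 1):
        for cand in combinations(items, k):
            s = sum(1 for t in transactions if set(cand) <= set(t)) / n
            if s >= min_support:
                result[cand] = s
    return result


def check_metamorphic_relations(transactions, min_support, seed=0):
    """Return (permutation_holds, duplication_holds)."""
    base = frequent_itemsets(transactions, min_support)

    # MR1: transaction order must not affect the mined itemsets.
    shuffled = transactions[:]
    random.Random(seed).shuffle(shuffled)
    mr_permutation = frequent_itemsets(shuffled, min_support) == base

    # MR2: doubling every transaction keeps all relative supports the same.
    mr_duplication = frequent_itemsets(transactions * 2, min_support) == base

    return mr_permutation, mr_duplication


data = [["a", "b"], ["a", "c"], ["a", "b", "c"], ["b"]]
print(check_metamorphic_relations(data, min_support=0.5))  # (True, True)
```

To test a real miner, `frequent_itemsets` would be replaced by a call to that implementation; a violated relation signals a defect without anyone having to know the correct output in advance.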
Despite advances in technological complexity and effort, software repository maintenance requires reusing data to reduce effort and complexity. However, growing ambiguity, irrelevance, and bugs encountered while extracting similar data during software development generate a large amount of additional data from the data that reside in repositories. Thus, there is a need for a repository mining technique that predicts relevant and bug-free data. This paper proposes a fault prediction approach that uses a data-mining technique to find good predictors for high-quality software. To predict errors in the mined data, the Apriori algorithm was used to discover association rules with confidence fixed at more than 40% and support at no less than 30%. A pruning strategy based on evaluation measures was adopted. Next, rules were extracted from three projects in different domains, and the extracted rules were combined to obtain the most popular rules according to the evaluation measure values. To evaluate the proposed approach, we conducted an experimental study comparing the proposed rules with existing ones on four different industrial projects. The evaluation showed that the results of our proposal are promising. Practitioners and developers can use these rules for defect prediction during early software development.
Funding (FP-tree-based negative association rule mining): Supported by the National Natural Science Foundation of China (70371015) and the Science Foundation of Jiangsu University (04KJD001).
Funding (vertical-, horizontal- and two-direction spatial association rules): The work is supported by the Natural Science Foundation of Chongqing (No. CSTC 2005BB2065).
Funding (GFARM): Under the auspices of the Special Fund of the Ministry of Land and Resources of China in Public Interest (No. 201511001).
Funding (Incremental FP algorithm): Supported in part by the National Natural Science Foundation of China (No. 60073012) and the Natural Science Foundation of Jiangsu (BK2001004).
文摘In this letter, on the basis of Frequent Pattern(FP) tree, the support function to update FP-tree is introduced, then an Incremental FP (IFP) algorithm for mining association rules is proposed. IFP algorithm considers not only adding new data into the database but also reducing old data from the database. Furthermore, it can predigest five cases to three cases.The algorithm proposed in this letter can avoid generating lots of candidate items, and it is high efficient.
文摘Data mining techniques offer great opportunities for developing ethics lines whose main aim is to ensure improvements and compliance with the values, conduct and commitments making up the code of ethics. The aim of this study is to suggest a process for exploiting the data generated by the data generated and collected from an ethics line by extracting rules of association and applying the Apriori algorithm. This makes it possible to identify anomalies and behaviour patterns requiring action to review, correct, promote or expand them, as appropriate.
Abstract: To make effective use of the large amount of graduate data that colleges and universities accumulate through teaching management, this paper studies data mining on a higher vocational graduates database. A variety of data preprocessing methods are applied to the original data, and a mining algorithm based on the commonly used Apriori association rule algorithm is put forward. An association rule mining system is then designed and implemented according to actual needs. The mining results benefit college teaching management decisions and graduate employment guidance.
Abstract: Objective: To explore, through data mining, the medication rules of Chinese medicine in the treatment of restless legs syndrome (RLS). Methods: CNKI, WANFANG, and VIP were used as data sources, with "restless legs syndrome, RLS" as the key words and "Chinese medicine, Chinese materia medica, traditional Chinese medicine (TCM), traditional Chinese and Western medicine" as sub key words. Data were extracted from journals and literature on the treatment of RLS by TCM, from the establishment of each database to 2020, and data mining techniques (frequency analysis, cluster analysis, association rules) were used to analyze the core drugs and the drug pair (group) rules. Results: A total of 87 prescriptions met the requirements of this study, involving 142 Chinese herbal medicines. The 5 most frequently used Chinese herbal medicines were Radix Paeoniae Alba, Radix Glycyrrhizae, Radix Angelicae Sinensis, Fructus Chaenomelis and Radix Astragali seu Hedysari. The four Qi (气) of the medicines were mainly warm and neutral, and the five flavors were mainly sweet, bitter, and pungent. The main meridians entered were the liver, spleen and heart meridians. The medication categories were mainly deficiency-tonifying herbs, herbs for activating blood and removing blood stasis, and herbs for eliminating wind and dampness. The association rule analysis yielded 24 Chinese medicine combinations with high support, and the hierarchical cluster analysis yielded a total of 5 clusters. Conclusion: TCM treatment of RLS is based on deficiency-tonifying herbs, especially to replenish Qi and blood throughout the course of the disease, supplemented by herbs for promoting blood circulation and removing blood stasis and herbs for eliminating wind and dampness, and combined with herbs for relieving superficies and herbs for calming the liver to stop the wind.
Abstract: The Apriori algorithm is a classical method of association rule mining. Based on an analysis of this method, the paper presents an improved Apriori algorithm. The proposed algorithm combines a hash table technique with reduction of candidate itemsets to improve the efficiency of resource usage as well as the individualized service of the data library.
Funding: Beijing Science and Technology Program (No. Z191100006619065); National Key R&D Program (No. 2017YFC1700101).
Abstract: Objective: To explore the medication rules of Traditional Chinese Medicine (TCM) in the treatment of sleep disorder after stroke by using data mining technology. Methods: Electronic databases were searched by computer for clinical literature on the treatment of sleep disorders after stroke by TCM published from January 2000 to January 2021. Excel was used to establish the database, and the prescription information was described and analyzed statistically. The Apriori algorithm in IBM SPSS Modeler 18.0 was used for TCM association analysis, and IBM SPSS 22.0 was used for systematic cluster analysis of high-frequency medicines. Results: A total of 67 articles were included, covering 131 traditional Chinese medicines. The medicines with a higher frequency of use include Ziziphi Spinosae Semen (Suanzaoren), Angelicae Sinensis Radix (Danggui), Ligusticum (Chuanxiong), liquorice (Gancao), Poria cocos (Fuling), and so on. In terms of effect, deficiency-tonifying medicines, sedative medicines and blood-activating and stasis-removing medicines are commonly used. The medicinal properties are mainly cold, mild and warm. The main flavors are sweet and bitter. The medicines mostly belong to the liver, heart and spleen meridians. Thirty-three association rules for medicine pairs and medicine groups were obtained from the association analysis, and the core combinations were "Ziziphi Spinosae Semen (Suanzaoren)-Tuber fleeceflower stem (Yejiaoteng)", "Ziziphi Spinosae Semen (Suanzaoren)-Polygala (Yuanzhi)", "Ziziphi Spinosae Semen (Suanzaoren)-Cortex albiziae (Hehuanpi)" and "Angelicae Sinensis Radix (Danggui)-Radix bupleuri (Chaihu)-Radix Paeoniae Alba (Baishao)", and so on. Seven medicine aggregation groups were obtained by cluster analysis. Conclusion: In the treatment of sleep disorder after stroke by TCM, the main method is to calm the heart and mind. Meanwhile, according to the syndrome type, the treatment methods of tonifying the heart and spleen, nourishing the liver and kidney, soothing and softening the liver, clearing heat and resolving phlegm, and nourishing the blood and promoting blood circulation are selected, which provides a reference for clinical treatment.
Funding: Supported by the Key Program of the National Natural Science Foundation of China (Grant No. 50539010) and the Special Fund for Public Welfare Industry of the Ministry of Water Resources of China (Grant No. 200801019).
Abstract: Using association rules for data mining, the connections between testing indices and strong and weak association rules were determined, and new derivative rules were obtained by further reasoning. Association rules were used to analyze correlation and check consistency between indices. This study shows that the judgment obtained by weak association rules or non-association rules is more accurate and more credible than that obtained by strong association rules. When the testing grades of two indices in the weak association rules are inconsistent, the testing grades of the indices are more likely to be erroneous, and the mistakes are often caused by human factors. Clustering data mining technology was used to analyze the reliability of a diagnosis, or to perform health diagnosis directly. Analysis showed that the clustering results are related to the indices selected: the more significant the selected indices, the more significant the characteristics of the clustering results and the more credible the analysis or diagnosis. The indices and diagnosis analysis function produced by this study provide a necessary theoretical foundation and new ideas for the development of hydraulic metal structure health diagnosis technology.
Funding: Project of State Administration of Traditional Chinese Medicine (GZY-YZS-2019-45); The Horizontal Project of Hunan Medical College (HYH-2021Y-KJ-6-33); Scientific Research Project of Hunan Provincial Department of Education in 2021 (21C0223); Natural Science Foundation of Hunan Province in 2022 (1524).
Abstract: Objective: This study aimed to examine and propagate the medication experience and group formulas of traditional Chinese medicine (TCM) Master XIONG Jibo in diagnosing and treating arthralgia syndrome (AS) through data mining. Methods: Data of outpatient cases of Professor XIONG Jibo were collected from January 1, 2014 to December 31, 2018, along with cases recorded in A Real Famous Traditional Chinese Medicine Doctor: XIONG Jibo's Clinical Medical Record 1, which was published in December 2019. The five variables collected from the patients' data were TCM diagnostic information, TCM and western medicine diagnoses, syndrome, treatment, and prescription. A database was established for the collected data with Excel. Using the Python environment, a customized modified natural language processing (NLP) model for the diagnosis and treatment of AS by Professor XIONG Jibo was established to preprocess the data and to analyze the word cloud. Frequency analysis, association rule analysis, cluster analysis, and visual analysis of AS cases were performed based on the Traditional Chinese Medicine Inheritance Computing Platform (V3.0) and RStudio (V4.0.3). Results: A total of 610 medical records of Professor XIONG Jibo were collected from the case database. A total of 103 medical records were included after applying the data screening criteria, which comprised 187 uses of prescriptions (45 kinds) and 1506 uses of Chinese herbs (125 kinds). The main related meridians were the liver, spleen, and kidney meridians. The properties of the Chinese herbs used most were mainly warm, flat, and cold, while the flavors of the herbs were mainly bitter, pungent, and sweet. The main patterns of AS included damp heat, phlegm stasis, and neck arthralgia. The most commonly used herbs for AS were Chuanniuxi (Cyathulae Radix), Huangbo (Phellodendri Chinensis Cortex), Cangzhu (Atractylodis Rhizoma), Qinjiao (Gentianae Macrophyllae Radix), Gancao (Glycyrrhizae Radix et Rhizoma), Huangqi (Astragali Radix), and Chuanxiong (Chuanxiong Rhizoma). The most common effect of the herbs was "promoting blood circulation and removing blood stasis", followed by "supplementing deficiency (Qi supplementing, blood supplementing, and Yang supplementing)" and "dispelling wind and dampness". The data were analyzed with support ≥ 15% and confidence = 100%, and after de-duplication, five second-order association rules, 39 third-order association rules, 39 fourth-order association rules, and two fifth-order association rules were identified. The top-ranking association rules of each order were "Cangzhu (Atractylodis Rhizoma) → Huangbo (Phellodendri Chinensis Cortex)", "Cangzhu (Atractylodis Rhizoma) + Chuanniuxi (Cyathulae Radix) → Huangbo (Phellodendri Chinensis Cortex)", "Chuanniuxi (Cyathulae Radix) + Danggui (Angelicae Sinensis Radix) + Gancao (Glycyrrhizae Radix et Rhizoma) → Qinjiao (Gentianae Macrophyllae Radix)" and "Chuanniuxi (Cyathulae Radix) + Danggui (Angelicae Sinensis Radix) + Gancao (Glycyrrhizae Radix et Rhizoma) + Huangbo (Phellodendri Chinensis Cortex) → Qinjiao (Gentianae Macrophyllae Radix)", respectively. Five clusters were obtained using cluster analysis of the top 30 herbs. The herbs were mainly for drying dampness, supplementing Qi, and promoting blood circulation. The main prescriptions for AS were Ermiao San (二妙散), Gegen Jianghuang San (葛根姜黄散), and Huangqi Chongteng Yin (黄芪虫藤饮). The herbs of the core prescription included Cangzhu (Atractylodis Rhizoma), Chuanniuxi (Cyathulae Radix), Gancao (Glycyrrhizae Radix et Rhizoma), Huangbo (Phellodendri Chinensis Cortex), Mugua (Chaenomelis Fructus), Qinjiao (Gentianae Macrophyllae Radix), Danggui (Angelicae Sinensis Radix), and Yiyiren (Coicis Semen). Conclusion: Clearing heat and dampness, relieving collaterals and pain, and invigorating Qi and blood are the most commonly used therapies for the treatment of AS by Professor XIONG Jibo. Additionally, a customized NLP model could improve the efficiency of data mining in TCM.
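For readers unfamiliar with the thresholds quoted in the abstract above, the standard textbook definitions of support and confidence are sketched below. The symbols D (the set of records), X and Y (herb sets) are notation assumed here for illustration; the abstract does not state which records form the denominator.

```latex
% Standard definitions (notation assumed, not taken from the abstract):
% D is the set of records, X and Y are disjoint item (herb) sets.
\[
\operatorname{supp}(X \rightarrow Y) = \frac{\lvert \{\, t \in D : X \cup Y \subseteq t \,\} \rvert}{\lvert D \rvert},
\qquad
\operatorname{conf}(X \rightarrow Y) = \frac{\operatorname{supp}(X \cup Y)}{\operatorname{supp}(X)} .
\]
% A rule with confidence = 100% means every record containing X also contains Y.
```

Under these definitions, a rule such as "Cangzhu → Huangbo" with confidence = 100% would mean that every record containing Cangzhu also contains Huangbo.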
Funding: This work is supported by the Chinese NSF (Project No. 60073034).
Abstract: Intrusion detection is regarded as a classification problem in the data mining field. However, instead of mining the classification rules directly, class association rules are mined from audit logs and then used to construct a classifier. Some attributes in audit logs are important for detecting intrusion, but their values have a skewed distribution. A relative support concept is proposed to deal with this situation. To mine class association rules effectively, an algorithm based on the FP-tree is used. Experimental results show that this method has better performance.
Abstract: As data mining is more and more widely applied in computer systems, quality assurance testing of its software is receiving increasing attention. However, because of the "oracle" problem, traditional testing methods are not easily applied to programs in the data mining field. In this paper, a software testing method for the data mining field is proposed based on metamorphic testing; an association rules algorithm is taken as the specific case, and metamorphic relations are constructed on the algorithm. Experiments show that the method can achieve the testing target and is feasible to apply to other domains.
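To illustrate the idea of metamorphic testing in the abstract above, here is a minimal Python sketch. The two relations used (shuffling the transaction order, and duplicating the whole dataset) are assumptions chosen for illustration, not necessarily the relations constructed in the paper, and the brute-force `mine` function is a hypothetical stand-in for the algorithm under test.

```python
import random
from itertools import combinations


def mine(transactions, min_support):
    """Hypothetical system under test: brute-force frequent itemsets.

    Returns {itemset: relative support} for itemsets with support >= min_support.
    """
    tx = [frozenset(t) for t in transactions]
    items = sorted(set().union(*tx))
    n = len(tx)
    result = {}
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            candidate = frozenset(combo)
            support = sum(1 for t in tx if candidate <= t) / n
            if support >= min_support:
                result[candidate] = support
    return result


def holds_under_permutation(transactions, min_support, seed=0):
    """Assumed relation 1: transaction order must not change the output."""
    shuffled = list(transactions)
    random.Random(seed).shuffle(shuffled)
    return mine(transactions, min_support) == mine(shuffled, min_support)


def holds_under_duplication(transactions, min_support):
    """Assumed relation 2: repeating every transaction keeps relative supports."""
    doubled = list(transactions) * 2
    return mine(transactions, min_support) == mine(doubled, min_support)


# Toy data invented for this example.
data = [{"a", "b"}, {"a", "c"}, {"a", "b", "c"}, {"b"}]
perm_ok = holds_under_permutation(data, 0.5)
dup_ok = holds_under_duplication(data, 0.5)
```

Neither check needs to know the correct set of frequent itemsets. Each only compares two runs of the same program, which is how metamorphic testing sidesteps the oracle problem.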
Funding: This research was financially supported in part by the Ministry of Trade, Industry and Energy (MOTIE) and Korea Institute for Advancement of Technology (KIAT) through the International Cooperative R&D program (Project No. P0016038), and in part by the MSIT (Ministry of Science and ICT), Korea, under the ITRC (Information Technology Research Center) support program (IITP-2021-2016-0-00312) supervised by the IITP (Institute for Information & communications Technology Planning & Evaluation).
Abstract: Despite advances in technological complexity and efforts, software repository maintenance requires reusing the data to reduce the effort and complexity. However, increasing ambiguity, irrelevance, and bugs while extracting similar data during software development generate a large amount of data from those data that reside in repositories. Thus, there is a need for a repository mining technique for relevant and bug-free data prediction. This paper proposes a fault prediction approach using a data-mining technique to find good predictors for high-quality software. To predict errors in mining data, the Apriori algorithm was used to discover association rules by fixing confidence at more than 40% and support at at least 30%. The pruning strategy was adopted based on evaluation measures. Next, the rules were extracted from three projects of different domains; the extracted rules were then combined to obtain the most popular rules based on the evaluation measure values. To evaluate the proposed approach, we conducted an experimental study to compare the proposed rules with existing ones using four different industrial projects. The evaluation showed that the results of our proposal are promising. Practitioners and developers can utilize these rules for defect prediction during early software development.
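The thresholds quoted in the abstract above (support of at least 30%, confidence above 40%) can be illustrated with a minimal Apriori-style sketch in Python. This is not the authors' implementation: the function names, the toy transactions and the attribute labels such as `high_complexity` are all hypothetical, and the pruning by evaluation measures described in the abstract is not reproduced.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Level-wise (Apriori-style) search; returns {itemset: relative support}."""
    tx = [frozenset(t) for t in transactions]
    n = len(tx)
    candidates = {frozenset([item]) for t in tx for item in t}
    frequent = {}
    k = 1
    while candidates:
        # Count each candidate and keep those meeting the support threshold.
        level = {}
        for c in candidates:
            support = sum(1 for t in tx if c <= t) / n
            if support >= min_support:
                level[c] = support
        frequent.update(level)
        # Join frequent k-itemsets into (k+1)-candidates, pruning any candidate
        # that has an infrequent k-subset (the Apriori property).
        k += 1
        candidates = set()
        for a, b in combinations(list(level), 2):
            union = a | b
            if len(union) == k and all(
                frozenset(s) in level for s in combinations(union, k - 1)
            ):
                candidates.add(union)
    return frequent


def association_rules(frequent, min_confidence):
    """Return (lhs, rhs, support, confidence) with confidence > min_confidence."""
    rules = []
    for itemset, support in frequent.items():
        for r in range(1, len(itemset)):
            for lhs in combinations(sorted(itemset), r):
                lhs = frozenset(lhs)
                confidence = support / frequent[lhs]
                if confidence > min_confidence:
                    rules.append((lhs, itemset - lhs, support, confidence))
    return rules


# Hypothetical module records: each set lists attributes observed together.
transactions = [
    {"high_complexity", "many_changes", "defect"},
    {"high_complexity", "defect"},
    {"many_changes"},
    {"high_complexity", "many_changes", "defect"},
    {"low_coverage"},
]
freq = frequent_itemsets(transactions, min_support=0.30)
found = association_rules(freq, min_confidence=0.40)
```

In this toy data `low_coverage` appears in only one of five records, so it falls below the 30% support threshold and cannot appear in any rule, while rules such as `high_complexity` → `defect` survive both thresholds.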