Abstract: Under China's modern education system, the annual scholarship evaluation matters greatly to many college students. This paper adopts the C4.5 decision tree classification algorithm, an improvement on the ID3 algorithm, and constructs a data set for a scholarship evaluation system by analyzing the related attributes in scholarship evaluation information. Through analysis and research of moral education, intellectual education, and culture and PE, it also identifies some factors that play a significant role in the development of college students.
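As a rough illustration of how C4.5 refines ID3, the sketch below computes the gain ratio of a categorical attribute: ID3's information gain divided by the split information of the attribute. It is a minimal sketch, not the paper's implementation; the function names and the toy data (`grade_band`, `awarded`) are hypothetical and not taken from the abstract.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())

def gain_ratio(values, labels):
    """C4.5-style gain ratio of a categorical attribute.

    values[i] is the attribute value of sample i, labels[i] is its class.
    """
    total = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    # Information gain: the criterion ID3 uses on its own.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder
    # Split information penalizes attributes with many distinct values.
    split_info = entropy(values)
    return info_gain / split_info if split_info > 0 else 0.0

# Hypothetical toy data: a grade band and whether a scholarship was awarded.
grade_band = ["high", "high", "low", "low"]
awarded = ["yes", "yes", "no", "no"]
print(gain_ratio(grade_band, awarded))
```

An attribute with a distinct value for every sample (a student ID, say) has maximal information gain but also large split information, so its gain ratio is lower than that of a compact attribute that separates the classes equally well.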
Funding: Supported by the Science and Technology Plan Project of Guangdong Province (2009B010900026, 2009CD058, 2009CD078, 2009CD079, 2009CD080), the Special Funds for Support Program of Development of Modern Information Service Industry of Guangdong Province (06120840B0370124), and the Fund Project of South China Agricultural University (2007K017).
Abstract: Association rules and C4.5 rules can overcome the shortcomings of traditional land evaluation methods and improve the intelligibility and efficiency of land evaluation knowledge. To compare these two kinds of classification rules in application, two fuzzy classifiers were established by combining each with a fuzzy decision algorithm, based on the Second General Soil Survey of Guangdong Province. The experimental results demonstrated that the fuzzy classifier based on association rules obtains a higher accuracy rate, but with a more complex calculation process and more computational overhead; the fuzzy classifier based on C4.5 rules obtains a slightly lower accuracy, but with faster computation and simpler calculation.
Funding: Supported by the Science and Technology Plan of Mudanjiang City (G200920064) and the Teaching Reform Construction of Mudanjiang Normal University (10-xj11080).
Abstract: Based on a discussion of the basic concepts of data mining technology and the decision tree method, and using data samples of wind and hailstorm disasters in some counties of the Mudanjiang region, a forecasting model of agro-meteorological disaster grade was established by adopting the C4.5 decision tree classification algorithm. The model can forecast the degree of direct economic loss, providing a rational data mining model and effective analysis results.
Funding: Supported by the Science and Technology Plan Project of Guangdong Province (2009B010900026, 2009CD058, 2009CD078, 2009CD079, 2009CD080), the Special Funds for Support Program of Development of Modern Information Service Industry of Guangdong Province (06120840B0370124), and the Fund Project of South China Agricultural University (2007K017).
Abstract: [Objective] The aim was to overcome the difficulty of building a land evaluation model when the impact factors have continuous values in the traditional land evaluation process, and to improve the intelligibility of land evaluation knowledge. [Method] A land evaluation method combining classification rules extracted by the C4.5 algorithm with fuzzy decision was proposed in this study. [Result] Results on the Second General Soil Survey of Guangdong Province demonstrated that the method was convenient for extracting classification rules, and that using only 100 rules, a quantity correct rate of 86.67% and an area correct rate of 84.80% could be obtained for land evaluation. [Conclusions] Using the C4.5 algorithm to obtain the rules, combined with a fuzzy decision algorithm to build classifiers, gave satisfactory results and provided a practical algorithm for land evaluation.
Funding: Supported by a grant from the Universidad Nacional Autonoma de Mexico (SDI.PTID.05.6).
Abstract: AIM: To assess the usefulness of FibroTest to forecast scores by constructing decision trees in patients with chronic hepatitis C. METHODS: We used the C4.5 classification algorithm to construct decision trees with data from 261 patients with chronic hepatitis C without a liver biopsy. The FibroTest attributes of age, gender, bilirubin, apolipoprotein, haptoglobin, α2 macroglobulin, and γ-glutamyl transpeptidase were used as predictors, and the FibroTest score as the target. For testing, 10-fold cross-validation was used. RESULTS: The overall classification error was 14.9% (accuracy 85.1%). FibroTest cases with true scores of F0 and F4 were classified with very high accuracy (18/20 for F0, 9/9 for F0-1 and 92/96 for F4), and the largest confusion centered on F3. The algorithm produced a set of compound rules out of the ten classification trees, which was used to classify the 261 patients. The rules for the classification of patients in F0 and F4 were effective in more than 75% of the cases in which they were tested. CONCLUSION: The recognition of clinical subgroups should help to enhance our ability to assess differences in fibrosis scores in clinical studies and improve our understanding of fibrosis progression.
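The 10-fold cross-validation mentioned in the methods can be sketched as follows: the 261 samples are shuffled and split into ten near-equal folds, and each fold serves once as the test set while the other nine are used for training. This is a minimal sketch under stated assumptions, not the study's code; `k_fold_indices`, `cross_validate`, and the `train_and_score` callback are hypothetical names, and the classifier itself is left to the caller.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Split range(n_samples) into k shuffled, near-equal test folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def cross_validate(n_samples, train_and_score, k=10, seed=0):
    """Return the pooled accuracy over k folds.

    train_and_score(train_idx, test_idx) is a caller-supplied (hypothetical)
    callback that fits a model on train_idx and returns the number of
    correctly classified samples in test_idx.
    """
    folds = k_fold_indices(n_samples, k, seed)
    correct = 0
    for i, test_idx in enumerate(folds):
        # Every sample outside the current fold is used for training.
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        correct += train_and_score(train_idx, test_idx)
    return correct / n_samples

folds = k_fold_indices(261, k=10)
print([len(f) for f in folds])  # one fold of 27 samples, nine of 26
```

Because every sample is tested exactly once, the pooled accuracy and the overall classification error sum to 100%, which matches the way the abstract reports 14.9% error alongside 85.1% accuracy.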
Funding: Sponsored by the National Natural Science Foundation of China (60671008), the National Science and Technology Support Project (2006038070031), and the National "863" Program Project (2006AA02Z429).
Abstract: Taking advantage of a multi-source, multi-dimensional practical data set of nearly 14 000 items on type 2 diabetes, a series of data mining experiments was designed to find important type 2 diabetes risk factors and their relationships with blood glucose. The valuable pathological knowledge obtained includes the following: the decision tree is almost identical to the list of clinical diabetic risk factors; nine important risk factors of type 2 diabetes were found; and the relationship between the main risk factors and blood glucose, as well as the critical values of the risk factors, are also given in this paper. These results are useful for the treatment and macro-control of type 2 diabetes.
Abstract: Education is the basis of the survival and growth of any state, but because of resource scarcity, students, particularly at the university level, are forced into a difficult situation. Scholarships are the most significant financial aid mechanism developed to overcome such obstacles and help students continue their higher studies. This study addresses the convoluted situation of scholarship eligibility criteria, including parental income, responsibilities, and academic achievements. In an attempt to optimize the scholarship selection process, several machine learning algorithms were applied, including Support Vector Machines, Neural Networks, K-Nearest Neighbors, and the C4.5 algorithm. The C4.5 algorithm, owing to its efficiency in predicting scholarship beneficiaries based on extraneous factors, reached a prediction accuracy of 95.62% using extensive data from a well-esteemed government sector university in Pakistan. This figure is 4% and 15% better than the other methods tested, and it shows the potential of the technique to enhance the scholarship selection process. A Decision Support System (DSS) would not only save administrative cost but would also put a fair and transparent process in place. In a world where access to education is key, this research provides data-oriented support to ensure that deserving students receive the financial assistance they need to pursue higher studies.
Funding: Supported by the National Natural Science Foundation of China (No. 60970081) and the National Basic Research Program (973) of China (No. 2010CB327903).
Abstract: Classification can be regarded as dividing the data space into decision regions separated by decision boundaries. In this paper we analyze decision tree algorithms and the NBTree algorithm from this perspective. Thus, a decision tree can be regarded as a classifier tree, in which each classifier on a non-root node is trained in decision regions of the classifier on the parent node. Meanwhile, the NBTree algorithm, which generates a classifier tree with the C4.5 algorithm and the naive Bayes classifier as the root and leaf classifiers respectively, can also be regarded as training naive Bayes classifiers in decision regions of the C4.5 algorithm. We propose a second division (SD) algorithm and three soft second division (SD-soft) algorithms to train classifiers in decision regions of the naive Bayes classifier. These four novel algorithms all generate two-level classifier trees with the naive Bayes classifier as the root classifier. The SD and three SD-soft algorithms can make good use of both the information contained in instances near decision boundaries and the information that may be ignored by the naive Bayes classifier. Finally, we conduct experiments on 30 data sets from the UC Irvine (UCI) repository. Experimental results show that the SD algorithm can obtain better generalization abilities than the NBTree and the averaged one-dependence estimators (AODE) algorithms when using the C4.5 algorithm and support vector machine (SVM) as leaf classifiers. Further experiments indicate that our three SD-soft algorithms can achieve better generalization abilities than the SD algorithm when argument values are selected appropriately.
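The general idea of a two-level classifier tree, in which leaf classifiers are trained inside the decision regions of a root classifier, can be sketched as below. This is only an illustration of the structure and not the SD or SD-soft algorithms themselves: `ThresholdRoot` is a hypothetical stand-in for the naive Bayes root, `MajorityClassifier` is a stand-in for the C4.5 or SVM leaves, and all class and method names are assumptions.

```python
from collections import Counter, defaultdict

class ThresholdRoot:
    """Stand-in root classifier (NOT naive Bayes): splits on the sign of x[0]."""
    def fit(self, X, y):
        return self
    def predict_one(self, x):
        return "b" if x[0] >= 0 else "a"

class MajorityClassifier:
    """Stand-in leaf classifier: predicts the most common training label."""
    def fit(self, X, y):
        self.label = Counter(y).most_common(1)[0][0]
        return self
    def predict_one(self, x):
        return self.label

class TwoLevelClassifier:
    """Root classifier plus one leaf classifier per root decision region."""
    def __init__(self, root, make_leaf):
        self.root = root
        self.make_leaf = make_leaf
        self.leaves = {}
    def fit(self, X, y):
        self.root.fit(X, y)
        # Group training samples by the region the root assigns them to.
        regions = defaultdict(lambda: ([], []))
        for x, label in zip(X, y):
            region = self.root.predict_one(x)
            regions[region][0].append(x)
            regions[region][1].append(label)
        # Train a separate leaf classifier inside each decision region.
        self.leaves = {r: self.make_leaf().fit(xs, ys) for r, (xs, ys) in regions.items()}
        return self
    def predict_one(self, x):
        region = self.root.predict_one(x)
        leaf = self.leaves.get(region)
        # Fall back to the root's own prediction for an unseen region.
        return leaf.predict_one(x) if leaf is not None else region

# Hypothetical toy data with one feature per sample.
X = [(-2,), (-1,), (1,), (2,), (3,)]
y = ["a", "a", "b", "b", "a"]
model = TwoLevelClassifier(ThresholdRoot(), MajorityClassifier).fit(X, y)
print(model.predict_one((5,)), model.predict_one((-3,)))
```

Passing a factory (`make_leaf`) rather than a single instance keeps the leaves independent, so each region gets its own freshly trained classifier.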