Classification and association rule mining are used to take decisions based on relationships between attributes and help decision makers to take correct decisions at right time. Associative classification first genera...Classification and association rule mining are used to take decisions based on relationships between attributes and help decision makers to take correct decisions at right time. Associative classification first generates class based association rules and use that generate rule set which is used to predict the class label for unseen data. The large data sets may have many null-transac- tions. A null-transaction is a transaction that does not contain any of the itemsets being examined. It is important to consider the null invariance property when selecting appropriate interesting measures in the correlation analysis. Real time data set has mixed attributes. Analyze the mixed attribute data set is not easy. Hence, the proposed work uses cosine measure to avoid the influence of null transactions during rule generation. It employs mixed-kernel probability density function (PDF) to handle continuous attributes during data analysis. It has ably to handle both nominal and continuous attributes and generates mixed attribute rule set. To explore the search space efficiently it applies Ant Colony Optimization (ACO). The public data sets are used to analyze the performance of the algorithm. The results illustrate that the support-confidence framework with a correlation measure generates more accurate simple rule set and discover more interesting rules.展开更多
Association rules’learning is a machine learning method used in finding underlying associations in large datasets.Whether intentionally or unintentionally present,noise in training instances causes overfitting while ...Association rules’learning is a machine learning method used in finding underlying associations in large datasets.Whether intentionally or unintentionally present,noise in training instances causes overfitting while building the classifier and negatively impacts classification accuracy.This paper uses instance reduction techniques for the datasets before mining the association rules and building the classifier.Instance reduction techniques were originally developed to reduce memory requirements in instance-based learning.This paper utilizes them to remove noise from the dataset before training the association rules classifier.Extensive experiments were conducted to assess the accuracy of association rules with different instance reduction techniques,namely:DecrementalReduction Optimization Procedure(DROP)3,DROP5,ALL K-Nearest Neighbors(ALLKNN),Edited Nearest Neighbor(ENN),and Repeated Edited Nearest Neighbor(RENN)in different noise ratios.Experiments show that instance reduction techniques substantially improved the average classification accuracy on three different noise levels:0%,5%,and 10%.The RENN algorithm achieved the highest levels of accuracy with a significant improvement on seven out of eight used datasets from the University of California Irvine(UCI)machine learning repository.The improvements were more apparent in the 5%and the 10%noise cases.When RENN was applied,the average classification accuracy for the eight datasets in the zero-noise test enhanced from 70.47%to 76.65%compared to the original test.The average accuracy was improved from 66.08%to 77.47%for the 5%-noise case and from 59.89%to 77.59%in the 10%-noise case.Higher confidence was also reported in building the association rules when RENN was used.The above results indicate that RENN is a good solution in removing noise and avoiding overfitting during the construction of the association rules classifier,especially in noisy domains.展开更多
Associative classification has attracted remarkable research attention for business analytics in recent years due to its merits in accuracy and understandability.It is deemed meaningful to construct an associative cla...Associative classification has attracted remarkable research attention for business analytics in recent years due to its merits in accuracy and understandability.It is deemed meaningful to construct an associative classifier with a compact set of rules(i.e.,compactness),which is easy to understand and use in decision making.This paper presents a novel approach to fuzzy associative classification(namely Gain-based Fuzzy Rule-Covering classification,GFRC),which is a fuzzy extension of an effective classifier GARC.In GFRC,two desirable strategies are introduced to enhance the compactness with accuracy.One strategy is fuzzy partitioning for data discretization to cope with the‘sharp boundary problem’,in that simulated annealing is incorporated based on the information entropy measure;the other strategy is a data-redundancy resolution coupled with the rulecovering treatment.Data experiments show that GFRC had good accuracy,and was significantly advantageous over other classifiers in compactness.Moreover,GFRC is applied to a real-world case for predicting the growth of sellers in an electronic marketplace,illustrating the classification effectiveness with linguistic rules in business decision support.展开更多
In this paper, we introduce polygene-based evolution, a novel framework for evolutionary algorithms (EAs) that features distinctive operations in the evolutionary process. In traditional EAs, the primitive evolution...In this paper, we introduce polygene-based evolution, a novel framework for evolutionary algorithms (EAs) that features distinctive operations in the evolutionary process. In traditional EAs, the primitive evolution unit is a gene, wherein genes are independent components during evolution. In polygene-based evolutionary algorithms (PGEAs), the evolution unit is a polygene, i.e., a set of co-regulated genes. Discovering and maintaining quality polygenes can play an effective role in evolving quality individuals. Polygenes generalize genes, and PGEAs generalize EAs. Implementing the PGEA framework involves three phases: (Ⅰ) polygene discovery, (Ⅱ) polygene planting, and (Ⅲ) polygene-compatible evolution. For Phase I, we adopt an associative classificationbased approach to discover quality polygenes. For Phase Ⅱ, we perform probabilistic planting to maintain the diversity of individuals. For Phase Ⅲ, we incorporate polygenecompatible crossover and mutation in producing the next generation of individuals. Extensive experiments on function optimization benchmarks in comparison with the conventional and state-of-the-art EAs demonstrate the potential of the approach in terms of accuracy and efficiency improvement.展开更多
文摘Classification and association rule mining are used to take decisions based on relationships between attributes and help decision makers to take correct decisions at right time. Associative classification first generates class based association rules and use that generate rule set which is used to predict the class label for unseen data. The large data sets may have many null-transac- tions. A null-transaction is a transaction that does not contain any of the itemsets being examined. It is important to consider the null invariance property when selecting appropriate interesting measures in the correlation analysis. Real time data set has mixed attributes. Analyze the mixed attribute data set is not easy. Hence, the proposed work uses cosine measure to avoid the influence of null transactions during rule generation. It employs mixed-kernel probability density function (PDF) to handle continuous attributes during data analysis. It has ably to handle both nominal and continuous attributes and generates mixed attribute rule set. To explore the search space efficiently it applies Ant Colony Optimization (ACO). The public data sets are used to analyze the performance of the algorithm. The results illustrate that the support-confidence framework with a correlation measure generates more accurate simple rule set and discover more interesting rules.
基金The APC was funded by the Deanship of Scientific Research,Saudi Electronic University.
文摘Association rules’learning is a machine learning method used in finding underlying associations in large datasets.Whether intentionally or unintentionally present,noise in training instances causes overfitting while building the classifier and negatively impacts classification accuracy.This paper uses instance reduction techniques for the datasets before mining the association rules and building the classifier.Instance reduction techniques were originally developed to reduce memory requirements in instance-based learning.This paper utilizes them to remove noise from the dataset before training the association rules classifier.Extensive experiments were conducted to assess the accuracy of association rules with different instance reduction techniques,namely:DecrementalReduction Optimization Procedure(DROP)3,DROP5,ALL K-Nearest Neighbors(ALLKNN),Edited Nearest Neighbor(ENN),and Repeated Edited Nearest Neighbor(RENN)in different noise ratios.Experiments show that instance reduction techniques substantially improved the average classification accuracy on three different noise levels:0%,5%,and 10%.The RENN algorithm achieved the highest levels of accuracy with a significant improvement on seven out of eight used datasets from the University of California Irvine(UCI)machine learning repository.The improvements were more apparent in the 5%and the 10%noise cases.When RENN was applied,the average classification accuracy for the eight datasets in the zero-noise test enhanced from 70.47%to 76.65%compared to the original test.The average accuracy was improved from 66.08%to 77.47%for the 5%-noise case and from 59.89%to 77.59%in the 10%-noise case.Higher confidence was also reported in building the association rules when RENN was used.The above results indicate that RENN is a good solution in removing noise and avoiding overfitting during the construction of the association rules classifier,especially in noisy domains.
基金the MOE Project of Key Research Institute of Humanities and Social Sciences at Universities(12JJD630001)the National Natural Science Foundation of China(71372044/71110107027)Tsinghua University Initiative Scientific Research Program(20101081741).
文摘Associative classification has attracted remarkable research attention for business analytics in recent years due to its merits in accuracy and understandability.It is deemed meaningful to construct an associative classifier with a compact set of rules(i.e.,compactness),which is easy to understand and use in decision making.This paper presents a novel approach to fuzzy associative classification(namely Gain-based Fuzzy Rule-Covering classification,GFRC),which is a fuzzy extension of an effective classifier GARC.In GFRC,two desirable strategies are introduced to enhance the compactness with accuracy.One strategy is fuzzy partitioning for data discretization to cope with the‘sharp boundary problem’,in that simulated annealing is incorporated based on the information entropy measure;the other strategy is a data-redundancy resolution coupled with the rulecovering treatment.Data experiments show that GFRC had good accuracy,and was significantly advantageous over other classifiers in compactness.Moreover,GFRC is applied to a real-world case for predicting the growth of sellers in an electronic marketplace,illustrating the classification effectiveness with linguistic rules in business decision support.
基金The authors would like to thank Prof. Xin Yao for discussions and advice on this manuscript. This research was supported in part by the NSFC Joint Fund with Guangdong of China under Key Project (U 1201258), the National Natural Science Foundation of China (Grant Nos. 71402083, 61573219, 61502258) and the National Science Foundation of Shandong Province (ZR2014FQ007).
文摘In this paper, we introduce polygene-based evolution, a novel framework for evolutionary algorithms (EAs) that features distinctive operations in the evolutionary process. In traditional EAs, the primitive evolution unit is a gene, wherein genes are independent components during evolution. In polygene-based evolutionary algorithms (PGEAs), the evolution unit is a polygene, i.e., a set of co-regulated genes. Discovering and maintaining quality polygenes can play an effective role in evolving quality individuals. Polygenes generalize genes, and PGEAs generalize EAs. Implementing the PGEA framework involves three phases: (Ⅰ) polygene discovery, (Ⅱ) polygene planting, and (Ⅲ) polygene-compatible evolution. For Phase I, we adopt an associative classificationbased approach to discover quality polygenes. For Phase Ⅱ, we perform probabilistic planting to maintain the diversity of individuals. For Phase Ⅲ, we incorporate polygenecompatible crossover and mutation in producing the next generation of individuals. Extensive experiments on function optimization benchmarks in comparison with the conventional and state-of-the-art EAs demonstrate the potential of the approach in terms of accuracy and efficiency improvement.