Funding: Deputyship for Research & Innovation, Ministry of Education in Saudi Arabia, Project Number RI-44-0444.
Abstract: Data mining plays a crucial role in extracting meaningful knowledge from large-scale data repositories, such as data warehouses and databases. Association rule mining, a fundamental process in data mining, involves discovering correlations, patterns, and causal structures within datasets. In the healthcare domain, association rules offer valuable opportunities for building knowledge bases, enabling intelligent diagnoses, and extracting information rapidly. This paper presents a novel approach called the Machine Learning based Association Rule Mining and Classification for Healthcare Data Management System (MLARMC-HDMS). The MLARMC-HDMS technique integrates classification and association rule mining (ARM) processes. Initially, the chimp optimization algorithm-based feature selection (COAFS) technique is employed within MLARMC-HDMS to select relevant attributes. Inspired by the foraging behavior of chimpanzees, the chimp optimization algorithm (COA) mimics their search strategy for food. Subsequently, the classification process uses a multilayer perceptron trained with stochastic gradient descent (SGD-MLP), while the Apriori algorithm determines attribute relationships. We propose a COA-based feature selection approach for medical data classification using machine learning techniques. This approach selects pertinent features from medical datasets through COA and trains machine learning models on the reduced feature set. We evaluate the performance of our approach on various medical datasets with diverse machine learning classifiers. Experimental results demonstrate that the proposed approach surpasses alternative feature selection methods, achieving higher accuracy and precision in medical data classification tasks. The study showcases the effectiveness and efficiency of the COA-based feature selection approach in identifying relevant features, thereby supporting the diagnosis and treatment of various diseases. To provide further validation, we conduct detailed experiments on a benchmark medical dataset, which show the superiority of the MLARMC-HDMS model over other methods, with a maximum accuracy of 99.75%. This research therefore contributes to the advancement of feature selection techniques in medical data classification and highlights the potential for improving healthcare outcomes through accurate and efficient data analysis. The presented MLARMC-HDMS framework and COA-based feature selection approach offer valuable insights for researchers and practitioners working in healthcare data mining and machine learning.
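The Apriori step named in the abstract above rests on two quantities, support and confidence, plus a level-wise search over itemsets. The following is a minimal sketch of that idea only; the toy patient records, the item names, the 0.6 threshold, and the helper functions are all hypothetical and are not taken from the paper.

```python
from itertools import combinations

# Hypothetical toy transactions: each set stands for one patient record.
TRANSACTIONS = [
    {"fever", "cough", "flu"},
    {"fever", "cough", "flu"},
    {"fever", "headache"},
    {"cough", "flu"},
    {"fever", "cough"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def frequent_itemsets(transactions, min_support):
    """Level-wise (Apriori-style) search for itemsets with enough support."""
    items = sorted({i for t in transactions for i in t})
    level = [frozenset([i]) for i in items
             if support([i], transactions) >= min_support]
    result = {}
    k = 1
    while level:
        for s in level:
            result[s] = support(s, transactions)
        k += 1
        survivors = sorted({i for s in level for i in s})
        candidates = [frozenset(c) for c in combinations(survivors, k)]
        # Apriori pruning: every (k-1)-subset of a candidate must be frequent.
        level = [c for c in candidates
                 if all(frozenset(sub) in result
                        for sub in combinations(c, k - 1))
                 and support(c, transactions) >= min_support]
    return result

def confidence(antecedent, consequent, transactions):
    """Estimate of P(consequent | antecedent) from the transactions."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)

freq = frequent_itemsets(TRANSACTIONS, min_support=0.6)
rule_conf = confidence({"cough"}, {"flu"}, TRANSACTIONS)
```

In this toy data the pair {fever, flu} falls below the threshold, so the three-item candidate is pruned without its support ever being counted, which is the saving Apriori is designed to provide.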
Funding: Defense Advanced Research Project "The Techniques of Information Integrated Processing and Fusion" in the Eleventh Five-Year Plan (513060302).
Abstract: Joint probabilistic data association (JPDA) is an effective method for tracking multiple targets in clutter, but it uses only the target kinematic information in measurement-to-track association. If the kinematic likelihoods of closely spaced targets are similar, the kinematic information alone is ambiguous: in the conventional JPDA algorithm the probability of correct association decreases and track coalescence occurs easily. A modified, classification-aided JPDA algorithm is presented, which avoids track coalescence when tracking multiple neighboring targets. First, an identification matrix is defined and used to simplify the validation matrix, which reduces computational complexity. Then, target class information is integrated into the data association process. Performance with and without class information in JPDA is compared on a problem of tracking multiple closely spaced maneuvering targets. Simulation results quantify the benefits of classification-aided JPDA for multiple-target tracking, especially in the presence of association uncertainty in the kinematic measurements and target maneuvering, and indicate that the algorithm is valid.
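To see why class information helps when kinematic likelihoods are nearly equal, consider a single measurement and two candidate tracks. The sketch below is not the paper's algorithm and is not full JPDA (it ignores clutter, gating, and joint-event enumeration); it only multiplies a kinematic likelihood by a class likelihood, assuming the two are independent, and normalizes. The function name and all numbers are hypothetical.

```python
def association_weights(kin_lik, class_lik=None):
    """Normalized per-track association weights for one measurement.

    kin_lik[i]   : kinematic likelihood of the measurement under track i
    class_lik[i] : likelihood of the measured class under track i's class
                   (optional; assumed independent of the kinematics)
    """
    if class_lik is None:
        joint = list(kin_lik)
    else:
        joint = [k * c for k, c in zip(kin_lik, class_lik)]
    total = sum(joint)
    return [j / total for j in joint]

# Two closely spaced tracks: kinematics alone barely separates them.
kinematic = [0.52, 0.48]
# Hypothetical classifier output for the same measurement.
class_likelihood = [0.9, 0.1]

without_class = association_weights(kinematic)
with_class = association_weights(kinematic, class_likelihood)
```

With these made-up values the weight on the first track rises from about 0.52 to about 0.91, which illustrates how a class term can reduce the ambiguity that leads to track coalescence.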
Abstract: The paper discusses the concept of mineral resources associated with coal measures. A rational and scientific classification of such mineral resources becomes more necessary as science and technology develop. A classification scheme is proposed based on the composition, physical properties, and utilization of these associated minerals.
Funding: Supported by the National High Technology Research and Development Program of China (No. 2007AA01Z132), the National Natural Science Foundation of China (Nos. 60775035, 60933004, 60970088, 60903141), the National Basic Research Priorities Programme (No. 2007CB311004), and the National Science and Technology Support Plan (No. 2006BAC08B06).
Abstract: In this paper, we propose an enhanced associative classification method that integrates the dynamic property into the process of associative classification. In the proposed method, we employ a support vector machine (SVM) based method to refine the discovered emerging frequent patterns, which are used to extend the classification rules for class label prediction. The empirical study shows that our method can classify a growing set of resources efficiently and effectively.
Funding: The Youth Science and Technology Foundation of the University of Electronic Science and Technology of China (JX0622).
Abstract: In most passive tracking systems, only the target kinematic information is used in measurement-to-track association, which leads to tracking errors in a multitarget environment where the targets are too close to each other. To enhance tracking accuracy, the target signal classification information (TSCI) should be used to improve the data association. The TSCI is integrated into the data association process using joint probabilistic data association (JPDA). Using the TSCI in data association can improve discrimination by yielding purer tracks and preserving track continuity. To verify the validity of applying the TSCI, two simulation experiments are performed on an air target tracking problem, one using the TSCI and the other not. The final comparison shows that using the TSCI can effectively improve tracking accuracy.
Abstract: The classification of central nervous system (CNS) gliomas went through a sequence of developments between 2006 and 2021: it started with a purely histological approach and was later aided by a major emphasis on molecular signatures in the 4th and 5th editions of the World Health Organization (WHO) classification. The recent revision in the 5th edition of the WHO classification focuses more on molecularly defined entities with better characterized natural histories, as well as on new tumor types and subtypes in the adult and pediatric populations. These newly subclassified entities were incorporated into the 5th edition after continuous exploration of new genomic, epigenomic, and transcriptomic discoveries. Indeed, the current guidelines of the 2021 WHO classification of CNS tumors and of the European Association of Neuro-Oncology (EANO) use molecular signatures in the diagnostic approach to CNS gliomas. Our review presents a practical diagnostic approach for diffuse CNS gliomas and circumscribed astrocytomas using the histomolecular criteria adopted by the recent WHO classification. We also describe the treatment strategies for these tumors based on the EANO guidelines.
Abstract: We propose two models in this paper. The concept association model is put forward to obtain the co-occurrence relationships among keywords in documents, and the hierarchical Hamming clustering model is used to reduce the dimensionality of the category feature vector space, which addresses the extremely high dimensionality of the documents' feature space. The experimental results indicate that the concept association model obtains the co-occurrence relations among keywords in the documents, which effectively improves the recall of the classification system. The hierarchical Hamming clustering model reduces the dimensionality of the category feature vector efficiently: the size of the vector space is only about 10% of the original dimensionality. Key words: text classification; concept association; hierarchical clustering; Hamming clustering. CLC number: TN 915.08. Foundation item: Supported by the National 863 Project of China (2001AA142160, 2002AA145090). Biography: Su Gui-yang (1974-), male, Ph.D. candidate; research direction: information filtering and text classification.
Abstract: The amount of data available for decision making has increased tremendously in the age of the digital economy. Decision makers who fail to handle the data proficiently may make incorrect decisions and therefore harm their business. Thus, the task of extracting and classifying useful information efficiently and effectively from huge amounts of computational data is of special importance. In this paper, we consider that the attributes of the data can be both crisp and fuzzy. By examining suitable subsets of the data, segments with different classes are formed; a multithreaded computation is then performed to generate crisp rules (if possible); finally, the fuzzy partition technique is employed to handle the fuzzy attributes for classification. The rules generated in classifying the overall data can be used to gain more knowledge from the data collected.
Abstract: Classification and association rule mining are used to make decisions based on relationships between attributes and help decision makers make correct decisions at the right time. Associative classification first generates class-based association rules and then uses the generated rule set to predict the class label of unseen data. Large data sets may contain many null transactions. A null transaction is a transaction that does not contain any of the itemsets being examined. It is important to consider the null-invariance property when selecting appropriate interestingness measures for correlation analysis. Real-world data sets have mixed attributes, and analyzing a mixed-attribute data set is not easy. Hence, the proposed work uses the cosine measure to avoid the influence of null transactions during rule generation. It employs a mixed-kernel probability density function (PDF) to handle continuous attributes during data analysis. It is able to handle both nominal and continuous attributes and generates a mixed-attribute rule set. To explore the search space efficiently, it applies Ant Colony Optimization (ACO). Public data sets are used to analyze the performance of the algorithm. The results illustrate that the support-confidence framework combined with a correlation measure generates a more accurate, simpler rule set and discovers more interesting rules.
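The null-invariance property mentioned in the abstract above can be checked with a few lines of arithmetic. The sketch below compares the cosine measure, cosine(A, B) = n(AB) / sqrt(n(A) n(B)), with lift, which depends on the total number of transactions. The counts are made up for illustration, and the function names are hypothetical.

```python
import math

def cosine(n_ab, n_a, n_b):
    """Cosine measure; it does not depend on the total transaction count."""
    return n_ab / math.sqrt(n_a * n_b)

def lift(n_ab, n_a, n_b, n_total):
    """Lift = P(A and B) / (P(A) * P(B)); it depends on n_total."""
    return (n_ab / n_total) / ((n_a / n_total) * (n_b / n_total))

# Hypothetical counts: A and B each occur 1000 times, together 100 times.
n_ab, n_a, n_b = 100, 1000, 1000

# Same itemset counts, but the second data set has many more null
# transactions (transactions containing neither A nor B).
few_nulls, many_nulls = 2_000, 200_000

cos_few = cosine(n_ab, n_a, n_b)
cos_many = cosine(n_ab, n_a, n_b)
lift_few = lift(n_ab, n_a, n_b, few_nulls)
lift_many = lift(n_ab, n_a, n_b, many_nulls)
```

Here lift moves from below 1 to well above 1 purely because null transactions were added, while the cosine value is unchanged, which is the reason a null-invariant measure is preferred when null transactions are common.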
Funding: The National Natural Science Foundation of China (No. 60473045), the Technology Research Project of Hebei Province (No. 05213573), and the Research Plan of the Education Office of Hebei Province (No. 2004406).
Abstract: The conventional fuzzy class-association method scans the classifier repeatedly to classify new texts, which is inefficient. To address this problem, a new approach to text categorization based on the FCR-tree (fuzzy classification rules tree) is proposed. The compactness of the FCR-tree saves significant space when storing a large set of rules in which many words are repeated. In contrast to classification rules, fuzzy classification rules contain not only words but also the fuzzy sets corresponding to the frequencies of words appearing in texts. Therefore, the construction and structure of an FCR-tree differ from those of a CR-tree. To reduce the difficulty of FCR-tree construction and rule retrieval, several k-FCR-trees are built. When classifying a new text, it is not necessary to search the paths of the sub-trees led by words that do not appear in the text, which reduces the number of rules traversed. Experimental results show that the proposed approach clearly outperforms the conventional method in efficiency.
Abstract: Background: Acute coronary syndrome (ACS) is one of the most common forms of heart disease. Recent studies have revealed that interleukin (IL)-8 plays a key role in the development of atherosclerotic plaque and its complications, but the relationship of its common variants with ACS has not been extensively studied. Methods: We tested the hypothesis that the IL-8 -251 A/T variant is associated with susceptibility to ACS and its recurrence in a Chinese case-control study comprising 675 patients with ACS and 636 control subjects, and replicated the investigation in an independent study comprising 360 cases and 360 control subjects. The plasma concentration of IL-8 was measured by enzyme-linked immunosorbent assay (ELISA). Results: The IL-8 -251 A/T polymorphism was associated with increased susceptibility to ACS (P = 0.004; OR = 1.30, CI: 1.12-1.53). Replication in the second study yielded similar results. IL-8 -251 A/T may affect the expression of IL-8, as suggested by the augmented IL-8 production found by ELISA in the serum of the AMI patients. Conclusions: The IL-8 -251 A/T polymorphism is associated with ACS risk in the Chinese Han population, and the A allele of IL-8 -251 A/T may be an independent predictive factor.
Abstract: In the presence of background noise, arrival times picked from a surface microseismic data set usually include a number of false picks that can lead to uncertainty in location estimation. To eliminate false picks and improve the accuracy of location estimates, we develop an association algorithm termed RANSAC-based Arrival Time Event Clustering (RATEC) that clusters picked arrival times into event groups based on random sampling and fitting moveout curves that approximate hyperbolas. Arrival times far from the fitted hyperbolas are classified as false picks and removed from the data set prior to location estimation. Simulations of synthetic data for a 1-D linear array show that RATEC is robust under different noise conditions and generally applicable to various types of subsurface structures. By generalizing the underlying moveout model, RATEC is extended to the case of a 2-D surface monitoring array. The effectiveness of event location for the 2-D case is demonstrated using a data set collected by the 5200-element dense Long Beach array. The obtained results suggest that RATEC is effective in removing false picks and hence can be used for phase association before location estimation.
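The random-sample-and-fit idea behind the abstract above can be illustrated with a much simpler setting than the one in the paper. The sketch below is not RATEC: it assumes a single event, a 1-D array, and a hyperbola whose apex lies at offset zero, so that t^2 = t0^2 + x^2 / v^2 is a straight line in (x^2, t^2) and two picks determine a candidate model. The function names, tolerance, iteration count, and synthetic picks are all hypothetical.

```python
import math
import random

def fit_line(p, q):
    """Intercept and slope of the line through two (u, s) points."""
    (u1, s1), (u2, s2) = p, q
    slope = (s2 - s1) / (u2 - u1)
    return s1 - slope * u1, slope

def ransac_inliers(offsets, times, tol, n_iter=200, seed=0):
    """Indices of picks consistent with the best hyperbola found.

    Assumes the apex is at offset 0, so t^2 = t0^2 + x^2 / v^2.
    """
    rng = random.Random(seed)
    pts = [(x * x, t * t) for x, t in zip(offsets, times)]
    best = []
    for _ in range(n_iter):
        i, j = rng.sample(range(len(pts)), 2)
        if pts[i][0] == pts[j][0]:
            continue  # same squared offset: the line is undefined
        t0_sq, inv_v_sq = fit_line(pts[i], pts[j])
        if t0_sq <= 0 or inv_v_sq <= 0:
            continue  # non-physical model
        inliers = [k for k, (x, t) in enumerate(zip(offsets, times))
                   if abs(math.sqrt(t0_sq + inv_v_sq * x * x) - t) <= tol]
        if len(inliers) > len(best):
            best = inliers
    return best

# Synthetic picks: t0 = 1.0 s, v = 2.0 km/s, receivers at 0..9 km.
offsets = [float(x) for x in range(10)]
times = [math.sqrt(1.0 + x * x / 4.0) for x in offsets]
times[3] += 0.8  # false pick
times[7] -= 0.9  # false pick

kept = ransac_inliers(offsets, times, tol=0.05)
```

Picks outside the returned index list would be treated as false picks and dropped before location estimation; handling several events or an unknown apex would need the more general moveout model the abstract refers to.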
Funding: The APC was funded by the Deanship of Scientific Research, Saudi Electronic University.
Abstract: Association rule learning is a machine learning method used to find underlying associations in large datasets. Whether present intentionally or unintentionally, noise in training instances causes overfitting while building the classifier and negatively impacts classification accuracy. This paper applies instance reduction techniques to the datasets before mining the association rules and building the classifier. Instance reduction techniques were originally developed to reduce memory requirements in instance-based learning. This paper uses them to remove noise from the dataset before training the association rules classifier. Extensive experiments were conducted to assess the accuracy of association rules with different instance reduction techniques, namely Decremental Reduction Optimization Procedure (DROP) 3, DROP5, ALL K-Nearest Neighbors (ALLKNN), Edited Nearest Neighbor (ENN), and Repeated Edited Nearest Neighbor (RENN), at different noise ratios. Experiments show that instance reduction techniques substantially improved the average classification accuracy at three noise levels: 0%, 5%, and 10%. The RENN algorithm achieved the highest accuracy, with a significant improvement on seven of the eight datasets used from the University of California Irvine (UCI) machine learning repository. The improvements were more apparent in the 5% and 10% noise cases. When RENN was applied, the average classification accuracy for the eight datasets in the zero-noise test increased from 70.47% to 76.65% compared to the original test. The average accuracy improved from 66.08% to 77.47% in the 5%-noise case and from 59.89% to 77.59% in the 10%-noise case. Higher confidence was also reported in building the association rules when RENN was used. These results indicate that RENN is a good solution for removing noise and avoiding overfitting during the construction of the association rules classifier, especially in noisy domains.
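The ENN and RENN filters named in the abstract above are simple to state: ENN drops every instance whose label disagrees with the majority label of its k nearest neighbors, and RENN repeats ENN until nothing more is removed. The sketch below shows only that filtering step on a made-up one-dimensional data set; it is not the paper's experimental setup, and the function names, k = 3, and data are hypothetical.

```python
from collections import Counter

def enn_pass(points, labels, k=3):
    """One ENN pass: indices of instances whose label matches the
    majority label of their k nearest neighbors (excluding themselves)."""
    keep = []
    for i, (p, y) in enumerate(zip(points, labels)):
        others = [j for j in range(len(points)) if j != i]
        others.sort(key=lambda j: sum((a - b) ** 2
                                      for a, b in zip(p, points[j])))
        votes = Counter(labels[j] for j in others[:k])
        # With no neighbors left there is nothing to disagree with.
        if not votes or votes.most_common(1)[0][0] == y:
            keep.append(i)
    return keep

def renn(points, labels, k=3):
    """Repeat ENN until a pass removes nothing; returns original indices."""
    idx = list(range(len(points)))
    while True:
        kept = enn_pass([points[i] for i in idx],
                        [labels[i] for i in idx], k)
        if len(kept) == len(idx):
            return idx
        idx = [idx[j] for j in kept]

# Two well-separated clusters plus one mislabeled instance (index 10).
points = [(0.0,), (0.1,), (0.2,), (0.3,), (0.4,),
          (5.0,), (5.1,), (5.2,), (5.3,), (5.4,),
          (0.25,)]
labels = ["a"] * 5 + ["b"] * 5 + ["b"]

clean_idx = renn(points, labels)
```

In this toy case the mislabeled instance at index 10 is removed in the first pass and the second pass removes nothing, so the filter stops; the cleaned subset would then be passed to the rule miner.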