Abstract: Association rule learning is a machine learning method for finding underlying associations in large datasets. Whether introduced intentionally or unintentionally, noise in training instances causes overfitting while building the classifier and negatively impacts classification accuracy. This paper applies instance reduction techniques to the datasets before mining the association rules and building the classifier. Instance reduction techniques were originally developed to reduce memory requirements in instance-based learning. This paper utilizes them to remove noise from the dataset before training the association rules classifier. Extensive experiments were conducted to assess the accuracy of association rules with different instance reduction techniques, namely Decremental Reduction Optimization Procedure (DROP) 3, DROP5, ALL K-Nearest Neighbors (ALLKNN), Edited Nearest Neighbor (ENN), and Repeated Edited Nearest Neighbor (RENN), at different noise ratios. Experiments show that instance reduction techniques substantially improved the average classification accuracy at three noise levels: 0%, 5%, and 10%. The RENN algorithm achieved the highest accuracy, with a significant improvement on seven of the eight datasets used from the University of California Irvine (UCI) machine learning repository. The improvements were more apparent in the 5% and 10% noise cases. When RENN was applied, the average classification accuracy for the eight datasets in the zero-noise test improved from 70.47% to 76.65% compared to the original test. The average accuracy improved from 66.08% to 77.47% in the 5%-noise case and from 59.89% to 77.59% in the 10%-noise case. Higher confidence was also reported in building the association rules when RENN was used. These results indicate that RENN is a good solution for removing noise and avoiding overfitting during the construction of the association rules classifier, especially in noisy domains.
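The noise-filtering step named in this abstract can be illustrated with a minimal sketch of Edited Nearest Neighbor and its repeated variant. This is not the authors' code: the function names `enn` and `renn`, the Euclidean distance, `k=3`, and the toy points are all assumptions made for illustration, and the association-rule mining that would follow is omitted.

```python
from collections import Counter
import math

def enn(points, labels, k=3):
    """One ENN pass: return indices whose label matches the majority of their k nearest neighbours."""
    keep = []
    for i, p in enumerate(points):
        others = sorted((j for j in range(len(points)) if j != i),
                        key=lambda j: math.dist(p, points[j]))
        if not others:                      # a lone point has no neighbours to disagree with
            keep.append(i)
            continue
        votes = Counter(labels[j] for j in others[:k])
        if votes.most_common(1)[0][0] == labels[i]:
            keep.append(i)
    return keep

def renn(points, labels, k=3):
    """Repeat ENN until a pass removes nothing; return the filtered points and labels."""
    pts, lbs = list(points), list(labels)
    while True:
        keep = enn(pts, lbs, k)
        if len(keep) == len(pts):
            return pts, lbs
        pts = [pts[i] for i in keep]
        lbs = [lbs[i] for i in keep]

# Hypothetical toy data: two clusters plus one mislabelled point inside cluster "a".
points = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5), (5, 6), (6, 5), (6, 6), (0.5, 0.5)]
labels = ["a", "a", "a", "a", "b", "b", "b", "b", "b"]
clean_points, clean_labels = renn(points, labels)
```

In this toy run only the mislabelled point at (0.5, 0.5) is dropped, because its three nearest neighbours all carry the other label.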
Abstract: In enterprise operations, maintaining manual rules for enterprise processes can be expensive, time-consuming, and dependent on specialized domain knowledge. Recently, rule generation has been automated in enterprises, particularly through Machine Learning, to streamline routine tasks. Typically, these machine learning models are black boxes in which the reasons for decisions are not always transparent, and end users need to verify the model proposals as part of user acceptance testing before they can trust them. In such scenarios, rules excel over Machine Learning models because end users can verify the rules and therefore trust them more. In many scenarios, the truth label changes frequently; it is then difficult for a Machine Learning model to learn until a considerable amount of data has accumulated, whereas rules can be adapted to the new truth. This paper presents a novel framework for generating human-understandable rules using the Classification and Regression Tree (CART) decision tree method, which ensures both optimization and user trust in automated decision-making processes. The framework generates comprehensible rules of the form "if condition, then predicted class", even in domains where noise is present. The proposed system transforms enterprise operations by automating the production of human-readable rules from structured data, resulting in increased efficiency and transparency. Removing the need for human rule construction saves time and money while guaranteeing that users can readily check and trust the automatic judgments of the system. The performance metrics of the framework, 99.85% accuracy and 96.30% precision, further support its efficiency in translating complex data into comprehensible rules, eventually empowering users and enhancing organizational decision-making processes.
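The abstract does not describe the framework's implementation, so the following is only a minimal sketch of the general idea it names: grow a CART-style tree with Gini impurity and read each root-to-leaf path as an if-then rule. The function names, the `max_depth` limit, and the toy invoice data are hypothetical.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    """Return the (feature_index, threshold) that lowers weighted Gini the most, or None."""
    best, best_score = None, gini(labels)
    n = len(rows)
    for f in range(len(rows[0])):
        values = sorted({r[f] for r in rows})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2               # candidate threshold: midpoint of adjacent values
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score:
                best, best_score = (f, t), score
    return best

def extract_rules(rows, labels, names, conditions=(), depth=0, max_depth=3):
    """Grow the tree recursively and return one if-then rule per leaf."""
    split = best_split(rows, labels) if depth < max_depth else None
    if split is None:                       # leaf: predict the majority class
        majority = Counter(labels).most_common(1)[0][0]
        cond = " and ".join(conditions) or "True"
        return [f"if {cond} then class = {majority}"]
    f, t = split
    found = []
    for op, keep in (("<=", lambda v: v <= t), (">", lambda v: v > t)):
        idx = [i for i, r in enumerate(rows) if keep(r[f])]
        found += extract_rules([rows[i] for i in idx], [labels[i] for i in idx], names,
                               conditions + (f"{names[f]} {op} {t:g}",), depth + 1, max_depth)
    return found

# Hypothetical structured data: (amount, days_overdue) -> decision
rows = [(100, 1), (200, 2), (150, 30), (900, 3), (950, 40), (800, 35)]
labels = ["approve", "approve", "review", "approve", "review", "review"]
rules = extract_rules(rows, labels, ["amount", "days_overdue"])
for rule in rules:
    print(rule)
```

On this toy data a single split on `days_overdue` separates the classes, so the sketch emits two rules that an end user can read and check directly.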
Funding: The National Natural Science Foundation of China (No. 60473045); the Technology Research Project of Hebei Province (No. 05213573); the Research Plan of Education Office of Hebei Province (No. 2004406).
Abstract: The conventional fuzzy class-association method repeatedly scans the classifier to classify new texts, which is inefficient. To address this problem, a new approach based on the FCR-tree (fuzzy classification rules tree) is proposed for text categorization. The compactness of the FCR-tree saves significant space when storing a large set of rules that share many repeated words. In contrast to classification rules, fuzzy classification rules contain not only words but also the fuzzy sets corresponding to the frequencies of words appearing in texts. Therefore, the construction and structure of an FCR-tree differ from those of a CR-tree. To reduce the difficulty of FCR-tree construction and rule retrieval, multiple k-FCR-trees are built. When classifying a new text, it is not necessary to search the paths of the sub-trees led by words that do not appear in the text, which reduces the number of rules traversed. Experimental results show that the proposed approach clearly outperforms the conventional method in efficiency.
Abstract: The amount of data available for decision making has increased tremendously in the age of the digital economy. Decision makers who fail to handle this data proficiently may make incorrect decisions and therefore harm their business. Thus, the task of extracting and classifying useful information efficiently and effectively from huge amounts of computational data is of special importance. In this paper, we consider data whose attributes may be both crisp and fuzzy. By examining suitable partial data, segments with different classes are formed; a multithreaded computation is then performed to generate crisp rules (if possible); and finally, the fuzzy partition technique is employed to handle the fuzzy attributes for classification. The rules generated in classifying the overall data can be used to gain more knowledge from the data collected.
Abstract: The hidden dimension of urban morphology is the underlying system of urban morphological rules. The number of these rules has increased, and their application tends to become more complex. Digital approaches based on urban morphosis are becoming widespread. However, achieving the target values for all the rules is difficult, which affects the social, environmental, and aesthetic objectives of these rules. This paper proposes a classification of urban morphological rules to assist the digital morphosis of urban form. The aim is to endow the system of rules with a hierarchy, which can make the automatic generation of urban forms that respect urban law more efficient. Thus, this work promotes the concerns of artificial intelligence in urban morphology.
Funding: National Ethnic Affairs Commission Nature Science Foundation of China (PMZY06004); the Education Science Foundation of Guangxi (2006A-E004).
Abstract: A distributed genetic algorithm can be combined with an adaptive genetic algorithm to mine interesting and comprehensible classification rules. The paper presents the method for encoding the rules and the fitness function, and designs the selection, crossover, mutation, and migration operators for the DAGA.
Funding: Project supported by the National Natural Science Foundation of China (Nos. 40101014 and 40001008).
Abstract: A machine-learning approach was developed for the automated building of knowledge bases for soil resources mapping, using a classification tree to generate knowledge from training data. With this method, building a knowledge base for automated soil mapping was easier than with the conventional knowledge acquisition approach. The knowledge base built by the classification tree was used by the knowledge classifier to perform soil type classification of Longyou County, Zhejiang Province, China, using Landsat TM bi-temporal images and GIS data. To evaluate the performance of the resultant knowledge bases, the classification results were compared to an existing soil map based on a field survey. The accuracy assessment and analysis of the resultant soil maps suggested that the knowledge bases built by the machine-learning method were of good quality for mapping the distribution model of soil classes over the study area.
Abstract: The choice of a fuzzy partitioning is crucial to the performance of a fuzzy system based on if-then rules. However, most existing methods are complicated or lead to too many subspaces, which makes them unfit for pattern classification applications. A simple but effective clustering approach is proposed in this paper, which obtains a set of compact subspaces and is applicable to classification problems with higher-dimensional features. Its effectiveness is demonstrated by the experimental results.
Abstract: This paper discusses the problem of classifying a multivariate Gaussian random field observation into one of several categories specified by different parametric mean models. The investigation concerns the classifier based on the plug-in Bayes classification rule (PBCR), formed by replacing the unknown parameters in the Bayes classification rule (BCR) with estimators of the category parameters. This extends previous work from the two-category case to the multi-category case. Novel closed-form expressions are derived for the Bayes classification probability and the actual correct classification rate associated with the PBCR. These correct classification rates are suggested as performance measures for the classification procedure. An empirical study has been carried out to analyze the dependence of the derived classification rates on the category parameters.
Funding: Supported by the Taif University Researchers Supporting Project Number (TURSP-2020/79), Taif University, Taif, Saudi Arabia.
Abstract: Pattern classification is a crucial area of research and applications. Classifying patterns with fuzzy set theory has attracted great interest because of its ability to understand the parameters. One of the problems observed in the fuzzification of an unknown pattern is that importance is given only to the known patterns and not to their features. In contrast, the features of the patterns play an essential role when their respective patterns overlap. In this paper, an optimal fuzzy nearest neighbor model is introduced in which a fuzzification process is carried out for the unknown pattern using k nearest neighbor. With the help of the fuzzification process, the membership matrix is formed. In this membership matrix, fuzzification is carried out on the features of the unknown pattern. Classification results are verified on a completely labelled Telugu vowel data set, and the accuracy is compared with different models and the fuzzy k nearest neighbor algorithm. The proposed model gives 84.86% accuracy on a 50% training data set and 89.35% accuracy on an 80% training data set. The proposed classifier learns well enough with a small amount of training data, resulting in an efficient and faster approach.
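For orientation, the baseline that this abstract compares against, the fuzzy k nearest neighbor algorithm, can be sketched as follows. This is the standard distance-weighted membership formula with crisp training labels, not the proposed feature-level model; the function name, the fuzzifier `m=2`, the small `eps` guard against zero distances, and the toy data are assumptions.

```python
import math

def fuzzy_knn(train, labels, x, k=3, m=2.0, eps=1e-9):
    """Return {class: membership} for x; the memberships sum to 1."""
    # k nearest training points by Euclidean distance
    nearest = sorted(zip(train, labels), key=lambda tl: math.dist(x, tl[0]))[:k]
    # weight of each neighbour: 1 / d^(2/(m-1)); eps avoids division by zero
    weights = [(1.0 / (math.dist(x, p) ** (2.0 / (m - 1.0)) + eps), y) for p, y in nearest]
    total = sum(w for w, _ in weights)
    memberships = {}
    for w, y in weights:
        memberships[y] = memberships.get(y, 0.0) + w / total
    return memberships

# Hypothetical toy data: the query point lies closest to the single "b" sample.
train = [(0, 0), (1, 0), (4, 0)]
train_labels = ["a", "a", "b"]
u = fuzzy_knn(train, train_labels, (3, 0))
```

Instead of a hard vote, the query receives a graded membership in each class, which is what makes overlapping patterns easier to inspect.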
Abstract: According to a World Health Organization report released in 2019, diabetes claimed the lives of approximately 1.5 million individuals globally in 2019, and around 450 million people are affected by diabetes worldwide. Hence it is inferred that diabetes is rampant across the world, with the majority of the world population being affected by it. Among diabetics, it can be observed that a large number of people failed to identify their disease in the initial stage, and hence the disease level moved from Type-1 to Type-2. To avoid this situation, we propose a new fuzzy logic based neural classifier for early detection of diabetes. A set of new neuro-fuzzy rules with time constraints is introduced and applied for the first-level classification. These levels are further refined using Fuzzy Cognitive Maps (FCM) with time intervals to make the final decision in the classification process. The main objective of the proposed model is to detect the diabetes level based on time. In addition, the set of neuro-fuzzy rules is used to select the values that contribute most to the decision-making process in diabetes prediction. The proposed model proved its efficiency in experiments conducted not only on data from the repository but also against the standard diabetic detection models that are available in the market.
Funding: Supported by the National High Technology Research and Development Program of China (No. 2007AA01Z132); the National Natural Science Foundation of China (Nos. 60775035, 60933004, 60970088, 60903141); the National Basic Research Priorities Programme (No. 2007CB311004); the National Science and Technology Support Plan (No. 2006BAC08B06).
Abstract: In this paper, we propose an enhanced associative classification method that integrates the dynamic property into the process of associative classification. In the proposed method, we employ a support vector machine (SVM) based method to refine the discovered emerging frequent patterns for classification rule extension for class label prediction. The empirical study shows that our method can be used to classify increasing resources efficiently and effectively.
基金Deputyship for Research&Innovation,Ministry of Education in Saudi Arabia for funding this research work through the Project Number RI-44-0444.
Abstract: Data mining plays a crucial role in extracting meaningful knowledge from large-scale data repositories, such as data warehouses and databases. Association rule mining, a fundamental process in data mining, involves discovering correlations, patterns, and causal structures within datasets. In the healthcare domain, association rules offer valuable opportunities for building knowledge bases, enabling intelligent diagnoses, and extracting information rapidly. This paper presents a novel approach called the Machine Learning based Association Rule Mining and Classification for Healthcare Data Management System (MLARMC-HDMS). The MLARMC-HDMS technique integrates classification and association rule mining (ARM) processes. Initially, the chimp optimization algorithm-based feature selection (COAFS) technique is employed within MLARMC-HDMS to select relevant attributes. Inspired by the foraging behavior of chimpanzees, the chimp optimization algorithm (COA) mimics their search strategy for food. Subsequently, the classification process utilizes a stochastic gradient descent with multilayer perceptron (SGD-MLP) model, while the Apriori algorithm determines attribute relationships. We propose a COA-based feature selection approach for medical data classification using machine learning techniques. This approach involves selecting pertinent features from medical datasets through COA and training machine learning models on the reduced feature set. We evaluate the performance of our approach on various medical datasets employing diverse machine learning classifiers. Experimental results demonstrate that the proposed approach surpasses alternative feature selection methods, achieving higher accuracy and precision rates in medical data classification tasks. The study showcases the effectiveness and efficiency of the COA-based feature selection approach in identifying relevant features, thereby enhancing the diagnosis and treatment of various diseases. To provide further validation, we conduct detailed experiments on a benchmark medical dataset, revealing the superiority of the MLARMC-HDMS model over other methods, with a maximum accuracy of 99.75%. Therefore, this research contributes to the advancement of feature selection techniques in medical data classification and highlights the potential for improving healthcare outcomes through accurate and efficient data analysis. The presented MLARMC-HDMS framework and COA-based feature selection approach offer valuable insights for researchers and practitioners working in the field of healthcare data mining and machine learning.
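The Apriori step mentioned in this abstract can be illustrated with a minimal sketch. It is not the MLARMC-HDMS implementation: the function names `apriori` and `derive_rules`, the toy patient records, and the thresholds are all assumptions chosen for illustration. The sketch finds frequent itemsets level by level (join, then prune by downward closure) and then keeps rules whose confidence meets a threshold.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset: support} for every frequent itemset."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)
    # Level 1 candidates: every single item seen in the data.
    current = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while current:
        # Count how many transactions contain each candidate.
        counts = {c: sum(1 for t in transactions if c <= t) for c in current}
        level = {c: cnt / n for c, cnt in counts.items() if cnt / n >= min_support}
        frequent.update(level)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        current = {
            c for c in candidates
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


def derive_rules(frequent, min_confidence):
    """Return (antecedent, consequent, support, confidence) tuples."""
    rules = []
    for itemset, support in frequent.items():
        for r in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), r):
                a = frozenset(antecedent)
                # Subsets of a frequent itemset are frequent, so the lookup is safe.
                confidence = support / frequent[a]
                if confidence >= min_confidence:
                    rules.append((a, itemset - a, support, confidence))
    return rules


# Hypothetical toy records (illustrative only, not from the paper).
records = [
    {"fever", "cough", "flu"},
    {"fever", "cough", "flu"},
    {"fever", "flu"},
    {"cough"},
    {"fever", "cough"},
]
frequent_itemsets = apriori(records, min_support=0.5)
for antecedent, consequent, support, confidence in derive_rules(frequent_itemsets, 0.9):
    print(sorted(antecedent), "->", sorted(consequent), support, confidence)
```

On this toy data the pair {cough, flu} falls below the support threshold, so the three-item candidate is pruned without being counted, which is the property that keeps Apriori tractable on larger attribute sets.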
Funding: The National Natural Science Foundation of China (No. 70471022) and the NSFC / Hong Kong Research Grant Council (No. 70418013)
Abstract: Customer requirements analysis is the key step in product variety design for mass customization (MC). Quality function deployment (QFD) is a widely used management technique for understanding the voice of the customer (VOC). However, QFD depends heavily on subjective human judgment when extracting customer requirements and determining their importance weights, and the QFD process and its related problems are complicated enough that it is not easy to use. In this paper, based on a general data structure of the product family, the generic bill of material (GBOM), association rules analysis is introduced to construct the classification mechanism between customer requirements and product architecture. The new method maps customer requirements to the corresponding items of the product family architecture, accomplishes the mapping from the customer domain to the physical domain directly, reduces the interaction needed between customer and designer, improves product design quality, and thus satisfies customer needs as fully as possible. Finally, an example of customer requirements mapping for an elevator cabin is used to illustrate the proposed method.
Abstract: This paper is motivated by the interest in finding significant movements in financial stock prices. However, when profitable opportunities are scarce, predicting these cases is difficult. In a previous work, we introduced evolving decision rules (EDR) to detect financial opportunities. The objective of EDR is to classify the minority class (positive cases) in imbalanced environments. EDR provides a range of classifications to find the best balance between not making mistakes and not missing opportunities. The goals of this paper are: 1) to show that EDR produces a range of solutions to suit the investor's preferences and 2) to analyze the factors that benefit the performance of EDR. A series of experiments was performed. EDR was tested using a data set from the London Financial Market. To analyze the behaviour of EDR, another experiment was carried out using three artificial data sets whose solutions have different levels of complexity. Finally, an illustrative example was provided to show how a bigger collection of rules is able to classify more positive cases in imbalanced data sets. Experimental results show that: 1) EDR offers a range of solutions to fit the risk guidelines of different types of investors, and 2) a bigger collection of rules is able to classify more positive cases in imbalanced environments.
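The trade-off described in this abstract, between not making mistakes (precision) and not missing opportunities (recall), can be sketched with a toy rule set. This is not the EDR algorithm: the `evaluate` helper, the two hand-written rules, the feature names `momentum` and `volume`, and the data are all hypothetical. The sketch only shows how adding a rule to the collection can catch more positive cases at the cost of some false alarms.

```python
def evaluate(rules, samples, labels):
    """Predict positive when any rule fires; return (precision, recall)."""
    tp = fp = fn = 0
    for sample, is_positive in zip(samples, labels):
        predicted = any(rule(sample) for rule in rules)
        if predicted and is_positive:
            tp += 1
        elif predicted:
            fp += 1
        elif is_positive:
            fn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical imbalanced data: 3 positive cases among 8 samples.
samples = [
    {"momentum": 0.9, "volume": 0.2},
    {"momentum": 0.8, "volume": 0.9},
    {"momentum": 0.3, "volume": 0.9},
    {"momentum": 0.2, "volume": 0.8},
    {"momentum": 0.1, "volume": 0.1},
    {"momentum": 0.4, "volume": 0.3},
    {"momentum": 0.5, "volume": 0.5},
    {"momentum": 0.6, "volume": 0.1},
]
labels = [True, True, True, False, False, False, False, False]

# Two illustrative decision rules (assumed, not evolved).
def high_momentum(s):
    return s["momentum"] > 0.7

def high_volume(s):
    return s["volume"] > 0.7

small_collection = [high_momentum]
large_collection = [high_momentum, high_volume]

print("small:", evaluate(small_collection, samples, labels))
print("large:", evaluate(large_collection, samples, labels))
```

With the single rule, no negative case is flagged but one positive case is missed; with both rules, every positive case is found and one negative case is flagged as well. A cautious investor might prefer the first setting and an opportunity-seeking one the second.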
Funding: The APC was funded by the Deanship of Scientific Research, Saudi Electronic University.
Abstract: Association rule learning is a machine learning method used to find underlying associations in large datasets. Whether intentionally or unintentionally present, noise in training instances causes overfitting while building the classifier and negatively impacts classification accuracy. This paper applies instance reduction techniques to the datasets before mining the association rules and building the classifier. Instance reduction techniques were originally developed to reduce memory requirements in instance-based learning. This paper utilizes them to remove noise from the dataset before training the association rules classifier. Extensive experiments were conducted to assess the accuracy of association rules with different instance reduction techniques, namely Decremental Reduction Optimization Procedure (DROP) 3, DROP5, ALL K-Nearest Neighbors (ALLKNN), Edited Nearest Neighbor (ENN), and Repeated Edited Nearest Neighbor (RENN), at different noise ratios. Experiments show that instance reduction techniques substantially improved the average classification accuracy at three different noise levels: 0%, 5%, and 10%. The RENN algorithm achieved the highest levels of accuracy, with a significant improvement on seven out of the eight datasets used from the University of California Irvine (UCI) machine learning repository. The improvements were more apparent in the 5% and 10% noise cases. When RENN was applied, the average classification accuracy for the eight datasets in the zero-noise test rose from 70.47% to 76.65% compared to the original test. The average accuracy improved from 66.08% to 77.47% in the 5%-noise case and from 59.89% to 77.59% in the 10%-noise case. Higher confidence was also reported in building the association rules when RENN was used. These results indicate that RENN is a good solution for removing noise and avoiding overfitting during the construction of the association rules classifier, especially in noisy domains.
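The noise-removal idea behind ENN and RENN can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' experimental code: the helper names `enn_pass` and `renn`, the squared Euclidean distance, the value k = 3, and the toy points are all chosen for the example. One ENN pass drops every instance whose label disagrees with the majority label of its k nearest neighbours; RENN repeats the pass until nothing more is removed.

```python
from collections import Counter


def _sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def enn_pass(X, y, k=3):
    """One ENN pass: return indices of instances that agree with their neighbours."""
    keep = []
    for i in range(len(X)):
        others = [j for j in range(len(X)) if j != i]
        neighbours = sorted(others, key=lambda j: _sq_dist(X[i], X[j]))[:k]
        if not neighbours:
            keep.append(i)  # a lone instance has no neighbours to contradict it
            continue
        # Ties in the vote are resolved by Counter's insertion order.
        majority = Counter(y[j] for j in neighbours).most_common(1)[0][0]
        if majority == y[i]:
            keep.append(i)
    return keep


def renn(X, y, k=3):
    """Repeat ENN passes until a pass removes no instance."""
    X, y = list(X), list(y)
    while True:
        keep = enn_pass(X, y, k)
        if len(keep) == len(X):
            return X, y
        X = [X[i] for i in keep]
        y = [y[i] for i in keep]


# Hypothetical toy data: two clusters, with one mislabelled point in the first.
points = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (0.05, 0.05),
    (1.0, 1.0), (1.1, 1.0), (1.0, 1.1), (1.1, 1.1),
]
classes = ["a", "a", "a", "a", "b", "b", "b", "b", "b"]

clean_points, clean_classes = renn(points, classes, k=3)
print(len(points), "->", len(clean_points), "instances after RENN")
```

The cleaned instances would then be passed to the association rule miner in place of the original training set, which is the preprocessing order the abstract describes.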