To enable the reuse of process design knowledge and improve the efficiency and quality of process design, a method for extracting rules that capture the thinking process behind process design is proposed. An instance representation model of process planning, reflecting the thinking process of technicians, is established to represent process documents effectively. The related process attributes are extracted from the model to form related events. A manifold learning algorithm and clustering analysis are used to preprocess the process instance data. A rule extraction mechanism for process design is then introduced: it operates on the related events after dimension reduction and clustering, and uses an association rule mining algorithm to extract similar process information within the same cluster. The final process design rules are formed through a vectorized description of the related events. Finally, an example is given to evaluate the proposed rule extraction method.
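As a rough illustration of the association-rule step applied within one cluster, the sketch below computes support and confidence for pairwise rules. The attribute names, the thresholds and the function `mine_rules` are hypothetical and are not taken from the paper, which does not state which mining algorithm or parameters it uses.

```python
from itertools import combinations


def mine_rules(transactions, min_support=0.5, min_confidence=0.8):
    """Return (antecedent, consequent, support, confidence) for item pairs."""
    n = len(transactions)
    items = sorted({item for t in transactions for item in t})

    def support(itemset):
        # Fraction of transactions that contain every item in itemset.
        return sum(1 for t in transactions if itemset <= t) / n

    rules = []
    for a, b in combinations(items, 2):
        pair_support = support({a, b})
        if pair_support < min_support:
            continue
        # Try the rule in both directions: a -> b and b -> a.
        for lhs, rhs in ((a, b), (b, a)):
            confidence = pair_support / support({lhs})
            if confidence >= min_confidence:
                rules.append((lhs, rhs, pair_support, confidence))
    return rules


# Hypothetical process instances that ended up in the same cluster.
cluster = [
    {"material=steel", "op=rough_turning", "tool=carbide"},
    {"material=steel", "op=rough_turning", "tool=carbide"},
    {"material=steel", "op=finish_turning", "tool=carbide"},
    {"material=aluminium", "op=rough_turning", "tool=hss"},
]

for lhs, rhs, s, c in mine_rules(cluster):
    print(f"IF {lhs} THEN {rhs} (support={s:.2f}, confidence={c:.2f})")
```

Only pairwise rules are considered here to keep the sketch short; a full Apriori-style miner would also grow larger itemsets.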
In the metal cutting industry, it is common practice to search for the optimal combination of cutting parameters in order to maximize tool life for a fixed minimum material removal rate (MRR). Since the advent of high-speed milling (HSM), much experimental and theoretical research has been done for this purpose, mainly emphasizing the optimization of cutting parameters. It is highly beneficial to convert raw data into a comprehensive knowledge-based expert system that uses fuzzy logic as the reasoning mechanism. This paper presents an attempt to extract rules from a fuzzy neural network (FNN) so as to obtain the most effective knowledge base for a given set of data. Experiments were conducted to determine the cutting speeds that maximize tool life for different combinations of input parameters. A fuzzy neural network was constructed based on the fuzzification of the input parameters and the cutting speed. After training, raw rule sets were extracted, and a rule pruning approach was proposed to obtain concise linguistic rules. Estimation with fuzzy inference showed that the optimized combination of fuzzy rules gave an estimation error of only 6.34 m/min, compared with 314 m/min for a randomized combination of rules.
For various reasons, many of the security programming rules applicable to specific software have not been recorded in official documents, and hence can hardly be used by static analysis tools for detection. In this paper, we propose a new approach, named SVR-Miner (Security Validation Rules Miner), which uses a frequent sequence mining technique [1-4] to automatically infer implicit security validation rules from large software code bases written in the C programming language. Unlike past work in this area, SVR-Miner introduces three techniques to improve the accuracy of the rules: sensitive thread, program slicing [5-7], and equivalent statements computing. Experiments with the Linux Kernel demonstrate the effectiveness of our approach. With the ten given sensitive threads, SVR-Miner automatically generated 17 security validation rules and detected 8 violations, 5 of which were published by the Linux Kernel Organization before we detected them. We have recently reported the other three to the Linux Kernel Organization.
In this paper, a novel data mining method is introduced to solve multi-objective optimization problems in the process industry. A hyperrectangle association rule mining (HARM) algorithm based on support vector machines (SVMs) is proposed. Hyperrectangle rules are constructed on the basis of prototypes and support vectors (SVs) under some heuristic limitations. The proposed algorithm is applied to a simulated moving bed (SMB) paraxylene (PX) adsorption process. The relationships between the key process variables and objective variables such as the purity and recovery rate of PX are obtained. Most of the obtained association rules can be explained using existing domain knowledge about the PX adsorption process.
Tourism demand forecasting has attracted substantial interest because of the significant economic contributions of the fast-growing tourism industry. Although various quantitative forecasting techniques have been widely studied, highly accurate and understandable forecasting models have not been developed. The present paper proposes a novel tourism demand forecasting method that extracts fuzzy Takagi-Sugeno (T-S) rules from trained SVMs. Unlike previous approaches, this study uses fuzzy T-S models extracted from the outputs of SVMs trained on tourism data. Owing to the symbolic fuzzy rules and the generalization ability of SVMs, the extracted fuzzy T-S rules exhibit high forecasting accuracy and include pre-condition parts that practitioners can understand. Using tourism demand forecasting in Hong Kong SAR, China as a case study, empirical findings on tourist arrivals from nine overseas origins reveal that the proposed approach performs comparably with SVMs and achieves better prediction accuracy than other forecasting techniques for most origins. The findings show that decision makers can easily interpret fuzzy T-S rules extracted from SVMs, so the approach is beneficial to tourism market management and has both scientific and practical value in tourism demand forecasting.
In order to make full use of the driver’s long-term driving experience in the perception of road traffic information, interaction and vehicle control, a driving behavior rule extraction algorithm based on the artificial neural network interface (ANNI) and its integration is proposed. First, based on cognitive learning theory, a cognitive driving behavior model is established, and the cognitive driving behavior is described and analyzed. Next, based on ANNI, the model and the rule extraction algorithm (ANNI-REA) are designed to explain not only the driving behavior but also the non-sequence. The rules have high fidelity and safety during driving without discretizing continuous input variables. Experimental results on the UCI standard data set and on a self-built driving behavior data set show that the method is about 0.4% more accurate and about 10% less complex than the common C4.5-REA, Neuro-Rule and REFNE. Further, simulation experiments verify the correctness of the extracted driving rules and the effectiveness of extraction based on cognitive driving behavior rules. In general, the extracted driving rules fully reflect the execution mechanism of the sequential activity of comprehensive driving cognition, which is of great significance for mixed traffic flow in vehicle networks and for future research on unmanned driving.
The paper introduces segmentation into the preprocessing of web pages. A page segmentation technique is used to locate the region to be extracted, the region is then processed according to ontology-based extraction rules, and the required information is finally obtained. Experiments on two real datasets and a comparison with related work show that this method achieves good extraction results.
Disease diagnosis is a challenging task because of the large number of associated factors. Uncertainty in the diagnosis process arises from inaccuracy in patient attributes, missing data, and limits on the medical expert’s ability to define cause-and-effect relationships when there are multiple interrelated variables. This paper aims to present an integrated view of deploying smart disease diagnosis using the Internet of Things (IoT) empowered by a fuzzy inference system (FIS) to diagnose various diseases. A fuzzy system is well suited to diagnosing medical conditions because every disease diagnosis involves many uncertainties, which fuzzy logic is designed to handle. The proposed system differentiates new cases given the symptoms of the disease. Discriminating between symptomatic diseases is generally a time-sensitive task. The proposed system can track symptoms reliably and diagnose diseases efficiently through IoT and FIS. Different coefficients are employed for each sign of disease to predict and compute the severity of the identified disease. This study aims to differentiate and diagnose COVID-19, typhoid, malaria, and pneumonia. The FIS method is used to identify the disease from the given data correlated with the input symptoms, and MATLAB is used to implement the FIS. Applying the fuzzy procedure to these data shows that the affecting disease can be derived from the symptoms. The results suggest that FIS could also be used for the diagnosis of other diseases. This study may assist doctors, patients, medical practitioners, and other healthcare professionals in diagnosing diseases early and treating them better.
The concepts of the Rough Decision Support System (RDSS) and the equivalence matrix are introduced in this paper. Based on a rough attribute vector tree (RAVT) method, two matrix computation algorithms, Recursive Matrix Computation (RMC) and Parallel Matrix Computation (PMC), are proposed so that rule extraction, attribute reduction and data cleaning are completed synchronously. The algorithms emphasize the practicability and efficiency of rule generation. A case study of PMC is analyzed, and a comparison experiment on the RMC algorithm shows that it is feasible and efficient for data mining and knowledge discovery in RDSS.
A machine-learning approach was developed for the automated building of knowledge bases for soil resources mapping, using a classification tree to generate knowledge from training data. With this method, building a knowledge base for automated soil mapping was easier than with the conventional knowledge acquisition approach. The knowledge base built by the classification tree was used by the knowledge classifier to perform soil type classification for Longyou County, Zhejiang Province, China, using Landsat TM bi-temporal images and GIS data. To evaluate the performance of the resultant knowledge bases, the classification results were compared with an existing soil map based on a field survey. The accuracy assessment and analysis of the resultant soil maps suggested that the knowledge bases built by the machine-learning method were of good quality for mapping the distribution of soil classes over the study area.
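To make the idea of generating a knowledge base from a classification tree concrete, the following sketch flattens a small hand-written tree into IF-THEN rules, one rule per leaf. The tree, its features, thresholds and soil class labels are invented for illustration and do not come from the paper.

```python
def tree_to_rules(node, conditions=()):
    """Flatten a binary decision tree into (conditions, label) rules."""
    if "label" in node:  # leaf node: emit one rule for the path so far
        return [(list(conditions), node["label"])]
    feature, threshold = node["feature"], node["threshold"]
    left = tree_to_rules(
        node["left"], conditions + (f"{feature} <= {threshold}",))
    right = tree_to_rules(
        node["right"], conditions + (f"{feature} > {threshold}",))
    return left + right


# Hypothetical tree; every name and number below is an assumption.
tree = {
    "feature": "elevation_m", "threshold": 200,
    "left": {
        "feature": "ndvi", "threshold": 0.3,
        "left": {"label": "soil class A"},
        "right": {"label": "soil class B"},
    },
    "right": {"label": "soil class C"},
}

rules = tree_to_rules(tree)
for conditions, label in rules:
    print("IF " + " AND ".join(conditions) + f" THEN {label}")
```

Each root-to-leaf path becomes one rule, which is why a tree learned from training data can serve directly as a rule-based knowledge base.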
This article presents two approaches for the automated building of knowledge bases for soil resources mapping. The methods use decision tree and Bayesian predictive modeling, respectively, to generate knowledge from training data. With these methods, building a knowledge base for automated soil mapping is easier than with the conventional knowledge acquisition approach. The knowledge bases built by the two methods were used by the knowledge classifier for soil type classification of the Longyou area, Zhejiang Province, China, using TM bi-temporal imagery and GIS data. To evaluate the performance of the resultant knowledge bases, the classification results were compared with an existing soil map based on a field survey. The accuracy assessment and analysis of the resultant soil maps suggested that the knowledge bases built by these two methods were of good quality for mapping the distribution of soil classes over the study area.
This paper combines three computational intelligence tools (neural networks, fuzzy logic, and genetic algorithms) to develop a data mining architecture (NFGDM) that discovers patterns and represents them in understandable forms. In the NFGDM, input data are preprocessed by fuzzification, and the preprocessed input variables are then used to train a radial basis probabilistic neural network to classify the dataset according to the classes considered. A rule extraction technique is then applied to extract explicit knowledge from the trained neural networks and represent it in the form of fuzzy if-then rules. In the final stage, a genetic algorithm is used as a rule-pruning module to eliminate weak rules that remain in the rule bases. Compared with some known neural network classifiers, the architecture learns quickly, and it is characterized by the incorporation of possibility information into the consequents of classification rules in human-understandable forms. The experiments show that the NFGDM is more efficient and more robust than the traditional decision tree method.
Data analysis and automatic processing are often interpreted as knowledge acquisition. In many cases it is necessary to classify data or find regularities in them. In intelligent data analysis applications, the regularities found are mostly represented as IF-THEN rules, which are used to solve tasks such as prediction, classification and pattern recognition. Using different approaches (clustering algorithms, neural network methods, fuzzy rule processing methods), we can extract rules that characterize the data in an understandable language. This allows the data to be interpreted, relationships in the data to be found, and new rules that characterize them to be extracted. Knowledge acquisition in this paper is defined as the process of extracting knowledge from numerical data in the form of rules. Rule extraction in this context is based on the K-means and fuzzy C-means clustering methods. With the K-means clustering algorithm, rules are derived from trained neural networks. Fuzzy C-means is used in a fuzzy rule base design method. The rule extraction methodology is demonstrated on samples from Fisher's Iris flower data set, and the effectiveness of the extracted rules is evaluated. The clustering and rule extraction methodology can be widely used in evaluating and analyzing various economic and financial processes.
This paper proposes a method for detecting the importance of input variables in multivariate analysis problems. When predictor variables are correlated, the importance of each input variable as variables are added to the model can be detected from the knowledge stored in an artificial neural network (NN), and it must be taken into account. Neural network models have been used together with sensitivity analysis; these models predict the relationship between variables more accurately and provide a way to find a set of forecasting variables to include in the new prediction model. The results have been applied in a system that forecasts the volume of wood of a tree and detects relationships between input and output variables.
Heavy renewable penetration and high-voltage cross-regional transmission systems reduce the inertia of power systems and make frequency stability after disturbances critical. Power system operators should therefore ensure that the frequency nadirs after possible disturbances stay within a set restriction, e.g., 0.20 Hz. Traditional methods use linearized and simplified control models to quantify the frequency nadirs and achieve frequency-constrained unit commitments (FCUCs). However, the simplified models can hardly depict the frequency responses of practical units after disturbances, and they usually neglect regulation from battery storage. This paper achieves FCUCs with linear rules extracted from massive simulation results. We simulate the frequency responses of typical thermal-hydro-storage systems under diverse unit online conditions. We then extract rules for the frequency nadirs after disturbances using only a linear support vector machine to evaluate the frequency stability of power systems. The algorithm holds a high accuracy over a wide range of frequency restrictions. Finally, we apply the rules to three typical cases to show the influence of frequency constraints on unit commitments.
In research on rule extraction from neural networks, fidelity describes how well the rules mimic the behavior of a neural network, while accuracy describes how well the rules generalize. This paper identifies the fidelity-accuracy dilemma. It argues for distinguishing rule extraction using neural networks from rule extraction for neural networks according to their different goals, where fidelity and accuracy, respectively, should be excluded from the rule quality evaluation framework.
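The two measures can be illustrated with a small sketch. The label vectors below are made-up toy data, and `agreement` is a hypothetical helper; the point is only that fidelity compares the rules with the network's outputs, whereas accuracy compares the rules with the true labels.

```python
def agreement(predicted, reference):
    """Fraction of positions where two label sequences agree."""
    return sum(p == r for p, r in zip(predicted, reference)) / len(reference)


# Toy data (assumed for illustration only).
true_labels = [1, 0, 1, 1, 0, 0, 1, 0]
network_out = [1, 0, 1, 0, 0, 1, 1, 0]  # predictions of the trained network
rule_out = [1, 0, 1, 1, 0, 1, 1, 0]     # predictions of the extracted rules

fidelity = agreement(rule_out, network_out)       # rules vs. network
accuracy = agreement(rule_out, true_labels)       # rules vs. ground truth
network_accuracy = agreement(network_out, true_labels)

print(f"fidelity={fidelity}, accuracy={accuracy}, "
      f"network accuracy={network_accuracy}")
```

In this toy case the rules are more accurate than the network precisely because they disagree with it on one sample, so raising fidelity would lower accuracy; this is the kind of tension the dilemma refers to.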
This paper discusses how to use genetic algorithms (GA) to extract symbolic rules from a trained artificial neural network (ANN) in classification domains. Previous methods based on an exhaustive analysis of network connections and output values have been shown to be intractable, in that the scale-up factor increases with the number of nodes and connections in the network. Some experiments demonstrating the effectiveness of the presented method are also given.
Phishing is a social engineering attack technique that is widely used to obtain sensitive user information, such as login credentials and credit and debit card information. It is carried out by a person masquerading as an authentic individual. Various anti-phishing techniques have been developed to protect web users from these attacks, but they fail to protect the user in various ways. In this paper, we propose a novel technique to identify phishing websites effortlessly on the client side by means of a novel browser architecture. In this system, we use a rule extraction framework to extract the properties or features of a website using the URL only. The resulting list consists of 30 different properties of a URL, which are later used by a Random Forest classification model to detect the authenticity of the website. A dataset consisting of 11,055 tuples is used to train the model. These processes are carried out on the client side with the help of a redesigned browser architecture. Researchers have come up with machine learning frameworks to detect phishing sites, but these are not in a state to be used by individuals with no technical knowledge. To make these tools accessible to every individual, we have introduced detection methods into a browser architecture named the ‘Embedded Phishing Detection Browser’ (EPDB), a novel method that preserves the existing user experience while improving security. The newly designed browser architecture introduces a special segment that performs phishing detection operations in real time. We have prototyped this technique, which achieves an accuracy of 99.36% in identifying phishing websites in real time.
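A minimal sketch of URL-only feature extraction is shown below. The six features and the 75-character length cutoff are illustrative assumptions; they are not the 30 properties used in the paper, which the abstract does not enumerate, and `url_features` is a hypothetical helper name.

```python
import re
from urllib.parse import urlparse


def url_features(url):
    """Derive a few simple, URL-only features (illustrative choices)."""
    parsed = urlparse(url)
    host = parsed.hostname or ""
    return {
        "uses_https": parsed.scheme == "https",
        # Host written as a dotted IPv4 address instead of a domain name.
        "host_is_ip": re.fullmatch(r"\d{1,3}(\.\d{1,3}){3}", host) is not None,
        "has_at_symbol": "@" in url,
        "long_url": len(url) > 75,  # assumed cutoff
        "subdomain_count": max(host.count(".") - 1, 0),
        "hyphen_in_host": "-" in host,
    }


print(url_features("https://www.example.com/"))
print(url_features("http://192.168.0.1/login"))
```

A feature dictionary like this could be converted to a numeric vector and passed to a classifier such as a random forest; that step is omitted here.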
In the quest for interpretable models, two versions of a neural network rule extraction algorithm were proposed and compared: the Piece-Wise Linear Artificial Neural Network (PWL-ANN) algorithm and the enhanced Piece-Wise Linear Artificial Neural Network (enhanced PWL-ANN) algorithm. The PWL-ANN algorithm is a decompositional artificial neural network (ANN) rule extraction algorithm. The enhanced PWL-ANN algorithm improves upon it and extracts multiple linear regression equations from a trained ANN model by approximating the hidden sigmoid activation functions with N-piece linear equations. In doing so, the algorithm provides interpretable models from the originally trained opaque ANN models. A detailed application case study illustrates how the generated enhanced PWL-ANN models can provide understandable IF-THEN rules about a problem domain. A comparison of the results generated by the two versions showed that the enhanced PWL-ANN models have higher fidelity to the originally trained ANN models than the PWL-ANN models. The results also showed that more concise rule sets could be generated with the enhanced PWL-ANN algorithm. If a simpler set of rules is desired, the enhanced PWL-ANN algorithm can be combined with the decision tree approach. Applying the algorithms to domains related to petroleum engineering can help enhance understanding of the problems.
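The core idea of approximating a sigmoid activation with N linear pieces can be sketched as follows. The equally spaced breakpoints on [-6, 6] and the use of chords between them are assumptions made for this illustration; the abstract does not say how the enhanced PWL-ANN algorithm chooses its pieces.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def make_pwl_sigmoid(n_pieces, lo=-6.0, hi=6.0):
    """Approximate sigmoid on [lo, hi] by n_pieces chords (assumed scheme)."""
    xs = [lo + (hi - lo) * i / n_pieces for i in range(n_pieces + 1)]
    ys = [sigmoid(x) for x in xs]

    def approx(x):
        # Outside the interval, hold the value at the nearest endpoint.
        if x <= lo:
            return ys[0]
        if x >= hi:
            return ys[-1]
        # Index of the linear piece that contains x.
        i = min(int((x - lo) / (hi - lo) * n_pieces), n_pieces - 1)
        slope = (ys[i + 1] - ys[i]) / (xs[i + 1] - xs[i])
        return ys[i] + slope * (x - xs[i])

    return approx


def max_error(n_pieces, samples=1201):
    """Largest absolute gap between sigmoid and its approximation on a grid."""
    approx = make_pwl_sigmoid(n_pieces)
    grid = [-6.0 + 12.0 * k / (samples - 1) for k in range(samples)]
    return max(abs(approx(x) - sigmoid(x)) for x in grid)


for n in (3, 6, 12):
    print(n, "pieces -> max abs error", round(max_error(n), 4))
```

Within each piece the hidden unit's output is a linear function of its input, which is what allows the network's output to be rewritten as a linear regression equation valid in that region.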
Funding: Supported by the International Science and Technology Cooperation project (Grant No. 2008DFA71750).
Funding: National Natural Science Foundation of China under Grant Nos. 60873213, 91018008 and 61070192; Beijing Science Foundation under Grant No. 4082018; Shanghai Key Laboratory of Intelligent Information Processing of China under Grant No. IIPL-09-006.
Funding: Supported by the National Natural Science Foundation of China (No. 60421002), the National Outstanding Youth Science Foundation of China (No. 60025308), and the New Century 151 Talent Project of Zhejiang Province.
Funding: Project (2017YFB0102503) supported by the National Key Research and Development Program of China; Projects (U1664258, 51875255, 61601203) supported by the National Natural Science Foundation of China; Projects (DZXX-048, 2018-TD-GDZB-022) supported by the Jiangsu Province’s Six Talent Peak, China; Project (18KJA580002) supported by Major Natural Science Research Project of Higher Learning in Jiangsu Province, China
Abstract: In order to make full use of the driver’s long-term driving experience in the perception of, interaction with, and vehicle control based on road traffic information, a driving behavior rule extraction algorithm based on an artificial neural network interface (ANNI) and its integration is proposed. Firstly, based on cognitive learning theory, a cognitive driving behavior model is established, and the cognitive driving behavior is described and analyzed. Next, based on ANNI, the model and the rule extraction algorithm (ANNI-REA) are designed to explain not only the driving behavior but also the non-sequence. The extracted rules have high fidelity and safety during driving, without discretizing continuous input variables. The experimental results on the UCI standard data set and on the self-built driving behavior data set show that the method is about 0.4% more accurate and about 10% less complex than the common C4.5-REA, Neuro-Rule and REFNE. Further, simulation experiments verify the correctness of the extracted driving rules and the effectiveness of the extraction based on cognitive driving behavior rules. In general, the extracted driving rules fully reflect the execution mechanism of the sequential activity of comprehensive driving cognition, which is of great significance for mixed traffic flow under the network of vehicles and for future research on unmanned driving.
Abstract: The paper introduces segmentation ideas into the preprocessing of web pages. A page segmentation technique is used to locate the extraction region accurately; the region is then processed according to ontology-based extraction rules to obtain the required information. Experiments on two real datasets and a comparison with related work show that this method achieves good extraction results.
Abstract: Disease diagnosis is a challenging task due to a large number of associated factors. Uncertainty in the diagnosis process arises from inaccuracy in patient attributes, missing data, and limitations in the medical expert’s ability to define cause and effect relationships when there are multiple interrelated variables. This paper aims to demonstrate an integrated view of deploying smart disease diagnosis using the Internet of Things (IoT) empowered by a fuzzy inference system (FIS) to diagnose various diseases. The fuzzy system is one of the best systems for diagnosing medical conditions because every disease diagnosis involves many uncertainties, and fuzzy logic is well suited to handling uncertainties. Our proposed system differentiates new cases given the symptoms of the disease. Generally, discriminating between symptomatic diseases is a time-sensitive task. The proposed system can track symptoms reliably to diagnose diseases through IoT and FIS smartly and efficiently. Different coefficients have been employed to predict and compute the severity of the identified disease for each sign of disease. This study aims to differentiate and diagnose COVID-19, Typhoid, Malaria, and Pneumonia. The FIS method is used to identify the disease from the given data by correlating it with the input symptoms. The MATLAB tool is utilised for the implementation of the FIS. Applying the fuzzy procedure to these data shows that the affecting disease can be derived from the symptoms. The results of the proposed method indicate that FIS could be utilised for the diagnosis of other diseases. This study may assist doctors, patients, medical practitioners, and other healthcare professionals in early diagnosis and better treatment of diseases.
Abstract: The concepts of Rough Decision Support System (RDSS) and equivalence matrix are introduced in this paper. Based on a rough attribute vector tree (RAVT) method, two kinds of matrix computation algorithms, Recursive Matrix Computation (RMC) and Parallel Matrix Computation (PMC), are proposed so that rule extraction, attribute reduction and data cleaning are completed synchronously. The algorithms emphasize the practicability and efficiency of rule generation. A case study of PMC is analyzed, and a comparison experiment with the RMC algorithm shows that it is feasible and efficient for data mining and knowledge discovery in RDSS.
Funding: Project supported by the National Natural Science Foundation of China (Nos. 40101014 and 40001008).
Abstract: A machine-learning approach was developed for automated building of knowledge bases for soil resources mapping by using a classification tree to generate knowledge from training data. With this method, building a knowledge base for automated soil mapping was easier than using the conventional knowledge acquisition approach. The knowledge base built by the classification tree was used by the knowledge classifier to perform the soil type classification of Longyou County, Zhejiang Province, China using Landsat TM bi-temporal images and GIS data. To evaluate the performance of the resultant knowledge bases, the classification results were compared to an existing soil map based on a field survey. The accuracy assessment and analysis of the resultant soil maps suggested that the knowledge bases built by the machine-learning method were of good quality for mapping the distribution model of soil classes over the study area.
Funding: Project supported by the National Natural Science Foundation of China (No. 40101014) and by the Science and Technology Committee of Zhejiang Province (No. 001110445), China
Abstract: This article presents two approaches for automated building of knowledge bases for soil resources mapping. These methods used decision tree and Bayesian predictive modeling, respectively, to generate knowledge from training data. With these methods, building a knowledge base for automated soil mapping is easier than using the conventional knowledge acquisition approach. The knowledge bases built by these two methods were used by the knowledge classifier for soil type classification of the Longyou area, Zhejiang Province, China using TM bi-temporal imagery and GIS data. To evaluate the performance of the resultant knowledge bases, the classification results were compared to an existing soil map based on a field survey. The accuracy assessment and analysis of the resultant soil maps suggested that the knowledge bases built by these two methods were of good quality for mapping the distribution model of soil classes over the study area.
Funding: Supported by the National Research Foundation for the Doctoral Program of Higher Education of China (20030487032)
Abstract: This paper combines computational intelligence tools (neural network, fuzzy logic, and genetic algorithm) to develop a data mining architecture (NFGDM), which discovers patterns and represents them in understandable forms. In the NFGDM, input data are preprocessed by fuzzification; the preprocessed data of the input variables are then used to train a radial basis probabilistic neural network to classify the dataset according to the classes considered. A rule extraction technique is then applied in order to extract explicit knowledge from the trained neural networks and represent it in the form of fuzzy if-then rules. In the final stage, a genetic algorithm is used as a rule-pruning module to eliminate the weak rules that are still in the rule bases. Compared with some known neural network classifiers, the architecture has a fast learning speed, and it is characterized by the incorporation of possibility information into the consequents of classification rules in human-understandable forms. The experiments show that the NFGDM is more efficient and more robust than the traditional decision tree method.
Abstract: Data analysis and automatic processing is often interpreted as knowledge acquisition. In many cases it is necessary to classify data or find regularities in them. Results obtained in the search for regularities in intelligent data analysis applications are mostly represented with the help of IF-THEN rules. With the help of these rules the following tasks are solved: prediction, classification, pattern recognition and others. Using different approaches (clustering algorithms, neural network methods, fuzzy rule processing methods) we can extract rules that characterize the data in an understandable language. This allows interpreting the data, finding relationships in the data and extracting new rules that characterize them. Knowledge acquisition in this paper is defined as the process of extracting knowledge from numerical data in the form of rules. Extraction of rules in this context is based on the clustering methods K-means and fuzzy C-means. With the assistance of the K-means clustering algorithm, rules are derived from trained neural networks. Fuzzy C-means is used in the fuzzy rule-based design method. The rule extraction methodology is demonstrated on samples from Fisher's Iris flower data set. The effectiveness of the extracted rules is evaluated. The clustering and rule extraction methodology can be widely used in evaluating and analyzing various economic and financial processes.
Abstract: This paper proposes a method to detect the importance of the input variables in multivariate analysis problems. When there is correlation among predictor variables, the importance of each input variable, when adding variables to the model, can be detected from the knowledge stored in an artificial neural network (NN), and it must be taken into account. Neural network models have been used together with sensitivity analysis; these models predict the relationship between variables more accurately, and they provide a way to find a set of forecasting variables to be included in the new prediction model. The obtained results have been applied in a system to forecast the volume of wood for a tree and to detect relationships between input and output variables.
Funding: Supported by the research project from China Three Gorges Corporation (No. 202103386).
Abstract: Heavy renewable penetrations and high-voltage cross-regional transmission systems reduce the inertia and critical frequency stability of power systems after disturbances. Therefore, power system operators should ensure that the frequency nadirs after possible disturbances are within the set restriction, e.g., 0.20 Hz. Traditional methods utilize linearized and simplified control models to quantify the frequency nadirs and achieve frequency-constrained unit commitments (FCUCs). However, the simplified models can hardly depict the frequency responses of practical units after disturbances. Also, they usually neglect the regulation provided by battery storage. This paper achieves FCUCs with linear rules extracted from massive simulation results. We simulate the frequency responses of typical thermal-hydro-storage systems under diverse unit online conditions. Then, we extract the rules of frequency nadirs after disturbances with only a linear support vector machine to evaluate the frequency stability of power systems. The algorithm achieves high accuracy over a wide range of frequency restrictions. Finally, we apply the rules to three typical cases to show the influences of frequency constraints on unit commitments.
Abstract: In research on rule extraction from neural networks, fidelity describes how well the rules mimic the behavior of a neural network, while accuracy describes how well the rules can be generalized. This paper identifies the fidelity-accuracy dilemma. It argues that rule extraction using neural networks and rule extraction for neural networks should be distinguished according to their different goals, with fidelity and accuracy, respectively, excluded from the rule quality evaluation framework.
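The two measures defined in this abstract can be made concrete with a short sketch. Fidelity is computed here as the fraction of samples on which the extracted rules agree with the network, and accuracy as the fraction on which the rules agree with the true labels. The label vectors are made-up toy values and the function name `agreement` is hypothetical; the abstract itself gives no formula, so this is only one common way to operationalize the definitions.

```python
# Illustrative only: toy predictions, not data from the paper.

def agreement(a, b):
    """Fraction of positions at which two label sequences agree."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)

y_true = [1, 0, 1, 1, 0, 0]      # ground-truth labels on held-out samples
net_pred = [1, 0, 0, 1, 0, 1]    # predictions of the trained network
rule_pred = [1, 0, 0, 1, 1, 1]   # predictions of the extracted rules

fidelity = agreement(rule_pred, net_pred)  # how well the rules mimic the network
accuracy = agreement(rule_pred, y_true)    # how well the rules generalize

print(f"fidelity={fidelity:.3f}, accuracy={accuracy:.3f}")
```

In this toy case the rules copy the network closely, including its mistakes, so fidelity is high while accuracy is low, which is the tension the dilemma refers to.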
Abstract: This paper discusses how to use genetic algorithms (GA) to extract symbolic rules from a trained artificial neural network (ANN) in domains involving classification. Previous methods based on an exhaustive analysis of network connections and output values have already been demonstrated to be intractable, in that the scale-up factor increases with the number of nodes and connections in the network. Some experiments demonstrating the effectiveness of the presented method are given as well.
Abstract: Phishing is a social engineering attack technique that is widely used to obtain sensitive user information, such as login credentials and credit and debit card information. It is carried out by a person masquerading as an authentic individual. To protect web users from these attacks, various anti-phishing techniques have been developed, but they fail to protect the user in various ways. In this paper, we propose a novel technique to identify phishing websites effortlessly on the client side by proposing a novel browser architecture. In this system, we use a rule extraction framework to extract the properties or features of a website using the URL only. This list consists of 30 different properties of a URL, which are later used by a Random Forest classification model to detect the authenticity of the website. A dataset consisting of 11,055 tuples is used to train the model. These processes are carried out on the client side with the help of a redesigned browser architecture. Researchers have come up with machine learning frameworks to detect phishing sites, but these are not in a state to be used by individuals with no technical knowledge. To make sure that these tools are accessible to every individual, we have improvised and introduced detection methods into a browser architecture named the ‘Embedded Phishing Detection Browser’ (EPDB), which is a novel method to preserve the existing user experience while improving security. The newly designed browser architecture introduces a special segment to perform phishing detection operations in real time. We have prototyped this technique and obtained an accuracy of 99.36% in the identification of phishing websites in real time.
Abstract: In the quest for interpretable models, two versions of a neural network rule extraction algorithm were proposed and compared. The two algorithms are called the Piece-Wise Linear Artificial Neural Network (PWL-ANN) and enhanced Piece-Wise Linear Artificial Neural Network (enhanced PWL-ANN) algorithms. The PWL-ANN algorithm is a decomposition artificial neural network (ANN) rule extraction algorithm, and the enhanced PWL-ANN algorithm improves upon it by extracting multiple linear regression equations from a trained ANN model, approximating the hidden sigmoid activation functions with N-piece linear equations. In doing so, the algorithm provides interpretable models from the originally trained opaque ANN models. A detailed application case study illustrates how the generated enhanced PWL-ANN models can provide understandable IF-THEN rules about a problem domain. Comparison of the results generated by the two versions of the algorithm showed that, compared with the PWL-ANN models, the enhanced PWL-ANN models have improved fidelity to the originally trained ANN models. The results also showed that more concise rule sets could be generated using the enhanced PWL-ANN algorithm. If a simpler set of rules is desired, the enhanced PWL-ANN algorithm can be combined with the decision tree approach. Potential application of the algorithms to domains related to petroleum engineering can help enhance understanding of the problems.
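The core idea of replacing a sigmoid activation with N linear pieces can be sketched as follows. This is not the PWL-ANN or enhanced PWL-ANN algorithm: the abstract does not state how the breakpoints are chosen, so the evenly spaced knots on an assumed range [-6, 6], the clamping outside that range, and the helper name `make_pwl_sigmoid` are all illustrative assumptions.

```python
import math

def sigmoid(x):
    """Standard logistic sigmoid."""
    return 1.0 / (1.0 + math.exp(-x))

def make_pwl_sigmoid(n_pieces, x_min=-6.0, x_max=6.0):
    """Build an n_pieces-segment linear interpolant of the sigmoid.

    Knots are evenly spaced on [x_min, x_max] (an assumption made for this
    sketch); outside that range the approximation is held constant.
    """
    step = (x_max - x_min) / n_pieces
    knots = [x_min + i * step for i in range(n_pieces + 1)]
    values = [sigmoid(k) for k in knots]

    def approx(x):
        if x <= x_min:
            return values[0]
        if x >= x_max:
            return values[-1]
        i = min(int((x - x_min) / step), n_pieces - 1)  # index of the segment
        t = (x - knots[i]) / step                       # position within it
        return values[i] + t * (values[i + 1] - values[i])

    return approx

pwl = make_pwl_sigmoid(8)
grid = [i / 100 for i in range(-800, 801)]
max_error = max(abs(pwl(x) - sigmoid(x)) for x in grid)
print(f"max |pwl - sigmoid| on [-8, 8] with 8 pieces: {max_error:.4f}")
```

Within any one segment the hidden unit's output is a linear function of its input, so the network output restricted to that region reduces to a linear equation, which is what allows an IF (input region) THEN (linear equation) reading of the model.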