Abstract: In enterprise operations, maintaining manual rules for enterprise processes can be expensive, time-consuming, and dependent on specialized domain knowledge. Recently, rule generation has been automated in enterprises, particularly through Machine Learning, to streamline routine tasks. Typically, these Machine Learning models are black boxes in which the reasons for decisions are not always transparent, and end users need to verify the model proposals as part of user acceptance testing before they can trust them. In such scenarios, rules excel over Machine Learning models because end users can verify the rules and therefore trust them more. In many scenarios the truth label changes frequently, so it is difficult for a Machine Learning model to learn until a considerable amount of data has accumulated, whereas rules can be adapted to the new truth. This paper presents a novel framework for generating human-understandable rules using the Classification and Regression Tree (CART) decision tree method, which ensures both optimization and user trust in automated decision-making processes. The framework generates comprehensible rules of the form "if condition, then predicted class", even in domains where noise is present. The proposed system transforms enterprise operations by automating the production of human-readable rules from structured data, resulting in increased efficiency and transparency. Removing the need for manual rule construction saves time and money while ensuring that users can readily check and trust the automatic judgments of the system. The framework achieves 99.85% accuracy and 96.30% precision, which further supports its effectiveness in translating complex data into comprehensible rules, ultimately empowering users and enhancing organizational decision-making processes.
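To make the "if condition, then class" rule form concrete, here is a minimal sketch of CART-style rule extraction in plain Python. It is not the authors' framework: the feature names (`amount`, `days_overdue`), the toy records, and all function names are hypothetical, and the tree is built with a simple exhaustive Gini-based threshold search under a depth limit.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    """Return (feature_index, threshold) that lowers weighted Gini, or None."""
    best, best_score = None, gini(labels)
    n = len(rows)
    for f in range(len(rows[0])):
        values = sorted({r[f] for r in rows})
        # Candidate thresholds are midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score:
                best, best_score = (f, t), score
    return best

def extract_rules(rows, labels, names, conditions=(), depth=0, max_depth=3):
    """Grow the tree recursively and emit one IF/THEN rule per leaf."""
    split = best_split(rows, labels) if depth < max_depth else None
    if split is None:
        majority = Counter(labels).most_common(1)[0][0]
        cond = " AND ".join(conditions) or "TRUE"
        return [f"IF {cond} THEN class = {majority}"]
    f, t = split
    rules = []
    for op, keep in (("<=", lambda v: v <= t), (">", lambda v: v > t)):
        part = [(r, y) for r, y in zip(rows, labels) if keep(r[f])]
        rules += extract_rules(
            [r for r, _ in part], [y for _, y in part], names,
            conditions + (f"{names[f]} {op} {t}",), depth + 1, max_depth,
        )
    return rules

# Hypothetical structured records: (amount, days_overdue) -> decision.
rows = [(100, 0), (250, 2), (900, 1), (1200, 30), (1500, 45), (300, 40)]
labels = ["approve", "approve", "approve", "review", "review", "review"]
rules = extract_rules(rows, labels, names=["amount", "days_overdue"])
for rule in rules:
    print(rule)
```

Each leaf of the tree becomes one rule, so an end user can read and check every decision path directly. That verifiability is the property the abstract contrasts with black-box models.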
Abstract: The amount of data available for decision making has increased tremendously in the age of the digital economy. Decision makers who fail to handle this data proficiently may make incorrect decisions and thereby harm their business. The task of efficiently and effectively extracting and classifying useful information from huge amounts of computational data is therefore of special importance. In this paper, we consider that the attributes of data can be both crisp and fuzzy. By examining suitable partial data, segments with different classes are formed; a multithreaded computation is then performed to generate crisp rules (if possible); finally, the fuzzy partition technique is employed to deal with the fuzzy attributes for classification. The rules generated in classifying the overall data can be used to gain more knowledge from the data collected.
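As an illustration of what a fuzzy partition of a numeric attribute can look like, the sketch below splits a value range into three overlapping fuzzy sets whose memberships sum to 1. The abstract does not specify the partition used in the paper, so the three-set layout, the linear membership shapes, and the names `fuzzy_partition` and `fuzzy_label` are assumptions made for this example only.

```python
def fuzzy_partition(x, lo, hi):
    """Memberships of x in 'low', 'medium', 'high' over the range [lo, hi].

    Assumed layout: 'low' peaks at lo, 'medium' at the midpoint, 'high' at hi,
    with linear slopes between neighbouring peaks.
    """
    mid = (lo + hi) / 2
    x = min(max(x, lo), hi)          # clamp values outside the range
    half = mid - lo
    low = max(0.0, (mid - x) / half)
    high = max(0.0, (x - mid) / half)
    medium = 1.0 - low - high        # the three memberships sum to 1
    return {"low": low, "medium": medium, "high": high}

def fuzzy_label(x, lo, hi):
    """Pick the fuzzy set with the largest membership for x."""
    memberships = fuzzy_partition(x, lo, hi)
    return max(memberships, key=memberships.get)

memberships = fuzzy_partition(80, 0, 100)
label = fuzzy_label(80, 0, 100)
print(memberships, label)
```

A crisp attribute would assign 80 to exactly one interval. The fuzzy version keeps the information that 80 is mostly "high" but still partly "medium".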
Abstract: This paper summarizes research results on washer and nut taxonomy and knowledge base design using fuzzy methodology. In particular, the theory of fuzzy membership functions, similarity matrices, and the operation of fuzzy inference play important roles. A realistic set of 25 washers and nuts is employed to conduct extensive experiments and simulations. The investigation includes a complete demonstration of engineering design. The results obtained from this feasibility study are very encouraging because they represent a lower bound on the performance, namely the correct recognition rate, that fuzzy methodology can achieve. This lower bound shows a high recognition rate even with noisy input patterns, robustness in terms of noise tolerance, and simplicity of hardware implementation. Possible future work is suggested in the conclusion.
Abstract: The dominance-based rough set approach (DRSA) permits the representation and analysis of all phenomena involving a monotonicity relationship between some measures or perceptions. DRSA also has some merits within granular computing, as it extends the paradigm of granular computing to ordered data, specifies a syntax and modality of information granules that are appropriate for dealing with ordered data, and enables computing with words and reasoning about ordered data. Granular computing with ordered data is a very general paradigm, because other modalities of information constraints, such as veristic, possibilistic and probabilistic modalities, also have to deal with ordered value sets (with qualifiers relative to grades of truth, possibility and probability), which gives DRSA a large area of application.
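The role of monotonicity in DRSA can be illustrated with a small sketch of the dominance relation and the lower approximation of an upward union of classes ("class at least t"). The data set, the criteria, and the function names are hypothetical. The definition used is the standard one: an object belongs to the lower approximation if every object that dominates it is also in class t or better.

```python
def dominates(y, x):
    """True if y is at least as good as x on every criterion."""
    return all(a >= b for a, b in zip(y, x))

def lower_approx_at_least(objects, t):
    """Objects that certainly belong to 'class >= t'.

    objects maps a name to (criteria_tuple, class_rank); higher is better.
    """
    certain = set()
    for name, (criteria, _) in objects.items():
        dominating = [n for n, (c, _) in objects.items() if dominates(c, criteria)]
        if all(objects[n][1] >= t for n in dominating):
            certain.add(name)
    return certain

# Hypothetical students: (math, physics) scores and an overall class rank.
students = {
    "a": ((3, 3), 3),
    "b": ((2, 3), 2),
    "c": ((2, 2), 3),   # inconsistent: b dominates c but has a worse class
    "d": ((1, 1), 1),
}
at_least_3 = lower_approx_at_least(students, 3)
at_least_2 = lower_approx_at_least(students, 2)
print(at_least_3, at_least_2)
```

Object `c` is excluded from the certain members of "class at least 3" because `b` scores at least as well on every criterion yet is ranked lower. This is the kind of monotonicity violation the approach is designed to expose.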
Abstract: Heart diagnosis is not always possible at every medical center, especially in rural areas, where less support and care are available owing to a lack of advanced heart diagnosis equipment. In addition, physician intuition and experience are not always sufficient to achieve high-quality results from medical procedures. Medical errors and undesirable results therefore create a need for unconventional computer-based diagnosis systems, which in turn reduce fatal medical errors, increase patient safety, and save lives. The proposed solution, which is based on Artificial Neural Networks (ANNs), provides a decision support system to identify three main heart diseases: mitral stenosis, aortic stenosis and ventricular septal defect. Furthermore, the system offers an encouraging opportunity to develop an operational screening and testing device for heart disease diagnosis and can provide great assistance to clinicians in making advanced heart diagnoses. Using real medical data, a series of experiments was conducted to examine the performance and accuracy of the proposed solution. The comparative results revealed that the system performance and accuracy are acceptable, with a heart disease classification accuracy of 92%.
Funding: supported by the National Natural Science Foundation of China (Nos. 71501047 and 61773123) and the Natural Science Foundation of Fujian Province (No. 2019J01647).
Abstract: Traditional Belief-Rule-Based (BRB) ensemble learning methods integrate all of the trained sub-BRB systems to obtain better results than a single belief-rule-based system. However, as the number of BRB systems participating in ensemble learning increases, a large number of redundant sub-BRB systems are generated because of the diminishing difference between subsystems. This drastically decreases the prediction speed and increases the storage requirements of BRB systems. To solve these problems, this paper proposes BRBCS-PAES: a selective ensemble learning approach for BRB Classification Systems (BRBCS) based on Pareto Archived Evolutionary Strategy (PAES) multi-objective optimization. This system employs the improved Bagging algorithm to train the base classifiers. To increase the degree of difference among the integrated base classifiers, the training set is constructed by repeated sampling of the data. In the base classifier selection stage, the trained base classifiers are binary coded, and the number of base classifiers participating in the integration and the generalization error of the base classifiers are used as the objective functions for multi-objective optimization. Finally, the elite retention strategy and the adaptive mesh algorithm are adopted to produce the PAES optimal solution set. Three experimental studies on classification problems are performed to verify the effectiveness of the proposed method. The comparison results demonstrate that the proposed method can effectively reduce the number of base classifiers participating in the integration and improve the accuracy of BRBCS.
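The selection stage described above, with its binary-coded subsets and two objectives, can be sketched as follows. This is an illustrative toy, not BRBCS-PAES: it enumerates every binary mask exhaustively instead of running the PAES evolutionary search. The three base classifiers are represented only by hypothetical prediction vectors (assumed to come from bagged models), and error is measured on a small validation set with strict-majority voting.

```python
from itertools import product

def objectives(mask, predictions, truth):
    """(ensemble size, error rate) for the classifiers selected by a binary mask."""
    chosen = [p for bit, p in zip(mask, predictions) if bit]
    wrong = 0
    for i, y in enumerate(truth):
        votes_for_truth = sum(1 for p in chosen if p[i] == y)
        # A sample counts as correct only with a strict majority of votes.
        if 2 * votes_for_truth <= len(chosen):
            wrong += 1
    return (len(chosen), wrong / len(truth))

def dominates(a, b):
    """Pareto dominance for minimisation: a is no worse everywhere, better somewhere."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(candidates):
    """Keep the (mask, objectives) pairs that no other candidate dominates."""
    return [
        (m, o) for m, o in candidates
        if not any(dominates(other, o) for _, other in candidates)
    ]

# Hypothetical validation labels and base-classifier predictions.
truth = [1, 0, 1, 1]
predictions = [
    [1, 0, 1, 0],
    [1, 0, 0, 1],
    [0, 0, 1, 1],
]
masks = [m for m in product((0, 1), repeat=len(predictions)) if any(m)]
candidates = [(m, objectives(m, predictions, truth)) for m in masks]
front = pareto_front(candidates)
for mask, (size, error) in front:
    print(mask, size, error)
```

Exhaustive enumeration is feasible only for a handful of classifiers, since the number of masks grows as 2^n. That growth is why a search strategy such as PAES is needed for realistic ensemble sizes.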
Abstract: The paper considers the problem of semantic processing of web documents by designing an approach that combines an extracted semantic document model with a domain-related knowledge base. The knowledge base is populated with learnt classification rules that categorize documents into topics. Classification reduces the dimensionality of the document feature space. The semantic model of the retrieved web documents is semantically labeled by querying a domain ontology and processed with a content-based classification method. The model obtained is mapped to the existing knowledge base by means of an inference algorithm, which enables models of the same semantic type to be recognized and integrated into the knowledge base. The approach supports domain knowledge integration and assists in the extraction and modeling of web document semantics. Implementation results of the proposed approach are presented.
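Abstract: Since the goal-based ship construction standards (goal based standard, GBS) methodology of the International Maritime Organization was successfully applied to the hull structural rules for oil tankers and bulk carriers of the International Association of Classification Societies, the GBS methodology has gradually been extended to rule development in all maritime fields and for all types of seagoing ships. This paper reviews the development of the GBS methodology in domestic and international research on rules for seagoing ships, analyzes in depth the technical characteristics of GBS hull structural rules, and on that basis proposes a framework for adopting the GBS approach in the China Classification Society's system of hull structural rules for seagoing ships.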
Abstract: To promote behavioral change among adolescents in Zambia, the National HIV/AIDS/STI/TB Council, in collaboration with UNICEF, developed the Zambia U-Report platform. This platform provides young people with improved access to information on various Sexual Reproductive Health topics through Short Messaging Service (SMS) messages. Over the years, the platform has accumulated millions of incoming and outgoing messages, which need to be categorized into key thematic areas for better tracking of sexual reproductive health knowledge gaps among young people. The current manual categorization process for these text messages is inefficient and time-consuming, and this study aims to automate the process for improved analysis using text-mining techniques. Firstly, the study investigates the current text message categorization process and identifies a list of categories adopted by counselors over time, which are then used to build and train a categorization model. Secondly, the study presents a proof-of-concept tool that automates the categorization of U-Report messages into key thematic areas using the developed categorization model. Finally, it compares the performance and effectiveness of the developed proof-of-concept tool against the manual system. The study used a dataset comprising 206,625 text messages. The current process would take roughly 2.82 years to categorise this dataset, whereas the trained SVM model would require only 6.4 minutes while achieving an accuracy of 70.4%, demonstrating that the automated method is significantly faster, more scalable, and more consistent than the current manual categorization. These advantages make the SVM model a more efficient and effective tool for categorizing large unstructured text datasets. These results and the proof-of-concept tool demonstrate the potential for enhancing the efficiency and accuracy of message categorization on the Zambia U-Report platform and other similar text-message-based platforms.
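The scale of the reported difference can be checked with a few lines of arithmetic. The sketch below uses only the three figures quoted in the abstract. The abstract does not state how the 2.82 years were measured, so treating them as continuous 365-day calendar years is an assumption of this example, and the variable names are illustrative.

```python
MESSAGES = 206_625
MANUAL_YEARS = 2.82        # reported manual estimate
SVM_MINUTES = 6.4          # reported SVM run time

# Assumption: a year is 365 days of continuous time.
MINUTES_PER_YEAR = 365 * 24 * 60

manual_minutes = MANUAL_YEARS * MINUTES_PER_YEAR
speedup = manual_minutes / SVM_MINUTES
manual_minutes_per_message = manual_minutes / MESSAGES
svm_messages_per_second = MESSAGES / (SVM_MINUTES * 60)

print(f"speed-up factor: {speedup:,.0f}x")
print(f"manual time per message: {manual_minutes_per_message:.2f} min")
print(f"SVM throughput: {svm_messages_per_second:.0f} messages/s")
```

Under that assumption the automated run is faster by a factor of roughly 2.3 × 10^5. If the manual estimate were instead based on working hours only, the factor would be smaller in proportion.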