Data mining is the process of extracting implicit but potentially useful information from incomplete, noisy, and fuzzy data. Data mining offers excellent nonlinear modeling and self-organized learning, and it can play a vital role in the interpretation of well logging data of complex reservoirs. We used data mining to identify the lithologies in a complex reservoir. The reservoir lithologies served as the classification task target and were identified using feature extraction, feature selection, and modeling of data streams. We used independent component analysis to extract information from well curves. We then used the branch-and-bound algorithm to look for the optimal feature subsets and eliminate redundant information. Finally, we used the C5.0 decision-tree algorithm to set up disaggregated models of the well logging curves. The modeling and actual logging data were in good agreement, showing the usefulness of data mining methods in complex reservoirs.
We studied multiple attribute decision-making problems with uncertain linguistic information, in which the preference values took the form of uncertain linguistic variables. We introduced some operational laws of uncertain linguistic variables and a formula for the comparison between two uncertain linguistic variables. We proposed two new aggregation operators, the extended uncertain linguistic aggregation (EULA) operator and the interval linguistic aggregation (ILA) operator, and then developed an EULA operator-based linguistic approach and an ILA operator-based linguistic approach, respectively, to multiple attribute decision making in an uncertain linguistic setting. The approaches were straightforward and did not produce any loss of information. Finally, an illustrative example was given to verify the developed approaches and to demonstrate their practicality and effectiveness.
Association rules and C4.5 rules can overcome the shortcomings of traditional land evaluation methods and improve the intelligibility and efficiency of land evaluation knowledge. In order to compare these two kinds of classification rules in application, two fuzzy classifiers were established by combining each kind of rule with a fuzzy decision algorithm, based on the Second General Soil Survey of Guangdong Province. The experimental results demonstrated that the fuzzy classifier based on association rules obtained a higher accuracy rate, but with a more complex calculation process and more computational overhead; the fuzzy classifier based on C4.5 rules obtained a slightly lower accuracy, but with faster computation and simpler calculation.
In order to solve the problems of potential incident rescue on expressway networks, the opportunity cost-based method is used to establish a resource dispatch decision model. The model aims to dispatch the rescue resources from the regional road networks and to obtain the location of the rescue depots and the numbers of service vehicles assigned for the potential incidents. Due to the computational complexity of the decision model, a scene decomposition algorithm is proposed. The algorithm decomposes the dispatch problem from various kinds of resources to a single resource, and determines the original scene of rescue resources based on the rescue requirements and the resource matrix. Finally, a convenient optimal dispatch scheme is obtained by decomposing each original scene and simplifying the objective function. To illustrate the application of the decision model and the algorithm, a case study of the expressway network in the areas around Nanjing, China, is presented, and the results show that the model used and the algorithm proposed are appropriate.
To overcome the limitation that complex data types with nominal attributes cannot be processed by rank learning algorithms, a new rank learning algorithm is designed. In the learning algorithm based on the decision tree, the splitting rule of the decision tree is revised with a new definition of rank impurity. A new rank learning algorithm, which can be intuitively explained, is obtained and its theoretical basis is provided. The experimental results show that, in terms of average rank loss, the ranking tree algorithm outperforms the perceptron ranking and ordinal regression algorithms, and it also converges faster. The rank learning algorithm based on the decision tree is able to process categorical data and select relevant features.
In this paper, we present a fuzzy linguistic scale, which is characterized by triangular fuzzy numbers on [1/9, 9], for the comparison between two alternatives, and introduce a possibility degree formula for comparing triangular fuzzy numbers. We utilize the fuzzy linguistic scale to construct a linguistic preference matrix, and propose a fuzzy induced ordered weighted geometric averaging (FIOWGA) operator to aggregate linguistic preference information. A method based on the fuzzy linguistic scale and FIOWGA operator for decision-making problems is presented. Finally, an illustrative example is given to verify the developed method and to demonstrate its feasibility and effectiveness.
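As a point of reference for the aggregation step, the crisp ordered weighted geometric (OWG) operator sorts the values in descending order and multiplies them, each raised to the weight of its position. The sketch below shows only that crisp analogue. It is not the FIOWGA operator of the paper, whose induced ordering and triangular fuzzy arithmetic are not spelled out in the abstract; the function name `owg` and the sample values are illustrative assumptions.

```python
import math


def owg(values, weights):
    """Crisp ordered weighted geometric average (illustrative only).

    values  : positive numbers, e.g. judgments on the [1/9, 9] scale
    weights : position weights, non-negative and summing to 1
    """
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    if any(v <= 0 for v in values):
        raise ValueError("values must be positive")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    # Weights attach to positions in the descending ordering,
    # not to the original arguments.
    ordered = sorted(values, reverse=True)
    return math.prod(b ** w for b, w in zip(ordered, weights))


# Hypothetical judgments on the [1/9, 9] scale with equal weights.
result = owg([4, 1], [0.5, 0.5])
```

With equal weights the operator reduces to the geometric mean, so the call above evaluates to 2 up to floating-point rounding.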
[Objective] The aim was to overcome the difficulty of building a land evaluation model when the impact factors take continuous values in the traditional land evaluation process, as well as to improve the intelligibility of the land evaluation knowledge. [Method] A land evaluation method combining classification rules extracted by the C4.5 algorithm with fuzzy decision was proposed in this study. [Result] Results on the Second General Soil Survey of Guangdong Province demonstrated that the method made it convenient to extract classification rules, and that, using only 100 rules, a quantity correct rate of 86.67% and an area correct rate of 84.80% could be obtained for land evaluation. [Conclusions] Using the C4.5 algorithm to obtain the rules, combined with a fuzzy decision algorithm to build classifiers, gave satisfactory results and provided a practical algorithm for land evaluation.
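The way a C4.5-style tree copes with continuous impact factors can be sketched as follows: each candidate threshold turns the factor into a binary test, and the split is scored by gain ratio (information gain divided by split information). This is a simplified illustration, not the authors' implementation; real C4.5 releases differ in how thresholds are picked and penalized, and the function names and sample data are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def gain_ratio(values, labels, threshold):
    """Gain ratio of the binary test `value <= threshold`."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    if not left or not right:
        return 0.0  # degenerate split carries no information
    n = len(labels)
    w_left, w_right = len(left) / n, len(right) / n
    gain = entropy(labels) - (w_left * entropy(left) + w_right * entropy(right))
    split_info = -(w_left * math.log2(w_left) + w_right * math.log2(w_right))
    return gain / split_info


def best_threshold(values, labels):
    """Return the observed value that gives the highest gain ratio, or None."""
    candidates = sorted(set(values))[:-1]  # the largest value cannot split
    if not candidates:
        return None
    return max(candidates, key=lambda t: gain_ratio(values, labels, t))


# Hypothetical continuous factor and land grades, for illustration only.
factor = [1.0, 2.0, 3.0, 4.0]
grade = ["low", "low", "high", "high"]
threshold = best_threshold(factor, grade)
```

A rule such as "if factor <= threshold then grade = low" can then be read off each branch, which is the kind of classification rule the fuzzy decision step would consume.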
For the dense macro-femto coexistence networks scenario, a long-term-based handover (LTBH) algorithm is proposed. The handover decision is jointly determined by the angle of handover (AHO) and the time-to-stay (TTS) to reduce the number of unnecessary handovers. First, the proposed AHO parameter is used to decrease the computational complexity in the multiple candidate base stations (CBSs) scenario. Then, two types of TTS parameters are given for the fixed base stations and mobile base stations to make handover decisions among multiple CBSs. The simulation results show that the proposed LTBH algorithm can not only maintain the required transmission rate of users, but also effectively reduce the number of unnecessary handovers in dense macro-femto networks with coexisting mobile BSs.
Some techniques and methods for deriving water information from SPOT-4 (XI) images were investigated and discussed in this paper. A decision tree (DT) classification algorithm, which includes several classifiers based on the spectral response characteristics of water bodies and other objects, was developed to delineate water bodies. Another decision tree classification algorithm, based on both spectral characteristics and auxiliary information from DEM and slope (DTDS), was also designed for water body extraction. In addition, the supervised method of maximum likelihood classification (MLC) and the unsupervised iterative self-organizing data analysis technique (ISODATA) were used to extract water bodies for comparison. An index was designed and used to assess the accuracy of the different methods adopted in the research. Results showed that water extraction accuracy varied with the technique applied. It was low using ISODATA, very high using the DT algorithm, and higher still using both DTDS and MLC.
In order to improve the generalization ability of binary decision trees, a new learning algorithm, the MMDT algorithm, is presented. Based on statistical learning theory, the generalization performance of binary decision trees is analyzed, and an assessment rule is proposed. Under the direction of the assessment rule, the MMDT algorithm is implemented. The algorithm maps training examples from the original space to a high-dimensional feature space, and constructs a decision tree in it. In the feature space, a new decision node splitting criterion, the max-min rule, is used, and the margin of each decision node is maximized using a support vector machine to improve the generalization performance. Experimental results show that the new learning algorithm is much superior to others such as C4.5 and OC1.
Controller vulnerabilities allow malicious actors to disrupt or hijack Software-Defined Networking. Traditionally, the mappings between the control plane and the data plane are static, so adversaries have plenty of time to exploit the controller's vulnerabilities and launch attacks wisely. We believe that dynamically altering such static mappings is a promising approach to alleviate this issue, since a moving target is difficult to compromise even for skilled adversaries. It is critical to determine the right time to conduct scheduling and to balance the overhead incurred against the security level guaranteed. Little previous work has investigated the economical timing of scheduling in dynamic-scheduling controllers. In this paper, we take the first step to study, both theoretically and experimentally, the scheduling-timing problem in a dynamic control plane. We model this problem as a renewal reward process and propose an optimal algorithm for deciding the right time to schedule, with the objective of minimizing the long-term loss rate. In our experiments, simulations based on real network attack datasets are conducted, and we demonstrate that our proposed algorithm outperforms the given scheduling schemes.
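For readers unfamiliar with the modeling tool named above, the standard renewal reward theorem is what makes a "long-term loss rate" well defined. The statement below is the textbook result, not the specific model of the paper; the symbols are assumptions introduced here: $T_i$ is the length of the $i$-th cycle between scheduling actions, $L_i$ is the loss accrued in that cycle, and $L(t)$ is the total loss up to time $t$, with the pairs $(T_i, L_i)$ independent and identically distributed, $0 < \mathbb{E}[T_1] < \infty$, and $\mathbb{E}|L_1| < \infty$.

```latex
\lim_{t \to \infty} \frac{L(t)}{t} \;=\; \frac{\mathbb{E}[L_1]}{\mathbb{E}[T_1]}
\qquad \text{almost surely.}
```

Under these assumptions, choosing when to schedule amounts to choosing the cycle length distribution that minimizes the ratio of expected loss per cycle to expected cycle length.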
A new method based on rough set theory and genetic algorithm was proposed to predict the rock burst proneness. Nine influencing factors were first selected, and then the decision table was set up. Attributes were reduced by genetic algorithm. Rough set was used to extract the simplified decision rules of rock burst proneness. Taking a practical engineering project as an example, the rock burst proneness was evaluated and predicted by the decision rules. Comparing the prediction results with the actual results shows that the proposed method is feasible and effective.
In machine learning, randomness is a crucial factor in the success of ensemble learning, and it can be injected into tree-based ensembles by rotating the feature space. However, it is a common practice to rotate the feature space randomly. Thus, a large number of trees are required to ensure the performance of the ensemble model. This random rotation method is theoretically feasible, but it requires massive computing resources, potentially restricting its applications. A multimodal genetic algorithm based rotation forest (MGARF) algorithm is proposed in this paper to solve this problem. It is a tree-based ensemble learning algorithm for classification, taking advantage of the characteristic of trees to inject randomness by feature rotation. However, this algorithm attempts to select a subset of more diverse and accurate base learners using the multimodal optimization method. The classification accuracy of the proposed MGARF algorithm was evaluated by comparing it with the original random forest and random rotation ensemble methods on 23 UCI classification datasets. Experimental results show that the MGARF method outperforms the other methods, and the number of base learners in MGARF models is much smaller.
Internet traffic classification is vital to the areas of network operation and management. Traditional classification methods such as port mapping and payload analysis are becoming increasingly difficult as newly emerged applications (e.g., peer-to-peer) use dynamic port numbers, masquerading techniques, and encryption to avoid detection. This paper presents a machine learning (ML) based traffic classification scheme, which offers solutions to a variety of network activities and provides a platform of performance evaluation for the classifiers. The impact of dataset size, feature selection, number of application types, and ML algorithm selection on classification performance is analyzed and demonstrated by the following experiments: (1) The genetic algorithm based feature selection can dramatically reduce the cost without diminishing classification accuracy. (2) The chosen ML algorithms can achieve high classification accuracy. In particular, REPTree and C4.5 outperform the other ML algorithms when computational complexity and accuracy are both taken into account. (3) A larger dataset and fewer application types result in better classification accuracy. Finally, early detection with only several initial packets is proposed for real-time network activity, and it is shown to be feasible according to the preliminary results.
AIM: To assess the usefulness of FibroTest to forecast scores by constructing decision trees in patients with chronic hepatitis C. METHODS: We used the C4.5 classification algorithm to construct decision trees with data from 261 patients with chronic hepatitis C without a liver biopsy. The FibroTest attributes of age, gender, bilirubin, apolipoprotein, haptoglobin, α2 macroglobulin, and γ-glutamyl transpeptidase were used as predictors, and the FibroTest score as the target. For testing, a 10-fold cross validation was used. RESULTS: The overall classification error was 14.9% (accuracy 85.1%). Cases with true FibroTest scores of F0 and F4 were classified with very high accuracy (18/20 for F0, 9/9 for F0-1, and 92/96 for F4), and the largest confusion centered on F3. The algorithm produced a set of compound rules from the ten classification trees, which was used to classify the 261 patients. The rules for the classification of patients in F0 and F4 were effective in more than 75% of the cases in which they were tested. CONCLUSION: The recognition of clinical subgroups should help to enhance our ability to assess differences in fibrosis scores in clinical studies and improve our understanding of fibrosis progression.
A cost-based selective maintenance decision-making method was presented. The purpose of this method was to find an optimal choice of maintenance actions to be performed on a selected group of machines for manufacturing systems. The arithmetic reduction of intensity model was introduced to describe the influence of different maintenance actions (preventive maintenance, minimal repair, and overhaul) on machine failure intensity. In addition, a resolution algorithm combining greedy heuristic rules with a genetic algorithm was provided. Finally, a case study of the maintenance decision-making problem of an automobile workshop was given, which demonstrated the practicability of this method.
This paper describes the architecture of a tool called DiagData. This tool aims to use a large amount of data and information in the field of plant disease diagnostics to generate a disease predictive system. In this approach, data mining techniques are used to extract knowledge from existing data. The knowledge is extracted in the form of rules that are used in the development of a predictive intelligent system. Currently, these rules are specified either by an expert or by data mining. When data mining is applied to a large database, the generated rule set is also very complex. The main goal of this work is to minimize the rule generation time. The proposed tool extracts knowledge automatically or semi-automatically from a database and uses it to build an intelligent system for disease prediction. In this work, the decision tree learning algorithm was used to generate the rules. A toolbox called Fuzzygen was used to generate a prediction system from the rules generated by the decision tree algorithm. The software was implemented in Java. DiagData has been used in disease prediction and diagnosis systems and in the validation of economic and environmental indicators in agricultural production systems. The validation process involved measuring the time spent by an expert to enter the rules and comparing it with the time needed to insert the same rules with the proposed tool. The tool was successfully validated, providing a reduction in time.
For the last two decades, there has been ongoing research on international new ventures (INVs), or born global (BG) companies, which rapidly enter foreign markets. They face challenges connected with their marketing activity, because they launch relatively more product innovations in a shorter time than gradually internationalized companies (GRAD). The entrepreneurial marketing (EM) concept could become a solution to some of these challenges, because of its greater entrepreneurial intensity (EI) and a different decision-making approach than the "classical" marketing concept. This study's aim is to analyze the EM concept and the application of its elements by INVs originating from Poland. Based on two computer-aided telephone interview (CATI) studies of INVs from the Polish industrial processing sector, the central elements of EM applied by them are explored, together with their relationship to INV performance. As is shown, the INVs introduce significantly more product innovations than the gradually internationalized small and medium-sized enterprises (SMEs). They often exceed competitors in the speed of launching innovations and are flexible in entering new markets. The entrepreneurial orientation (EO) indicators are at low to medium levels in all studied SMEs. However, the propensity to risk is slightly stronger in the INVs and is moderately correlated with financial performance. As the study shows, a lack of emphasis on marketing planning and information gathering is characteristic of the Polish INVs, which may testify to their effectual approach to decision making.
Furthermore, as with foreign-based INVs, there may exist a relationship between the application of the EM concept and the performance of the Polish INVs, which, however, requires further study with respect to some mediating factors. It has been concluded that innovativeness of the product offering and propensity to risk seem to be the characteristic EM concept elements accompanying the rapid internationalization of INVs. Future research should focus on other elements of the EM-mix applied by INVs originating from emerging economies.
Product analytics is a blend of computational methods with the express purpose of facilitating the multifaceted process of decision-making based on demographic and consumer preferences. This complex subject is derived from consensus theory and includes structured analytics, categories, and the combination of evidence. The methodology is applicable to a wide range of business, economic, social, political, and strategic decisions. The paper describes a product allocation application to demonstrate the concepts.
Collaboration in wireless sensor systems must be fault-tolerant due to the harsh environmental conditions in which such systems can be deployed. This paper focuses on finding signal processing algorithms for collaborative target detection based on the generalized approach to signal processing (GASP) in the presence of noise. The signal processing algorithms are efficient in terms of communication cost, precision, accuracy, and the number of faulty sensors tolerable in the wireless sensor system. Two types of generalized signal processing algorithms, namely value fusion and decision fusion, constructed according to GASP in the presence of noise, are identified first. When their performance and communication overhead are compared, the decision fusion algorithm is found to become superior to the value fusion algorithm as the ratio of faulty sensors to fault-free sensors increases. Using GASP to design the value and decision fusion algorithms in wireless sensor systems yields the same performance as the universally adopted signal processing algorithms widely used in practice, but at lower values of signal energy.
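The difference between the two fusion styles can be illustrated with a generic sketch: value fusion combines raw readings before a single threshold test, while decision fusion lets each sensor decide locally and then votes. This is not the GASP-based construction of the paper, which the abstract does not detail; the function names, the plain averaging, the strict-majority vote, and the sample readings are all assumptions made for illustration.

```python
def value_fusion(readings, threshold):
    """Average the raw readings, then apply one detection threshold."""
    return sum(readings) / len(readings) >= threshold


def decision_fusion(readings, threshold):
    """Threshold each reading locally, then take a strict majority vote."""
    votes = sum(1 for r in readings if r >= threshold)
    return 2 * votes > len(readings)


# Hypothetical scenario: no target present, four healthy sensors report
# low noise, and one faulty sensor reports an arbitrarily large value.
readings = [0.1, 0.1, 0.1, 0.1, 100.0]
threshold = 1.0

false_alarm_value = value_fusion(readings, threshold)        # True
false_alarm_decision = decision_fusion(readings, threshold)  # False
```

In this toy case the single faulty reading drags the average over the threshold, whereas it contributes only one vote out of five to the decision fusion result, which is consistent with the abstract's observation that decision fusion gains the advantage as the share of faulty sensors grows. Decision fusion also needs only one bit per sensor, which bears on the communication overhead comparison.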
Funding (lithology identification in complex reservoirs by data mining): Sponsored by the National Science and Technology Major Project (No. 2011ZX05023-005-006).
Funding (comparison of association-rule and C4.5-rule fuzzy classifiers for land evaluation): Supported by the Science and Technology Plan Project of Guangdong Province (2009B010900026, 2009CD058, 2009CD078, 2009CD079, 2009CD080); the Special Funds for Support Program of Development of Modern Information Service Industry of Guangdong Province (06120840B0370124); and the Fund Project of South China Agricultural University (2007K017).
Funding (expressway rescue resource dispatch decision model): The National Natural Science Foundation of China (No. 50422283); the Science and Technology Key Plan Project of Henan Province (No. 072102360060).
Funding (decision-tree-based rank learning algorithm): The Planning Program of Science and Technology of Hunan Province (No. 05JT1039).
Funding (fuzzy linguistic scale and FIOWGA operator): The National Natural Science Foundation of China (79970093); the Ph.D. Dissertation Foundation of Southeast University-NARI-Relays Electric Co. Ltd.
Funding (land evaluation with C4.5 rules and fuzzy decision): Supported by the Science and Technology Plan Project of Guangdong Province (2009B010900026, 2009CD058, 2009CD078, 2009CD079, 2009CD080); the Special Funds for Support Program of Development of Modern Information Service Industry of Guangdong Province (06120840B0370124); and the Fund Project of South China Agricultural University (2007K017).
Funding (LTBH handover algorithm for macro-femto networks): The National Natural Science Foundation of China (No. 61471164); the Fundamental Research Funds for the Central Universities; the Scientific Innovation Research of College Graduates in Jiangsu Province (No. KYLX-0133).
Abstract: For the dense macro-femto coexistence network scenario, a long-term-based handover (LTBH) algorithm is proposed. The handover decision is jointly determined by the angle of handover (AHO) and the time-to-stay (TTS) to reduce the number of unnecessary handovers. First, the proposed AHO parameter is used to decrease the computational complexity in the scenario with multiple candidate base stations (CBSs). Then, two types of TTS parameters are given for fixed base stations and mobile base stations to make handover decisions among multiple CBSs. The simulation results show that the proposed LTBH algorithm can not only maintain the required transmission rate of users, but also effectively reduce the number of unnecessary handovers in dense macro-femto networks with coexisting mobile BSs.
Abstract: Some techniques and methods for deriving water information from SPOT-4 (XI) images were investigated and discussed in this paper. A decision tree (DT) classification algorithm, which includes several classifiers based on the spectral response characteristics of water bodies and other objects, was developed to delineate water bodies. Another decision tree classification algorithm based on both spectral characteristics and auxiliary information of DEM and slope (DTDS) was also designed for water body extraction. In addition, the supervised method of maximum likelihood classification (MLC) and the unsupervised iterative self-organizing data analysis technique (ISODATA) were used to extract water bodies for comparison. An index was designed and used to assess the accuracy of the different methods adopted in the research. The results show that water extraction accuracy varied with the technique applied. It was low using ISODATA, very high using the DT algorithm, and much higher using both DTDS and MLC.
Abstract: In order to improve the generalization ability of binary decision trees, a new learning algorithm, the MMDT algorithm, is presented. Based on statistical learning theory, the generalization performance of binary decision trees is analyzed and an assessment rule is proposed. Under the direction of the assessment rule, the MMDT algorithm is implemented. The algorithm maps training examples from the original space to a high-dimensional feature space and constructs a decision tree in it. In the feature space, a new decision node splitting criterion, the max-min rule, is used, and the margin of each decision node is maximized using a support vector machine to improve the generalization performance. Experimental results show that the new learning algorithm is much superior to others such as C4.5 and OC1.
Funding: Supported by the Foundation for Innovative Research Groups of the National Natural Science Foundation of China (No. 61521003); the National Key R&D Program of China (No. 2016YFB0800101); the National Science Foundation for Distinguished Young Scholars of China (No. 61602509); the Henan Province Key Technologies R&D Program of China (No. 172102210615).
Abstract: Controller vulnerabilities allow malicious actors to disrupt or hijack Software-Defined Networking. Traditionally, the mappings between the control plane and the data plane are static, so adversaries have plenty of time to exploit the controller's vulnerabilities and launch well-planned attacks. We believe that dynamically altering such static mappings is a promising approach to alleviating this issue, since a moving target is difficult to compromise even for skilled adversaries. It is critical to determine the right time to conduct scheduling and to balance the overhead incurred against the security level guaranteed. Little previous work has investigated the economical scheduling time in dynamic-scheduling controllers. In this paper, we take the first step to study the scheduling-timing problem in a dynamic control plane both theoretically and experimentally. We model this problem as a renewal reward process and propose an optimal algorithm for deciding the right time to schedule, with the objective of minimizing the long-term loss rate. In our experiments, simulations based on real network attack datasets are conducted, and we demonstrate that our proposed algorithm outperforms the given scheduling schemes.
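To make the renewal reward formulation concrete, here is a minimal sketch under a loss model that is assumed for illustration and is not taken from the paper: the controller mapping is rescheduled every `period` time units at a fixed `overhead`, the time until a successful attack is exponential with rate `attack_rate`, and loss accrues at `loss_per_time` from compromise until the next rescheduling. By the renewal reward theorem, the long-run loss rate is the expected loss in one cycle divided by the cycle length.

```python
import math


def long_run_loss_rate(period, overhead, loss_per_time, attack_rate):
    """Renewal reward ratio: expected loss in one cycle divided by the cycle length."""
    # Expected compromised time per cycle: E[max(period - X, 0)] with X ~ Exp(attack_rate).
    compromised = period - (1.0 - math.exp(-attack_rate * period)) / attack_rate
    return (overhead + loss_per_time * compromised) / period


def best_period(overhead, loss_per_time, attack_rate, candidates):
    """Pick the candidate period with the smallest long-run loss rate (plain grid search)."""
    return min(
        candidates,
        key=lambda t: long_run_loss_rate(t, overhead, loss_per_time, attack_rate),
    )


# Hypothetical parameter values, chosen only to show the trade-off.
candidates = [0.5 * k for k in range(1, 201)]
period = best_period(overhead=1.0, loss_per_time=5.0, attack_rate=0.2, candidates=candidates)
```

Scheduling too often is dominated by the overhead term, while scheduling too rarely lets the compromised time grow, so the minimum lies strictly between the two extremes of the candidate grid.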
Funding: Supported by the Youth Science Foundation of North China University of Water Conservancy and Electric Power (HSQJ2009016).
Abstract: A new method based on rough set theory and a genetic algorithm was proposed to predict rock burst proneness. Nine influencing factors were first selected, and then the decision table was set up. Attributes were reduced by the genetic algorithm. Rough set theory was used to extract simplified decision rules of rock burst proneness. Taking a practical engineering project as an example, the rock burst proneness was evaluated and predicted by the decision rules. Comparison of the prediction results with the actual results shows that the proposed method is feasible and effective.
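Attribute reduction over a decision table can be illustrated with the standard rough-set dependency degree, the fraction of objects whose condition-attribute values determine the decision unambiguously. The sketch below is a generic illustration rather than the authors' procedure: the three attributes, the table contents, and the function name are hypothetical, and the genetic search over attribute subsets is not shown.

```python
from collections import defaultdict


def dependency_degree(rows, condition_attrs, decision_attr):
    """Share of objects lying in the positive region of the decision w.r.t. the attributes."""
    blocks = defaultdict(list)
    for row in rows:
        # Objects with identical condition values are indiscernible.
        key = tuple(row[a] for a in condition_attrs)
        blocks[key].append(row[decision_attr])
    # A block is consistent when all of its objects share one decision value.
    positive = sum(len(ds) for ds in blocks.values() if len(set(ds)) == 1)
    return positive / len(rows)


# Hypothetical decision table with three condition attributes and one decision.
table = [
    {"stress": "high", "brittleness": "high", "depth": "deep", "burst": "strong"},
    {"stress": "high", "brittleness": "low", "depth": "deep", "burst": "weak"},
    {"stress": "low", "brittleness": "high", "depth": "shallow", "burst": "weak"},
    {"stress": "low", "brittleness": "low", "depth": "shallow", "burst": "none"},
    {"stress": "high", "brittleness": "high", "depth": "shallow", "burst": "strong"},
    {"stress": "low", "brittleness": "low", "depth": "deep", "burst": "none"},
]

full = dependency_degree(table, ["stress", "brittleness", "depth"], "burst")
reduced = dependency_degree(table, ["stress", "brittleness"], "burst")
stress_only = dependency_degree(table, ["stress"], "burst")
```

In this toy table, dropping `depth` leaves the dependency degree unchanged, so `depth` is redundant; a search procedure such as a genetic algorithm could use this quantity as part of its fitness when looking for small attribute subsets.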
Funding: Project (61603274) supported by the National Natural Science Foundation of China; Project (2017KJ249) supported by the Research Project of Tianjin Municipal Education Commission, China.
Abstract: In machine learning, randomness is a crucial factor in the success of ensemble learning, and it can be injected into tree-based ensembles by rotating the feature space. However, it is common practice to rotate the feature space randomly, so a large number of trees is required to ensure the performance of the ensemble model. This random rotation method is theoretically feasible, but it requires massive computing resources, potentially restricting its applications. A multimodal genetic algorithm based rotation forest (MGARF) algorithm is proposed in this paper to solve this problem. It is a tree-based ensemble learning algorithm for classification that takes advantage of the characteristic of trees to inject randomness by feature rotation. It also attempts to select a subset of more diverse and accurate base learners using the multimodal optimization method. The classification accuracy of the proposed MGARF algorithm was evaluated by comparing it with the original random forest and random rotation ensemble methods on 23 UCI classification datasets. Experimental results show that the MGARF method outperforms the other methods and that MGARF models need far fewer base learners.
Funding: Supported by the National High Technology Research and Development Programme of China (No. 2005AA121620, 2006AA01Z232); the Zhejiang Provincial Natural Science Foundation of China (No. Y1080935); the Research Innovation Program for Graduate Students in Jiangsu Province (No. CX07B_ 110zF).
Abstract: Internet traffic classification is vital to network operation and management. Traditional classification methods such as port mapping and payload analysis are becoming increasingly difficult to apply, as newly emerged applications (e.g., peer-to-peer) use dynamic port numbers, masquerading techniques, and encryption to avoid detection. This paper presents a machine learning (ML) based traffic classification scheme, which offers solutions to a variety of network activities and provides a platform for evaluating the performance of the classifiers. The impact of dataset size, feature selection, number of application types, and ML algorithm selection on classification performance is analyzed and demonstrated by the following experiments: (1) Genetic algorithm based feature selection can dramatically reduce the cost without diminishing classification accuracy. (2) The chosen ML algorithms can achieve high classification accuracy. In particular, REPTree and C4.5 outperform the other ML algorithms when computational complexity and accuracy are both taken into account. (3) A larger dataset and fewer application types result in better classification accuracy. Finally, early detection using only the first several packets is proposed for real-time network activity, and the preliminary results indicate that it is feasible.
Funding: Supported by a grant of the Universidad Nacional Autonoma de Mexico, SDI.PTID.05.6.
Abstract: AIM: To assess the usefulness of FibroTest to forecast scores by constructing decision trees in patients with chronic hepatitis C. METHODS: We used the C4.5 classification algorithm to construct decision trees with data from 261 patients with chronic hepatitis C without a liver biopsy. The FibroTest attributes of age, gender, bilirubin, apolipoprotein, haptoglobin, α2 macroglobulin, and γ-glutamyl transpeptidase were used as predictors, and the FibroTest score as the target. For testing, 10-fold cross validation was used. RESULTS: The overall classification error was 14.9% (accuracy 85.1%). FibroTest cases with true scores of F0 and F4 were classified with very high accuracy (18/20 for F0, 9/9 for F0-1 and 92/96 for F4), and the largest confusion centered on F3. The algorithm produced a set of compound rules out of the ten classification trees, which were used to classify the 261 patients. The rules for the classification of patients in F0 and F4 were effective in more than 75% of the cases in which they were tested. CONCLUSION: The recognition of clinical subgroups should help to enhance our ability to assess differences in fibrosis scores in clinical studies and improve our understanding of fibrosis progression.
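The evaluation protocol named in the abstract, 10-fold cross validation over 261 patients, can be sketched as follows. This is an illustrative index splitter only; the function name, the shuffling seed, and the unstratified assignment are assumptions, and the study's own tooling may have partitioned the data differently.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train, test) index lists so that every sample is tested exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices round-robin into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, test


splits = list(k_fold_indices(261, k=10))
fold_sizes = [len(test) for _, test in splits]
```

With 261 samples, one fold holds 27 patients and the other nine hold 26 each; a tree is trained on each training set and scored on the held-out fold, and the per-fold errors are combined into the overall classification error.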
Funding: Project (51105141, 51275191) supported by the National Natural Science Foundation of China; Project (2009AA043301) supported by the National High Technology Research and Development Program of China; Project (2012TS073) supported by the Fundamental Research Funds for the Central University of HUST, China.
Abstract: A cost-based selective maintenance decision-making method was presented. The purpose of this method was to find an optimal choice of maintenance actions to be performed on a selected group of machines in manufacturing systems. The arithmetic reduction of intensity model was introduced to describe the influence of different maintenance actions (preventive maintenance, minimal repair, and overhaul) on machine failure intensity. A resolution algorithm combining greedy heuristic rules with a genetic algorithm was also provided. Finally, a case study of the maintenance decision-making problem of an automobile workshop was given, which demonstrates the practicability of the method.
Abstract: This paper describes the architecture of a tool called DiagData. The tool aims to use the large amount of data and information available in the field of plant disease diagnosis to generate a disease prediction system. In this approach, data mining techniques are used to extract knowledge from existing data. The knowledge is extracted in the form of rules that are used in the development of an intelligent prediction system. Currently, these rules are specified either by an expert or by data mining. When data mining is applied to a large database, the generated rules are numerous and complex. The main goal of this work is to minimize the rule generation time. DiagData extracts knowledge automatically or semi-automatically from a database and uses it to build an intelligent system for disease prediction. In this work, a decision tree learning algorithm was used to generate the rules. A toolbox called Fuzzygen was used to generate a prediction system from the rules generated by the decision tree algorithm. The software was implemented in Java. DiagData has been used in disease prediction and diagnosis systems and in the validation of economic and environmental indicators in agricultural production systems. The validation process involved measuring and comparing the time an expert spent entering the rules with the time needed to insert the same rules using the proposed tool. The tool was successfully validated, providing a reduction in time.
Abstract: For the last two decades, there has been ongoing research concerning international new ventures (INVs), or born global (BG) companies, which enter foreign markets rapidly. They face challenges connected with their marketing activity, because they launch relatively more product innovations in a shorter time than gradually internationalized (GRAD) companies. The entrepreneurial marketing (EM) concept could become a solution to some of these challenges, because of its greater entrepreneurial intensity (EI) and a decision-making approach different from that of the "classical" marketing concept. This study aims to analyze the EM concept and the application of its elements by INVs originating from Poland. Based on two computer-aided telephone interview (CATI) studies of INVs from the Polish industrial processing sector, the central elements of EM applied by them are explored, together with their relationship to INV performance. As is shown, the INVs introduce significantly more product innovations than the gradually internationalized small and medium-sized enterprises (SMEs). They often exceed competitors in the speed of launching innovations and are flexible in entering new markets. The entrepreneurial orientation (EO) indicators are at low to medium levels in all studied SMEs. However, the propensity to risk is slightly stronger in the INVs and is moderately correlated with financial performance. As the study shows, a lack of emphasis on marketing planning and information gathering is characteristic of the Polish INVs, which may testify to their effectual approach to decision making. Furthermore, as in the foreign-based INVs, there may exist a relationship between the application of the EM concept and the performance of the Polish INVs, which, however, requires further study with respect to some mediating factors. It has been concluded that innovativeness of the product offering and propensity to risk seem to be the characteristic EM concept elements accompanying the rapid internationalization of INVs. Future research should focus on other elements of the EM mix applied by INVs originating from emerging economies.
Abstract: Product analytics is a blend of computational methods with the express purpose of facilitating the multifaceted process of decision-making based on demographic and consumer preferences. This complex subject is derived from consensus theory and includes structured analytics, categories, and the combination of evidence. The methodology is applicable to a wide range of business, economic, social, political, and strategic decisions. The paper describes a product allocation application to demonstrate the concepts.
Abstract: Collaboration in wireless sensor systems must be fault-tolerant because of the harsh environmental conditions in which such systems can be deployed. This paper focuses on finding signal processing algorithms for collaborative target detection based on the generalized approach to signal processing (GASP) in the presence of noise. The signal processing algorithms are efficient in terms of communication cost, precision, accuracy, and the number of faulty sensors tolerable in wireless sensor systems. Two types of generalized signal processing algorithms, namely value fusion and decision fusion, constructed according to GASP in the presence of noise, are identified first. When their performance and communication overhead are compared, the decision fusion algorithm is found to become superior to the value fusion algorithm as the ratio of faulty sensors to fault-free sensors increases. Using GASP to design the value and decision fusion algorithms in wireless sensor systems allows us to obtain the same performance as the universally adopted signal processing algorithms widely used in practice, but at lower values of signal energy.
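The difference between the two fusion styles can be shown with a minimal generic sketch. It does not implement the GASP-based detectors of the paper; the threshold, the majority rule, and the sensor readings are hypothetical and chosen only to show why a faulty sensor affects the two schemes differently.

```python
def value_fusion(readings, threshold):
    """Average the raw readings, then compare the mean with the threshold."""
    return sum(readings) / len(readings) >= threshold


def decision_fusion(readings, threshold):
    """Each sensor decides locally; declare a target if at least half vote 'detected'."""
    votes = sum(1 for r in readings if r >= threshold)
    return 2 * votes >= len(readings)


# Hypothetical case with no target present: four healthy sensors report noise,
# and one faulty sensor reports an extreme value.
readings = [0.1, 0.2, 0.15, 0.05, 50.0]
value_alarm = value_fusion(readings, threshold=1.0)
decision_alarm = decision_fusion(readings, threshold=1.0)
```

Here the single extreme reading drags the average above the threshold, so value fusion raises a false alarm, while decision fusion counts it as only one vote out of five. Decision fusion also needs just one bit per sensor, which is where its lower communication overhead comes from.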