Funding: National Natural Science Foundation of China (Grant No. 61877045).
Abstract: The dendritic cell algorithm (DCA) is an excellent prototype for developing machine learning methods inspired by the function of the natural immune system. Its many parameters increase complexity and have drawn considerable criticism of the signal fusion procedure of DCA. The loss function of DCA is ambiguous due to its complexity. To reduce the uncertainty, several researchers simplified the algorithm; some introduced gradient descent to optimize parameters; some used search methods to find the optimal parameter combination. However, these approaches are either time-consuming or need to be revised in the case of non-convex functions. To overcome these problems, this study models parameter optimization as a black-box optimization problem that requires no information about the loss function. This study hybridizes Bayesian optimization hyperband (BOHB) with DCA to propose a novel DCA version, BHDCA, for parameter optimization in the signal fusion process. BHDCA uses the Bayesian optimization (BO) component of BOHB to find promising parameter configurations and applies the hyperband component of BOHB to allocate a suitable budget to each candidate configuration. The experimental results show that the proposed algorithm has significant advantages over other DCA extension algorithms in terms of signal fusion.
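The budget allocation idea behind hyperband can be illustrated with successive halving, the routine that hyperband runs repeatedly. The sketch below is not the BHDCA implementation: the function name `successive_halving`, the toy loss `toy_loss`, and the parameters `min_budget` and `eta` are assumptions made for illustration, and the candidate configurations are supplied by hand rather than proposed by a Bayesian optimization model as in BOHB.

```python
def successive_halving(configs, evaluate, min_budget=1, eta=3):
    """Keep the best 1/eta of the configurations each round and
    multiply the per-configuration budget by eta (lower loss is better)."""
    survivors = list(configs)
    budget = min_budget
    while len(survivors) > 1:
        # Score every surviving configuration at the current budget.
        scored = sorted(survivors, key=lambda c: evaluate(c, budget))
        keep = max(1, len(scored) // eta)
        survivors = scored[:keep]
        budget *= eta  # survivors earn a larger budget in the next round
    return survivors[0]


def toy_loss(config, budget):
    # Hypothetical stand-in for a DCA run: the loss shrinks as the budget grows.
    return (config - 0.3) ** 2 + 1.0 / budget


candidates = [i / 10 for i in range(10)]  # hypothetical parameter values 0.0 .. 0.9
best = successive_halving(candidates, toy_loss)
```

Cheap, low-budget evaluations discard most configurations early, so the expensive, high-budget evaluations are spent only on the few candidates that remain.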
Abstract: Bayesian networks are a powerful class of graphical decision models used to represent causal relationships among variables. However, the reliability and integrity of learned Bayesian network models depend heavily on the quality of incoming data streams. One of the primary challenges with Bayesian networks is their vulnerability to adversarial data poisoning attacks, in which malicious data is injected into the training dataset to negatively influence the models and impair their performance. In this paper, we propose an efficient framework for detecting data poisoning attacks against Bayesian network structure learning algorithms. Our framework uses latent variables to quantify the amount of belief between every two nodes in each causal model over time. Considering four different forms of data poisoning attacks, we specifically aim to strengthen the security and dependability of Bayesian network structure learning techniques such as the PC algorithm. In doing so, we explore the complexity of this area and offer workable methods for identifying and mitigating these threats. Additionally, we investigate one particular use case, the "Visit to Asia Network," and explore the practical consequences of using uncertainty to spot cases of data poisoning. Our results demonstrate the promising efficacy of latent variables in detecting and mitigating the threat of data poisoning attacks. Our proposed latent-based framework also proves to be sensitive in detecting malicious data poisoning attacks in the context of stream data.
Abstract: The topic of this article is one-sided hypothesis testing for disparity, i.e., testing whether the mean of one group is larger than that of another, when there is uncertainty as to which group a datum is drawn from. For each datum, the uncertainty is captured with a given discrete probability distribution over the groups. Such situations arise, for example, in the use of Bayesian imputation methods to assess race and ethnicity disparities with certain insurance, health, and financial data. A widely used method to implement this assessment is the Bayesian Improved Surname Geocoding (BISG) method, which assigns a discrete probability over six race/ethnicity groups to an individual given the individual's surname and address location. Using a Bayesian framework and Markov Chain Monte Carlo sampling from the joint posterior distribution of the group means, the probability of a disparity hypothesis is estimated. Four methods are developed and compared with an illustrative data set. Three of these methods are implemented in R and one in WinBUGS. These methods are programmed for any number of groups between two and six inclusive. All the code is provided in the appendices.
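To make the setting concrete, the following sketch estimates the probability that one group mean exceeds another when each datum carries only a probability vector over groups. It is a deliberately simplified repeated-imputation Monte Carlo, not one of the four methods of the article and not an MCMC sampler from the joint posterior: group labels are drawn from the given probabilities alone, without being updated by the observed values. The function name `disparity_probability` and the toy data are assumptions.

```python
import random


def disparity_probability(values, probs, a, b, draws=2000, seed=0):
    """Fraction of imputed group assignments in which mean(group a) > mean(group b)."""
    rng = random.Random(seed)
    k = len(probs[0])
    groups = list(range(k))
    hits = valid = 0
    for _ in range(draws):
        sums, counts = [0.0] * k, [0] * k
        for y, p in zip(values, probs):
            # Impute this datum's group from its membership probabilities.
            g = rng.choices(groups, weights=p)[0]
            sums[g] += y
            counts[g] += 1
        # Skip draws in which either group is empty (its mean is undefined).
        if counts[a] and counts[b]:
            valid += 1
            if sums[a] / counts[a] > sums[b] / counts[b]:
                hits += 1
    return hits / valid if valid else float("nan")


# Hypothetical data: three high values that probably belong to group 0,
# three low values that probably belong to group 1.
values = [10, 11, 12, 1, 2, 3]
probs = [[0.9, 0.1]] * 3 + [[0.1, 0.9]] * 3
p = disparity_probability(values, probs, a=0, b=1)
```

A full Bayesian treatment would also place priors on the group means and variances and sample them jointly with the labels; the sketch only shows how membership uncertainty propagates into the comparison of means.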
Funding: This project was supported by the Fund of College Doctor Degree (20020699009).
Abstract: Target distribution in cooperative combat is a difficult and important problem. We build an optimization model according to the rules of fire distribution and study it with BOA. BOA estimates the joint probability distribution of the variables with a Bayesian network, and new candidate solutions are generated from this joint distribution. The simulation example verifies that the method can solve this complex problem, runs quickly, and finds the best solution.
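The estimate-then-sample loop described above can be sketched with a much simpler relative of BOA. The code below is a univariate estimation-of-distribution algorithm (UMDA-style): it models each bit with an independent marginal probability instead of the Bayesian network that BOA learns, and it maximizes a toy OneMax fitness rather than a fire distribution model. All names and parameter values are assumptions for illustration.

```python
import random


def umda_onemax(n_bits=12, pop_size=40, generations=30, seed=0):
    """Estimate a distribution from the better half of the population,
    then sample the next population from that distribution."""
    rng = random.Random(seed)
    probs = [0.5] * n_bits  # independent marginal P(bit = 1)
    best, best_fit = None, -1
    for _ in range(generations):
        # Sample candidate solutions from the current model.
        pop = [[1 if rng.random() < p else 0 for p in probs]
               for _ in range(pop_size)]
        pop.sort(key=sum, reverse=True)  # OneMax fitness = number of ones
        if sum(pop[0]) > best_fit:
            best, best_fit = pop[0][:], sum(pop[0])
        elite = pop[: pop_size // 2]
        # Re-estimate the marginals from the elite; clamp to avoid fixation.
        probs = [min(0.95, max(0.05, sum(ind[j] for ind in elite) / len(elite)))
                 for j in range(n_bits)]
    return best, best_fit


best, best_fit = umda_onemax()
```

Replacing the independent marginals with a learned Bayesian network is what lets BOA capture interactions between variables, at the cost of a structure learning step in every generation.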
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 60433020, 60175024 and 60773095), the European Commission under grant No. TH/Asia Link/010 (111084), the Key Science-Technology Project of the National Education Ministry of China (Grant No. 02090), and the Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, Jilin University, P. R. China.
Abstract: In the post-genomic era, reconstructing gene regulatory networks from microarray gene expression data is very important for understanding the underlying biological system, and it has been a challenging task in bioinformatics. The Bayesian network model has been used to reconstruct gene regulatory networks because of its advantages, but how to determine the network structure and parameters remains an open question. This paper proposes a two-stage structure learning algorithm that integrates an immune evolution algorithm to build a Bayesian network. The new algorithm is evaluated on both simulated and yeast cell cycle data. The experimental results indicate that the proposed algorithm can find many of the known regulatory relationships reported in the literature and predict unknown ones with high validity and accuracy.
Funding: Supported by the National Natural Science Foundation of China (71101116, 71271170), the Program for New Century Excellent Talents in University (NCET-13-0475), and the Basic Research Foundation of NPU (JC20120228).
Abstract: Finding reasonable structures from large volumes of data is one of the difficulties in modeling Bayesian networks (BNs), and solving it is necessary for promoting the application of BNs. This paper proposes an immune algorithm based method (BN-IA) for learning the BN structure with the idea of vaccination. Furthermore, methods for extracting effective vaccines from locally optimal structures and root nodes are described in detail. Finally, simulation studies are carried out with the helicopter convertor BN model and the car start BN model. The comparison results show that the proposed vaccines and BN-IA can learn the BN structure effectively and efficiently.
Funding: This project was supported by the National Natural Science Foundation of China (70572045).
Abstract: A new method for evaluating the fitness of Bayesian networks against observed data is provided. The main advantage of this criterion is that it is suitable for both complete and incomplete data, whereas other criteria are not. Moreover, it greatly simplifies the computation. To reduce the search space, the notion of equivalence class proposed by David Chickering is adopted. Instead of using that method directly, the novel criterion, variable ordering, and equivalence classes are combined; moreover, the proposed method avoids some problems caused by the previous one. The genetic algorithm, which allows global convergence, a property lacking in most methods for searching Bayesian networks, is then applied to search for a good model in this space. To speed up convergence, the genetic algorithm is combined with a greedy algorithm. Finally, simulation shows the validity of the proposed approach.
Abstract: The main thrust of this paper is the application of a novel data mining approach to logs of user feedback in order to improve web multimedia information retrieval performance. A user space model was constructed based on data mining and then integrated into the original information space model to improve the accuracy of the new information space model. It can remove clutter and irrelevant text information and help to eliminate the mismatch between the page author's expression and the user's understanding and expectation. The user space model was also used to discover the relationship between high-level and low-level features for assigning weights. The authors proposed an improved Bayesian algorithm for data mining. Experiments showed that the proposed algorithm was efficient.
Funding: Supported partly by the National Science and Technology Major Project of China (Grant No. 2016ZX05025-001006) and the Major Science and Technology Project of CNPC (Grant No. ZD2019-183-007).
Abstract: Well production optimization is a complex and time-consuming task in oilfield development. A reservoir numerical simulator is usually combined with optimization algorithms to optimize well production. This approach spends most of its computing time evaluating the objective function with the reservoir numerical simulator, which limits its optimization efficiency. To improve optimization efficiency, a well production optimization method is established that uses a streamline-feature-based objective function and the Bayesian adaptive direct search (BADS) optimization algorithm. The new objective function, which represents the water flooding potential, is extracted from streamline features. It only needs to call the streamline simulator to run one time step, instead of calling the simulator to calculate the target value at the end of development, which greatly reduces the running time of the simulator. The well production optimization model is then established and solved by the BADS algorithm. The feasibility of the new objective function and the efficiency of the optimization method are verified by three examples. Results demonstrate that the new objective function is positively correlated with the cumulative oil production, and that the BADS algorithm is superior to other common algorithms in convergence speed, solution stability, and optimization accuracy. In addition, this method significantly accelerates the well production optimization process compared with objective functions calculated by conventional methods. It can provide a more effective basis for determining the optimal well production in actual oilfield development.
Abstract: The typical characteristic of the topology of Bayesian networks (BNs) is the interdependence among different nodes (variables), which makes it impossible to optimize one variable independently of the others, and learning BN structures with general genetic algorithms is liable to converge to a local extremum. To resolve this problem efficiently, a self-organizing genetic algorithm (SGA) based method for constructing BNs from databases is presented. This method uses a self-organizing mechanism to develop a genetic algorithm that extends the number of crossover operators from one to two, providing mutual competition between them and even adjusting the number of parents in recombination (crossover/recomposition) schemes. With the K2 algorithm, this method also optimizes the genetic operators and makes adequate use of domain knowledge. As a result, the method is able to find a global optimum of the BN topology, avoiding premature convergence to a local extremum. Experimental results are presented, and the convergence of the SGA is discussed.
Funding: National Natural Science Foundation of China (No. 61203184).
Abstract: A system reliability model based on a Bayesian network (BN) is built via an evolutionary strategy called the dual genetic algorithm (DGA). A BN is a probabilistic approach to analyzing relationships between stochastic events. In contrast with traditional methods, in which the BN model is built by professionals, DGA is proposed for the automatic analysis of historical data and the construction of a BN for estimating system reliability. DGA searches the whole solution space of BN structures and obtains a more accurate BN model. The efficacy of the proposed method is shown with examples from the literature.
Funding: This work was financially supported by the National Natural Science Foundation of China (Grant Nos. 11972073, 51974357, and 52274027), the China Postdoctoral Science Foundation (Grant No. 2022M713204), and the Scientific Research and Technology Development Project of China National Petroleum Corporation (Grant No. 2121DJ2301).
Abstract: Production optimization is of significance for carbonate reservoirs, directly affecting the sustainability and profitability of reservoir development. Traditional physics-based numerical simulations suffer from insufficient calculation accuracy and excessive time consumption when performing production optimization. We establish an ensemble proxy-model-assisted optimization framework combining the Bayesian random forest (BRF) with the particle swarm optimization (PSO) algorithm. The BRF method is used to construct a proxy model of the injection-production system that can accurately predict the dynamic parameters of producers based on injection data and production measures. With the help of the proxy model, PSO is applied to search for the optimal injection pattern, integrating Pareto front analysis. In experimental testing, the proxy model not only achieves higher prediction accuracy than deep learning but also requires 8 times less training time. In addition, the injection mode adjusted by the PSO algorithm can effectively reduce the gas-oil ratio and increase oil production by more than 10% for carbonate reservoirs. The proposed proxy-model-assisted optimization protocol brings new perspectives on multi-objective optimization problems in the petroleum industry and gives project decision-makers more options for balancing oil production and the gas-oil ratio under physical and operational constraints.
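The particle swarm search step can be sketched as follows. This is a minimal single-objective PSO, not the framework of the paper: the objective is a hypothetical stand-in (`sphere`) for a proxy-model prediction, the Pareto front analysis is omitted, and the inertia and acceleration coefficients are commonly used constriction-style defaults chosen here as assumptions.

```python
import random


def pso(objective, dim, bounds, particles=20, iters=100,
        w=0.7298, c1=1.49618, c2=1.49618, seed=0):
    """Minimize `objective` over a box using a basic global-best PSO."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(particles)]
    vel = [[0.0] * dim for _ in range(particles)]
    pbest = [p[:] for p in pos]                  # best position seen by each particle
    pbest_val = [objective(p) for p in pos]
    g = min(range(particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]  # best position seen by the swarm
    for _ in range(iters):
        for i in range(particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull toward personal best + pull toward global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def sphere(x):
    # Hypothetical stand-in for a proxy-model objective; minimum 0 at the origin.
    return sum(v * v for v in x)


best_x, best_val = pso(sphere, dim=2, bounds=(-5.0, 5.0))
```

Because every objective evaluation here would be a fast proxy-model prediction rather than a full numerical simulation, the swarm can afford many evaluations per iteration.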
Abstract: We statistically validate the 2011-2022 earthquake prediction records of Ada, the sixth finalist of the 2nd China AETA in 2021, who made 147 earthquake predictions (including 60% of magnitude 5.5 earthquakes) with a prediction accuracy higher than 70% and a confidence level of 95% over a 12-year period. Since the reliable earthquake precursor signals described by Ada match the characteristics of Alfvén waves quite well, this paper proposes a hypothesis on how earthquakes are triggered, based on the Alfvén (Q G) torsional wave model of Gillette et al. When the plume of an upper mantle column intrudes into the magma and lithosphere of the soft flow layer, as hot and cold molten material masses deep inside the Earth are exchanged during ascent and descent, body and surface plasma sheets may form under certain conditions and give rise to nonlinear isolated Alfvén waves. Alfvén waves often perturb the geomagnetic field and release large amounts of heat and kinetic energy, thus triggering earthquakes. To explain how Ada senses Alfvén waves and locates epicenters, we venture to speculate that special magnetosensory cells in a few human bodies can sense earthquake precursors. We also attempt to hypothesize an algorithm that analyzes how the human nervous system encodes and decodes earthquake precursors and explains how human magnetosensory cells can solve complex problems such as predicting earthquake magnitude and locating epicenters.