This study explores the application of Bayesian analysis based on neural networks and deep learning to data visualization. The research is motivated by the growing volume and complexity of data, which traditional data analysis methods can no longer handle adequately. The research methods include building neural network and deep learning models, optimizing and improving them through Bayesian analysis, and applying them to the visualization of large-scale data sets. The results show that combining neural networks with Bayesian analysis and deep learning can effectively improve the accuracy and efficiency of data visualization and enhance the intuitiveness and depth of data interpretation. The research provides a new solution for data visualization in big data environments and helps to further promote the development and application of data science.
Bayesian networks are a powerful class of graphical decision models used to represent causal relationships among variables. However, the reliability and integrity of learned Bayesian network models depend heavily on the quality of incoming data streams. One of the primary challenges with Bayesian networks is their vulnerability to adversarial data poisoning attacks, in which malicious data is injected into the training dataset to negatively influence the Bayesian network models and impair their performance. In this paper, we propose an efficient framework for detecting data poisoning attacks against Bayesian network structure learning algorithms. Our framework uses latent variables to quantify the amount of belief between every two nodes in each causal model over time. Considering four different forms of data poisoning attacks, we specifically aim to strengthen the security and dependability of Bayesian network structure learning techniques such as the PC algorithm. In doing so, we explore the complexity of this area and offer workable methods for identifying and mitigating these stealthy threats. Additionally, our research investigates one particular use case, the "Visit to Asia" network, and explores the practical consequences of using uncertainty as a way to spot cases of data poisoning. Our results demonstrate the promising efficacy of latent variables in detecting and mitigating the threat of data poisoning attacks. The proposed latent-variable-based framework also proves to be sensitive in detecting malicious data poisoning attacks in the context of stream data.
When the training data are insufficient, especially when only a small sample is available, domain knowledge can be incorporated into the parameter learning process to improve the performance of Bayesian networks. In this paper, a new monotonic constraint model is proposed to represent a common type of domain knowledge. A monotonic constraint estimation algorithm is then proposed to learn the parameters under the monotonic constraint model. To demonstrate the superiority of the proposed algorithm, a series of experiments is carried out. The experimental results show that the proposed algorithm obtains more accurate parameters than some existing algorithms, while its complexity is not the highest.
It is impractical to learn the optimal structure of a large Bayesian network (BN) by exhaustively enumerating the feasible structures, since the number of feasible structures is super-exponential in the number of nodes. This paper proposes an approach to layer the nodes of a BN by using conditional independence testing. The parents of the nodes in a layer belong only to that layer or to layers that have priority over it. Once a set of nodes has been layered, the number of feasible structures over the nodes can be remarkably reduced, which makes it possible to learn optimal BN structures over larger numbers of nodes with exact algorithms. Integrating the dynamic programming (DP) algorithm with the layering approach, we propose a hybrid algorithm, layered optimal learning (LOL), to learn BN structures. Thanks to the layering approach, the complexity of the DP algorithm is reduced from O(n·2^(n-1)) to O(ρ·2^(n-1)), where ρ < n. Meanwhile, the memory required for storing intermediate results is reduced from O(C_n^{n/2}) to O(C_{k#}^{k#/2}), where k# < n. A case study on learning a standard BN with 50 nodes is conducted. The results demonstrate the superiority of the LOL algorithm, with respect to the Bayesian information criterion (BIC) score, over the hill-climbing, max-min hill-climbing, PC, and three-phase dependency analysis algorithms.
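To make the "super-exponential" growth mentioned in the abstract above concrete, the following sketch counts labeled directed acyclic graphs using Robinson's well-known recurrence. It is an illustration only and is not taken from the paper; the function name `count_dags` is hypothetical.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def count_dags(n: int) -> int:
    """Number of labeled DAGs on n nodes (Robinson's recurrence).

    a(0) = 1
    a(n) = sum_{k=1..n} (-1)^(k+1) * C(n, k) * 2^(k*(n-k)) * a(n-k)
    """
    if n == 0:
        return 1
    return sum(
        (-1) ** (k + 1) * comb(n, k) * 2 ** (k * (n - k)) * count_dags(n - k)
        for k in range(1, n + 1)
    )


if __name__ == "__main__":
    # Print the count for a few small node sets to show how fast it grows.
    for n in range(1, 8):
        print(n, count_dags(n))
```

Even for a handful of nodes the count grows far faster than 2^n, which is why exhaustive enumeration is ruled out and search-space reductions such as layering matter.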
Ordering-based search methods have advantages over graph-based search methods for structure learning of Bayesian networks in terms of efficiency. To further increase the accuracy of ordering-based search methods, we first propose to enlarge the search space, which facilitates escaping from local optima. We present our search operators with majorizations, which are easy to implement. Experiments show that the proposed algorithm obtains significantly more accurate results. To address the loss of efficiency caused by the larger search space, we then propose to add path priors as constraints to the swap process. We analyze the coefficient that may influence the performance of the proposed algorithm; the experiments show that the constraints greatly enhance efficiency while having little effect on accuracy. The final experiments show that, compared with other competitive methods, the proposed algorithm finds better solutions while maintaining high efficiency on both synthetic and real data sets.
Finding reasonable structures from large volumes of data is one of the difficulties in modeling Bayesian networks (BNs), and solving it is necessary for promoting the application of BNs. This paper proposes an immune-algorithm-based method (BN-IA) for learning the BN structure using the idea of vaccination. Furthermore, methods for extracting effective vaccines from locally optimal structures and root nodes are described in detail. Finally, simulation studies are carried out with the helicopter convertor BN model and the car start BN model. The comparison results show that the proposed vaccines and BN-IA can learn the BN structure effectively and efficiently.
In the post-genomic era, reconstructing gene regulatory networks from microarray gene expression data is very important for understanding the underlying biological system, and it has been a challenging task in bioinformatics. The Bayesian network model has been used to reconstruct gene regulatory networks because of its advantages, but how to determine the network structure and parameters still needs to be explored. This paper proposes a two-stage structure learning algorithm that integrates an immune evolution algorithm to build a Bayesian network. The new algorithm is evaluated on both simulated data and yeast cell cycle data. The experimental results indicate that the proposed algorithm can find many of the real regulatory relationships known from the literature and predict previously unknown ones with high validity and accuracy.
Bayesian networks (BNs) have become increasingly popular in recent years due to their wide-ranging applications in modeling uncertain knowledge. An essential problem for discrete BNs is learning conditional probability table (CPT) parameters. If training data are sparse, purely data-driven methods often fail to learn accurate parameters. Expert judgments can then be introduced to overcome this challenge. Parameter constraints deduced from expert judgments make parameter estimates consistent with domain knowledge. In addition, Dirichlet priors contain information that helps improve learning accuracy. This paper proposes a constrained Bayesian estimation approach to learn CPTs by incorporating constraints and Dirichlet priors. First, a posterior distribution of BN parameters is developed over a restricted parameter space based on training data and Dirichlet priors. Then, the expectation of the posterior distribution is taken as the parameter estimate. As it is difficult to directly compute the expectation of a continuous distribution over an irregular feasible domain, we apply the Monte Carlo method to approximate it. In experiments on learning standard BNs, the proposed method outperforms competing methods, which suggests that it can facilitate solving real-world problems. Additionally, a case study on the Wine data demonstrates that the proposed method achieves the highest classification accuracy.
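The estimation idea in the abstract above (posterior expectation over a constrained parameter space, approximated by Monte Carlo) can be sketched for a single CPT column as follows. This is a minimal sketch under stated assumptions: the abstract only says "the Monte Carlo method", so the use of plain rejection sampling here is an assumption, and the names `sample_dirichlet` and `constrained_posterior_mean` are hypothetical rather than the authors' implementation.

```python
import random
from typing import Callable, List, Sequence


def sample_dirichlet(alphas: Sequence[float], rng: random.Random) -> List[float]:
    """Draw one sample from Dirichlet(alphas) by normalizing Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]


def constrained_posterior_mean(
    counts: Sequence[int],
    prior: Sequence[float],
    constraint: Callable[[List[float]], bool],
    n_samples: int = 20000,
    seed: int = 0,
) -> List[float]:
    """Approximate E[theta | data, constraint] by rejection sampling.

    The unconstrained posterior is Dirichlet(prior + counts); samples that
    violate the expert constraint are discarded (assumed scheme).
    """
    rng = random.Random(seed)
    alphas = [c + p for c, p in zip(counts, prior)]
    sums = [0.0] * len(alphas)
    accepted = 0
    for _ in range(n_samples):
        theta = sample_dirichlet(alphas, rng)
        if constraint(theta):
            accepted += 1
            for i, value in enumerate(theta):
                sums[i] += value
    if accepted == 0:
        raise ValueError("no sample satisfied the constraint")
    return [s / accepted for s in sums]


if __name__ == "__main__":
    # Sparse data for a three-state variable, a uniform Dirichlet prior, and
    # an illustrative expert judgment "state 0 is at least as likely as state 1".
    estimate = constrained_posterior_mean(
        counts=[1, 2, 0],
        prior=[1.0, 1.0, 1.0],
        constraint=lambda theta: theta[0] >= theta[1],
    )
    print(estimate)
```

Rejection sampling becomes inefficient when the feasible region is small relative to the posterior mass, so a practical implementation might prefer a different sampler; the sketch only shows the shape of the computation.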
A new method is provided for evaluating the fitness of Bayesian networks with respect to the observed data. The main advantage of this criterion is that it is suitable for both the complete and incomplete cases, whereas other criteria are not. Moreover, it greatly simplifies the computation. To reduce the search space, the notion of equivalence class proposed by David Chickering is adopted. Instead of using that method directly, the novel criterion, variable ordering, and equivalence classes are combined; moreover, the proposed method avoids some problems caused by the previous one. Next, a genetic algorithm, which allows global convergence (a property lacking in most methods for searching Bayesian networks), is applied to search for a good model in this space. To speed up convergence, the genetic algorithm is combined with a greedy algorithm. Finally, simulations show the validity of the proposed approach.
Structure learning of Bayesian networks is a well-researched but computationally hard task. For learning Bayesian networks, this paper proposes an improved algorithm based on unconstrained optimization and ant colony optimization (U-ACO-B) to overcome the drawbacks of the ant colony optimization algorithm (ACO-B). In this algorithm, an unconstrained optimization problem is first solved to obtain an undirected skeleton, and the ACO algorithm is then used to orient the edges, thus returning the final structure. In the experimental part of the paper, we compare the performance of the proposed algorithm with that of the ACO-B algorithm. The experimental results show that our method is effective and converges considerably faster than the ACO-B algorithm.
Learning Bayesian networks is an NP-hard problem. When the number of variables is large, the search for an optimal network structure can be very time-consuming and tends to return a structure that is only locally optimal. Particle swarm optimization (PSO) is introduced to the problem of learning Bayesian networks, and a novel structure learning algorithm using PSO is proposed. To search the space of directed acyclic graphs efficiently, a discrete PSO algorithm designed specifically for structure learning is proposed based on the characteristics of Bayesian networks. The experimental results show that the PSO-based algorithm converges quickly and obtains better structures than algorithms based on genetic algorithms.
In view of the shortcomings of traditional Bayesian network (BN) structure learning algorithms, such as low efficiency, premature convergence, and poor learning results, the intelligent algorithms cuckoo search (CS) and particle swarm optimization (PSO) are selected. Combined with the characteristics of BN structure, a CS-PSO algorithm for BN structure learning is proposed. First, the CS algorithm is improved in three respects: the maximum spanning tree is used to guide the initialization of the CS algorithm, the fitness of a solution is used to adjust the optimization and abandonment of solutions, and the PSO algorithm is used to update positions in the CS algorithm. Second, according to the structural characteristics of BNs, the CS-PSO algorithm is applied to BN structure learning. Finally, the classic chest clinic, credit, and car diagnosis networks are used as simulation models, and the greedy algorithm, K2 algorithm, CS algorithm, and CS-PSO algorithm are compared through modeling and simulation. The results show that the CS-PSO algorithm has fast convergence, high convergence accuracy, and good stability in BN structure learning, and that it obtains an accurate BN structure model faster and more reliably.
Learning Bayesian network (BN) structure from data is an NP-hard problem and still one of the most exciting challenges in machine learning. In this work, a novel algorithm is presented that combines ideas from local learning, constraint-based, and search-and-score techniques in a principled and effective way. It first reconstructs the junction tree of a BN and then performs a K2-scoring greedy search to orient the local edges in the cliques of the junction tree. Theoretical and experimental results show that the proposed algorithm is capable of handling networks with a large number of variables. A comparison with the well-known K2 algorithm is also presented.
Improving the efficiency of exact learning of Bayesian network structure is a challenging issue. In this paper, four different causal constraint algorithms are added to the score calculations to prune possible parent sets, improving the efficiency of state-of-the-art learning algorithms. Experimental results indicate that exact learning algorithms can significantly improve their efficiency with only a slight loss of accuracy. Under causal constraints, these exact learning algorithms can prune about 70% of possible parent sets and reduce running time by about 60% while losing no more than 2% accuracy on average. Additionally, with sufficient samples, exact learning algorithms with causal constraints can also obtain the optimal network. In general, among the four causal constraint algorithms, adding max-min parents and children constraints gives the best results in terms of efficiency and accuracy.
Frequent counting is a very often required operation in machine learning algorithms. A typical machine learning task, learning the structure of a Bayesian network (BN) based on metric scoring, is introduced as an example that relies heavily on frequent counting. A fast calculation method for frequent counting, enhanced with two cache layers, is then presented for learning BNs. The main contribution of our approach is to eliminate comparison operations in frequent counting by introducing a multi-radix number system calculation. Both a mathematical analysis and an empirical comparison between our method and the state-of-the-art solution are conducted. The results show that our method is clearly superior to the state-of-the-art solution for the problem of learning BNs.
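One way to read the multi-radix idea in the abstract above is that each joint configuration of discrete variables is mapped to a single integer index, so a count table can be updated by arithmetic alone, without comparing records against stored configurations. The sketch below illustrates that reading only; it omits the two cache layers, and the name `mixed_radix_counts` and the data layout are hypothetical rather than the paper's implementation.

```python
from typing import List, Sequence


def mixed_radix_counts(
    rows: Sequence[Sequence[int]],
    columns: Sequence[int],
    cardinalities: Sequence[int],
) -> List[int]:
    """Count joint configurations of the selected columns.

    Each configuration is encoded as a mixed-radix number whose digit for
    column c has radix cardinalities[c]; the encoded value indexes a flat
    count array, so no configuration comparisons are needed.
    """
    strides = []
    size = 1
    for c in columns:
        strides.append(size)          # positional weight of this digit
        size *= cardinalities[c]      # total number of joint configurations
    counts = [0] * size
    for row in rows:
        index = sum(row[c] * s for c, s in zip(columns, strides))
        counts[index] += 1
    return counts


if __name__ == "__main__":
    # Three discrete variables with 2, 2, and 3 states; count (var0, var2).
    data = [(0, 1, 2), (0, 1, 0), (1, 1, 2), (0, 1, 2)]
    print(mixed_radix_counts(data, columns=[0, 2], cardinalities=[2, 2, 3]))
```

The flat array has one cell per joint configuration, so this layout suits small variable subsets such as a node together with a candidate parent set.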
At present, Bayesian networks (BNs) are widely used for representing uncertain knowledge in many disciplines, including biology, computer science, risk analysis, service quality analysis, and business. However, as the numbers of nodes and edges increase, structure learning becomes more difficult and algorithms become inefficient. To solve this problem, heuristic optimization algorithms are used, which tend to find a near-optimal answer rather than an exact one; particle swarm optimization (PSO) is one of them. PSO is a swarm-intelligence-based algorithm inspired by how flocks of birds search for food. PSO is widely employed because it is easy to code, converges quickly, and can be parallelized easily. We use a recently proposed version of PSO called generalized particle swarm optimization (GEPSO) to learn Bayesian network structure. We construct an initial directed acyclic graph (DAG) by using the max-min parents and children (MMPC) algorithm and cross relative average entropy. This DAG is used to create a population for the GEPSO optimization procedure. Moreover, we propose a velocity update procedure to increase the efficiency of the algorithmic search process. Experimental results show that, as the complexity of the dataset increases, our algorithm, Bayesian network generalized particle swarm optimization (BN-GEPSO), outperforms the PSO algorithm in terms of the Bayesian information criterion (BIC) score.
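Several abstracts in this list compare structures by BIC score. As a hedged illustration of what such a score computes for one node of a discrete network, the sketch below uses the standard decomposable form (maximized log-likelihood minus a penalty of half the log sample size times the number of free parameters). The function name `bic_local_score` and the row-tuple data layout are assumptions for illustration, not code from any of the papers.

```python
import math
from collections import Counter
from typing import Sequence


def bic_local_score(
    rows: Sequence[Sequence[int]],
    child: int,
    parents: Sequence[int],
    cardinalities: Sequence[int],
) -> float:
    """BIC contribution of one node given its parent set (discrete data)."""
    n = len(rows)
    # N_jk: occurrences of (parent configuration j, child state k).
    joint = Counter((tuple(row[p] for p in parents), row[child]) for row in rows)
    # N_j: occurrences of parent configuration j.
    parent_counts = Counter(tuple(row[p] for p in parents) for row in rows)
    log_likelihood = sum(
        count * math.log(count / parent_counts[config])
        for (config, _state), count in joint.items()
    )
    # Free parameters: (r - 1) per parent configuration, q configurations.
    q = math.prod(cardinalities[p] for p in parents)
    dimension = (cardinalities[child] - 1) * q
    return log_likelihood - 0.5 * math.log(n) * dimension


if __name__ == "__main__":
    data = [(0, 0), (0, 0), (1, 1), (1, 1), (0, 1), (1, 0)]
    # Compare "variable 1 has no parents" with "variable 0 is its parent".
    print(bic_local_score(data, child=1, parents=[], cardinalities=[2, 2]))
    print(bic_local_score(data, child=1, parents=[0], cardinalities=[2, 2]))
```

Because the score decomposes over nodes, a search procedure such as PSO or hill climbing only needs to rescore the nodes whose parent sets change between candidate structures.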
Creating new molecules with desired properties is a fundamental and challenging problem in chemistry. Reinforcement learning (RL) has shown its utility in this area, where the target chemical property values can serve as a reward signal. At each step of making a new molecule, the RL agent learns to select an action from a list of many chemically valid actions for a given molecule, which implies great uncertainty in its learning. In traditional implementations of deep RL algorithms, deterministic neural networks are typically employed, so the agent chooses one action from a single sampled action at each step. In this paper, we propose a new strategy of applying Bayesian neural networks to RL to reduce uncertainty, so that the agent can choose one action from a pool of sampled actions at each step, and we investigate its benefits in molecule design. Our experiments suggest that the Bayesian approach can create molecules of desirable chemical quality while maintaining their diversity, a goal that is very difficult to achieve in machine learning of molecules. We further exploit this diversity by using the molecules to train a generative model that yields more novel drug-like molecules absent from the training set, since novelty is essential for drug candidate molecules. In conclusion, the Bayesian approach can offer a balance between exploitation and exploration in RL, and a balance between optimization and diversity in molecule design.
In many applications, flow measurements are sparse and possibly noisy. Reconstructing a high-resolution flow field from limited and imperfect flow information is important yet challenging. In this work, we propose an innovative physics-constrained Bayesian deep learning approach to reconstruct flow fields from sparse, noisy velocity data, in which equation-based constraints are imposed through the likelihood function and the uncertainty of the reconstructed flow can be estimated. Specifically, a Bayesian deep neural network is trained on sparse measurement data to capture the flow field. At the same time, violations of physical laws are penalized at a large number of spatiotemporal points where measurements are not available. A non-parametric variational inference approach is applied to enable efficient physics-constrained Bayesian learning. Several test cases on idealized vascular flows with synthetic measurement data are studied to demonstrate the merit of the proposed method.
Assessing seismic soil liquefaction is a complex, non-linear procedure affected by diverse sources of uncertainty and complexity. The Bayesian belief network (BBN) is an effective tool that offers a suitable framework for handling such uncertainties and cause-effect relationships. This study uses a hybrid approach to develop a BBN model based on cone penetration test (CPT) case history records to evaluate seismic soil liquefaction potential. In this hybrid approach, a naive model is first developed solely by an interpretive structural modeling (ISM) technique using domain knowledge (DK). Subsequently, useful information about the naive model is embedded as DK in the K2 algorithm to develop a BBN-K2 and DK model. The results of the BBN models are compared and validated against the available artificial neural network (ANN) and C4.5 decision tree (DT) models, and the BBN model developed by the hybrid approach shows compatible and promising results for liquefaction potential assessment. The BBN model developed by the hybrid approach provides a viable tool for geotechnical engineers to assess site conditions susceptible to seismic soil liquefaction. This study also presents a sensitivity analysis of the hybrid-approach BBN model and the most probable explanation of liquefied sites, in order to identify the most likely scenario of the liquefaction phenomenon.
Funding: Supported by the National Natural Science Foundation of China (60496322), the Natural Science Foundation of Beijing (4083034), and the Scientific Research Common Program of Beijing Municipal Commission of Education (KM200610005020).
Funding: Supported by the National Natural Science Foundation of China (61305133, 61573285) and the Fundamental Research Funds for the Central Universities (3102016CG002).
Funding: Supported by the National Natural Science Foundation of China (61573285).
Funding: Supported by the National Natural Science Foundation of China (61573285) and the Doctoral Foundation of China (2013ZC53037).
Funding: Supported by the National Natural Science Foundation of China (71101116, 71271170), the Program for New Century Excellent Talents in University (NCET-13-0475), and the Basic Research Foundation of NPU (JC20120228).
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 60433020, 60175024 and 60773095), the European Commission under grant No. TH/Asia Link/010 (111084), the Key Science-Technology Project of the National Education Ministry of China (Grant No. 02090), and the Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, Jilin University, P. R. China.
Abstract: In the post-genomic biology era, reconstructing gene regulatory networks from microarray gene expression data is very important for understanding the underlying biological system, and it has been a challenging task in bioinformatics. The Bayesian network model has been used to reconstruct gene regulatory networks because of its advantages, but how to determine the network structure and parameters still needs to be explored. This paper proposes a two-stage structure learning algorithm that integrates an immune evolution algorithm to build a Bayesian network. The new algorithm is evaluated on both simulated and yeast cell cycle data. The experimental results indicate that the proposed algorithm can find many of the known regulatory relationships reported in the literature and predict other, unknown ones with high validity and accuracy.
Funding: Supported by the National Natural Science Foundation of China (61573285) and the Innovation Foundation for Doctor Dissertation of Northwestern Polytechnical University, China (CX201619).
Abstract: Bayesian networks (BNs) have become increasingly popular in recent years due to their wide-ranging applications in modeling uncertain knowledge. An essential problem for discrete BNs is learning conditional probability table (CPT) parameters. If training data are sparse, purely data-driven methods often fail to learn accurate parameters. Expert judgments can then be introduced to overcome this challenge. Parameter constraints deduced from expert judgments make parameter estimates consistent with domain knowledge. In addition, Dirichlet priors contain information that helps improve learning accuracy. This paper proposes a constrained Bayesian estimation approach to learn CPTs by incorporating constraints and Dirichlet priors. First, a posterior distribution of BN parameters is developed over a restricted parameter space based on training data and Dirichlet priors. Then, the expectation of the posterior distribution is taken as the parameter estimate. As it is difficult to directly compute the expectation of a continuous distribution with an irregular feasible domain, we apply the Monte Carlo method to approximate it. In experiments on learning standard BNs, the proposed method outperforms competing methods, which suggests that it can facilitate solving real-world problems. Additionally, a case study on Wine data demonstrates that the proposed method achieves the highest classification accuracy.
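The estimation idea in the abstract above (Dirichlet posterior restricted by expert constraints, with the expectation approximated by Monte Carlo) can be illustrated with a minimal sketch. It is not the authors' implementation: the rejection-sampling scheme, the function names `sample_dirichlet` and `constrained_posterior_mean`, and the example constraint are all assumptions chosen for illustration, and the sketch handles a single CPT column only.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw one sample from Dirichlet(alpha) by normalizing Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def constrained_posterior_mean(counts, prior, constraint, n_samples=20000, seed=0):
    """Approximate E[theta | data, constraint] for one CPT column.

    counts:     observed counts per child state (hypothetical input format)
    prior:      Dirichlet hyperparameters, one per child state
    constraint: function theta -> bool encoding an expert judgment
    Samples from the unconstrained Dirichlet posterior are kept only if they
    satisfy the constraint (rejection sampling), then averaged.
    """
    rng = random.Random(seed)
    posterior = [c + a for c, a in zip(counts, prior)]
    sums = [0.0] * len(counts)
    accepted = 0
    for _ in range(n_samples):
        theta = sample_dirichlet(posterior, rng)
        if constraint(theta):
            accepted += 1
            for i, t in enumerate(theta):
                sums[i] += t
    if accepted == 0:
        raise ValueError("no sample satisfied the constraint")
    return [s / accepted for s in sums]


# Sparse data suggest state 1 is more likely than state 0, but the
# (assumed) expert judgment says P(state 0) >= P(state 1).
counts = [1, 2, 0]
prior = [1.0, 1.0, 1.0]
estimate = constrained_posterior_mean(counts, prior, lambda th: th[0] >= th[1])
```

Rejection sampling is the simplest way to respect an irregular feasible domain, but it becomes wasteful when the constraint region has low posterior mass; the paper may use a different sampler.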
Funding: This project was supported by the National Natural Science Foundation of China (70572045).
Abstract: A new method for evaluating the fitness of Bayesian networks against observed data is provided. The main advantage of this criterion is that it is suitable for both complete and incomplete data, whereas other criteria are not. Moreover, it greatly simplifies the computation. To reduce the search space, the notion of equivalence class proposed by David Chickering is adopted. Instead of using that method directly, the novel criterion, variable ordering, and equivalence classes are combined; moreover, the proposed method avoids some problems caused by the previous one. Then a genetic algorithm, which allows global convergence (a property lacking in most methods that search for Bayesian networks), is applied to search for a good model in this space. To speed up convergence, the genetic algorithm is combined with a greedy algorithm. Finally, simulation shows the validity of the proposed approach.
Funding: Supported by the National Natural Science Foundation of China (60974082, 11171094), the Fundamental Research Funds for the Central Universities (K50510700004), the Foundation and Advanced Technology Research Program of Henan Province (102300410264), and the Basic Research Program of the Education Department of Henan Province (2010A110010).
Abstract: Structure learning of Bayesian networks is a well-researched but computationally hard task. For learning Bayesian networks, this paper proposes an improved algorithm based on unconstrained optimization and ant colony optimization (U-ACO-B) to overcome the drawbacks of the ant colony optimization algorithm (ACO-B). In this algorithm, an unconstrained optimization problem is first solved to obtain an undirected skeleton, and then the ACO algorithm is used to orient the edges, returning the final structure. In the experimental part of the paper, we compare the performance of the proposed algorithm with the ACO-B algorithm. The experimental results show that our method is effective and converges much faster than the ACO-B algorithm.
Funding: National Natural Science Foundation of China (No. 60374071); Zhenjiang Commission of Science and Technology (No. 2003C11009).
Abstract: Learning a Bayesian network is an NP-hard problem. When the number of variables is large, searching for the optimal network structure can be very time-consuming and tends to return a structure that is only locally optimal. Particle swarm optimization (PSO) is introduced to the problem of learning Bayesian networks, and a novel structure learning algorithm using PSO is proposed. To search the space of directed acyclic graphs efficiently, a discrete PSO algorithm designed specifically for structure learning is proposed based on the characteristics of Bayesian networks. Experimental results show that the PSO-based algorithm converges quickly and obtains better structures than genetic-algorithm-based algorithms.
Funding: National Natural Science Foundation of China (Nos. 61164010, 61233003).
Abstract: In view of the shortcomings of traditional Bayesian network (BN) structure learning algorithms, such as low efficiency, premature convergence and poor learning performance, the intelligent algorithms of cuckoo search (CS) and particle swarm optimization (PSO) are selected. Combined with the characteristics of the BN structure, a CS-PSO algorithm for BN structure learning is proposed. Firstly, the CS algorithm is improved in three respects: the maximum spanning tree is used to guide the initialization direction of the CS algorithm, the fitness of the solution is used to adjust the optimization and abandonment of solutions, and the PSO algorithm is used to update positions in the CS algorithm. Secondly, according to the structural characteristics of BNs, the CS-PSO algorithm is applied to BN structure learning. Finally, the classic chest clinic, credit and car diagnosis networks are used as simulation models, and the greedy algorithm, K2 algorithm, CS algorithm and CS-PSO algorithm are compared through modeling and simulation. The results show that the CS-PSO algorithm has fast convergence speed, high convergence accuracy and good stability in BN structure learning, and that it obtains an accurate BN structure model faster and better.
Funding: Supported by the National Natural Science Foundation of China (60974082, 61075055) and the Fundamental Research Funds for the Central Universities (K50510700004).
Abstract: Learning Bayesian network (BN) structure from data is an NP-hard problem and still one of the most exciting challenges in machine learning. In this work, a novel algorithm is presented that combines ideas from local learning, constraint-based, and search-and-score techniques in a principled and effective way. It first reconstructs the junction tree of a BN and then performs a K2-scoring greedy search to orient the local edges in the cliques of the junction tree. Theoretical and experimental results show that the proposed algorithm is capable of handling networks with a large number of variables. Its comparison with the well-known K2 algorithm is also presented.
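As background for the K2-scoring greedy search mentioned in the abstract above, here is a minimal sketch of the standard K2 (Cooper-Herskovits) score and the usual greedy parent selection over a fixed variable ordering. It does not reproduce the junction-tree step of the paper; the function names, the row-of-integers data layout, and the `max_parents` default are assumptions made for illustration.

```python
from collections import Counter, defaultdict
from math import lgamma


def log_k2_score(data, child, parents, arity):
    """Log K2 score of `child` given `parents` on complete discrete data.

    data:  list of rows, each a tuple of integer states (assumed layout)
    arity: number of states of each variable, indexed by column
    """
    counts = defaultdict(Counter)
    for row in data:
        config = tuple(row[p] for p in parents)
        counts[config][row[child]] += 1
    r = arity[child]
    score = 0.0
    # Unobserved parent configurations contribute log(1) = 0, so only
    # observed configurations need to be visited.
    for state_counts in counts.values():
        n_ij = sum(state_counts.values())
        score += lgamma(r) - lgamma(n_ij + r)  # log((r-1)! / (N_ij + r - 1)!)
        score += sum(lgamma(n + 1) for n in state_counts.values())  # log(N_ijk!)
    return score


def k2_parents(data, child, predecessors, arity, max_parents=2):
    """Greedily add the predecessor that most improves the K2 score."""
    parents = []
    best = log_k2_score(data, child, parents, arity)
    while len(parents) < max_parents:
        scored = [
            (log_k2_score(data, child, parents + [p], arity), p)
            for p in predecessors
            if p not in parents
        ]
        if not scored:
            break
        top_score, top = max(scored)
        if top_score <= best:
            break  # no candidate improves the score
        parents.append(top)
        best = top_score
    return parents


# Toy data: column 1 copies column 0, column 2 is independent of both.
data = [(0, 0, 0), (0, 0, 1), (1, 1, 0), (1, 1, 1)] * 5
arity = [2, 2, 2]
parents_of_1 = k2_parents(data, 1, [0], arity)
parents_of_2 = k2_parents(data, 2, [0, 1], arity)
```

Because candidates are restricted to predecessors in the ordering, any structure assembled this way is acyclic by construction.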
Funding: Supported by the National Natural Science Foundation of China (61573285).
Abstract: Improving the efficiency of exact learning of the Bayesian network structure is a challenging issue. In this paper, four different causal constraint algorithms are added to the score calculations to prune possible parent sets, improving the efficiency of state-of-the-art learning algorithms. Experimental results indicate that exact learning algorithms can significantly improve efficiency with only a slight loss of accuracy. Under causal constraints, these exact learning algorithms prune about 70% of possible parent sets and reduce running time by about 60%, while losing no more than 2% accuracy on average. Additionally, with sufficient samples, exact learning algorithms with causal constraints can still obtain the optimal network. In general, among the four causal constraint algorithms, adding max-min parents and children constraints gives the best results in terms of efficiency and accuracy.
Funding: Supported by the National Natural Science Foundation of China (No. 60970055).
Abstract: Frequent counting is a very commonly required operation in machine learning algorithms. A typical machine learning task, learning the structure of a Bayesian network (BN) based on metric scoring, is introduced as an example that relies heavily on frequent counting. A fast calculation method for frequent counting, enhanced with two cache layers, is then presented for learning BNs. The main contribution of our approach is to eliminate comparison operations in frequent counting by introducing a multi-radix number system calculation. Both a mathematical analysis and an empirical comparison between our method and the state-of-the-art solution are conducted. The results show that our method clearly outperforms the state-of-the-art solution on the problem of learning BNs.
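The multi-radix idea in the abstract above can be sketched as follows: each joint configuration of discrete variables is mapped to a single integer by treating the variable states as digits in a mixed-radix number, so counting becomes an array increment with no key comparisons. This is an illustrative sketch only; the function name `mixed_radix_counts` and the data layout are assumptions, and the paper's two cache layers are not modeled.

```python
def mixed_radix_counts(data, variables, arity):
    """Count joint configurations of `variables` in a flat array.

    data:      list of rows, each a tuple of integer states (assumed layout)
    variables: column indices to count over, most significant digit first
    arity:     number of states of each column (the radix of each digit)
    """
    size = 1
    for v in variables:
        size *= arity[v]
    table = [0] * size
    for row in data:
        index = 0
        for v in variables:
            # Horner-style mixed-radix encoding of the configuration.
            index = index * arity[v] + row[v]
        table[index] += 1
    return table


# Two variables with 2 and 3 states give 6 possible configurations.
data = [(0, 1), (1, 2), (0, 1), (1, 0)]
arity = [2, 3]
table = mixed_radix_counts(data, [0, 1], arity)
```

In this layout, the count for configuration `(a, b)` is stored at `table[a * 3 + b]`, so the counts needed by a scoring metric can be read off by index arithmetic.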
Funding: The authors extend their appreciation to the Deanship of Scientific Research at King Khalid University for funding this work through the Large Groups Project under grant number RGP.2/132/43.
Abstract: At present, Bayesian networks (BNs) are widely used for representing uncertain knowledge in many disciplines, including biology, computer science, risk analysis, service quality analysis, and business. However, they suffer from the problem that as the numbers of nodes and edges increase, structure learning becomes more difficult and algorithms become inefficient. To solve this problem, heuristic optimization algorithms are used, which tend to find a near-optimal answer rather than an exact one; particle swarm optimization (PSO) is one of them. PSO is a swarm-intelligence-based algorithm inspired by flocks of birds and how they search for food. PSO is widely employed because it is easy to code, converges quickly, and can be parallelized easily. We use a recently proposed version of PSO called generalized particle swarm optimization (GEPSO) to learn the Bayesian network structure. We construct an initial directed acyclic graph (DAG) using the max-min parents and children (MMPC) algorithm and cross relative average entropy. This DAG is used to create a population for the GEPSO optimization procedure. Moreover, we propose a velocity update procedure to increase the efficiency of the algorithmic search process. Experimental results show that as the complexity of the dataset increases, our algorithm, Bayesian network generalized particle swarm optimization (BN-GEPSO), outperforms the PSO algorithm in terms of the Bayesian information criterion (BIC) score.
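The BIC score used as the comparison metric in the abstract above can be sketched for complete discrete data as the maximized log-likelihood minus a penalty of (log N)/2 per free parameter. The sketch below shows only this standard score, not the GEPSO search or the MMPC initialization; the function name `bic_score` and the `parents_of` dictionary format are assumptions for illustration.

```python
from collections import Counter
from math import log


def bic_score(data, parents_of, arity):
    """BIC score of a DAG on complete discrete data (higher is better).

    data:       list of rows, each a tuple of integer states (assumed layout)
    parents_of: dict mapping each node index to a list of parent indices
    arity:      number of states of each column
    """
    n = len(data)
    total = 0.0
    for child, parents in parents_of.items():
        joint = Counter((tuple(row[p] for p in parents), row[child]) for row in data)
        marginal = Counter(tuple(row[p] for p in parents) for row in data)
        # Maximized log-likelihood: sum of N_ijk * log(N_ijk / N_ij).
        loglik = sum(c * log(c / marginal[cfg]) for (cfg, _), c in joint.items())
        q = 1
        for p in parents:
            q *= arity[p]  # number of parent configurations
        n_params = q * (arity[child] - 1)
        total += loglik - 0.5 * log(n) * n_params
    return total


# Toy data: column 1 copies column 0.
data = [(0, 0), (1, 1)] * 10
arity = [2, 2]
with_edge = bic_score(data, {0: [], 1: [0]}, arity)
without_edge = bic_score(data, {0: [], 1: []}, arity)
```

On this toy data the structure with the edge 0 → 1 scores higher, since the gain in likelihood outweighs the penalty for the extra parameter.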
Abstract: Creating new molecules with desired properties is a fundamental and challenging problem in chemistry. Reinforcement learning (RL) has shown its utility in this area, where the target chemical property values can serve as a reward signal. At each step of making a new molecule, the RL agent learns to select an action from a list of many chemically valid actions for a given molecule, implying great uncertainty associated with its learning. In a traditional implementation of deep RL algorithms, deterministic neural networks are typically employed, thus allowing the agent to choose one action from one sampled action at each step. In this paper, we propose a new strategy of applying Bayesian neural networks to RL to reduce uncertainty, so that the agent can choose one action from a pool of sampled actions at each step, and investigate its benefits in molecule design. Our experiments suggest that the Bayesian approach can create molecules of desirable chemical quality while maintaining their diversity, a very difficult goal to achieve in machine learning of molecules. We further exploit their diversity by using them to train a generative model that yields more novel drug-like molecules absent from the training molecules, since novelty is essential for drug candidate molecules. In conclusion, the Bayesian approach can offer a balance between exploitation and exploration in RL, and a balance between optimization and diversity in molecule design.
Funding: Support from the National Science Foundation (Grant CMMI-1934300), the Defense Advanced Research Projects Agency (DARPA) under the Physics of Artificial Intelligence (PAI) program (Grant HR00111890034), and partial funding support by a graduate fellowship from the China Scholarship Council (CSC) for this effort.
Abstract: In many applications, flow measurements are usually sparse and possibly noisy. The reconstruction of a high-resolution flow field from limited and imperfect flow information is significant yet challenging. In this work, we propose an innovative physics-constrained Bayesian deep learning approach to reconstruct flow fields from sparse, noisy velocity data, where equation-based constraints are imposed through the likelihood function and the uncertainty of the reconstructed flow can be estimated. Specifically, a Bayesian deep neural network is trained on sparse measurement data to capture the flow field. Meanwhile, violations of physical laws are penalized at a large number of spatiotemporal points where measurements are not available. A non-parametric variational inference approach is applied to enable efficient physics-constrained Bayesian learning. Several test cases on idealized vascular flows with synthetic measurement data are studied to demonstrate the merit of the proposed method.
Funding: Projects (2016YFE0200100, 2018YFC1505300-5.3) supported by the National Key Research & Development Plan of China; Project (51639002) supported by the Key Program of National Natural Science Foundation of China.
Abstract: Assessing seismic soil liquefaction is a complex, non-linear procedure affected by diverse sources of uncertainty. The Bayesian belief network (BBN) is an effective tool that provides a suitable framework for handling such uncertainties and cause-effect relationships. The intention of this study is to use a hybrid approach to develop a BBN model based on cone penetration test (CPT) case history records in order to evaluate seismic soil liquefaction potential. In this hybrid approach, a naive model is first developed solely by an interpretive structural modeling (ISM) technique using domain knowledge (DK). Subsequently, some useful information about the naive model is embedded as DK in the K2 algorithm to develop a BBN-K2 and DK model. The results of the BBN models are compared and validated against the available artificial neural network (ANN) and C4.5 decision tree (DT) models; the BBN model developed by the hybrid approach shows compatible and promising results for liquefaction potential assessment. The BBN model developed by the hybrid approach provides a viable tool for geotechnical engineers to assess site conditions susceptible to seismic soil liquefaction. This study also presents a sensitivity analysis of the BBN model based on the hybrid approach, and the most probable explanation of liquefied sites, in order to identify the most likely scenario of the liquefaction phenomenon.