The conditional kernel correlation is proposed to measure the relationship between two random variables given covariates for multivariate data. Relying on the framework of reproducing kernel Hilbert spaces, we give the definitions of the conditional kernel covariance and the conditional kernel correlation. We also provide their respective sample estimators and establish their asymptotic properties, which help us construct a conditional independence test. According to the numerical results, the proposed test is more effective than the existing one under the considered scenarios. A real data set is further analyzed to illustrate the efficacy of the proposed method.
The Bayesian network is a popular approach to representing and reasoning about uncertain knowledge. Structure learning is the first step in learning a Bayesian network, and score-based methods are among the most popular ways of learning the structure. In most cases, the score of a Bayesian network is defined as the sum of a log-likelihood score and a complexity score, combined through a penalty function. If the penalty function is set unreasonably, it may hurt the performance of the structure search. Thus, Bayesian network structure learning is essentially a bi-objective optimization problem. However, the existing bi-objective structure learning algorithms can only be applied to small-scale networks. To this end, this paper proposes a bi-objective evolutionary Bayesian network structure learning algorithm via skeleton constraint (BBS) for medium-scale networks. To boost search performance, BBS introduces the random order prior (ROP) initial operator. ROP generates a skeleton to constrain the search space, which is the key to expanding the scale of structure learning problems. Then, acyclic structures are guaranteed by adding the orders of variables to the initial skeleton. After that, BBS designs Pareto-rank-based crossover and skeleton-guided mutation operators. These operators act on the skeleton obtained by ROP to make the search more targeted. Finally, BBS provides a strategy for choosing the final solution. The experimental results show that BBS can always find a structure that is closer to the ground truth than those found by single-objective structure learning methods. Furthermore, compared with the existing bi-objective structure learning methods, BBS is scalable and can be applied to medium-scale Bayesian network datasets. On the educational problem of discovering the factors that influence students' academic performance, BBS provides higher-quality solutions and offers more flexibility in solution selection than the widely used Bayesian network structure learning methods.
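The two objectives named in the abstract, log-likelihood and complexity, can be computed separately and compared by Pareto dominance. The following is a minimal sketch for discrete data, not the BBS implementation: the function names, the data layout (a list of dicts), the parameter-count definition of complexity, and the BIC-style penalty are all assumptions made for illustration.

```python
import math
from collections import Counter

def log_likelihood(data, structure):
    """Maximum-likelihood log-likelihood of discrete data under a DAG.

    data: list of dicts mapping variable name -> observed state.
    structure: dict mapping each variable -> tuple of its parents.
    """
    total = 0.0
    for node, parents in structure.items():
        joint = Counter((tuple(row[p] for p in parents), row[node]) for row in data)
        parent_counts = Counter(tuple(row[p] for p in parents) for row in data)
        for (config, _state), n in joint.items():
            total += n * math.log(n / parent_counts[config])
    return total

def complexity(data, structure):
    """Number of free parameters, using the states observed in the data."""
    states = {v: {row[v] for row in data} for v in structure}
    total = 0
    for node, parents in structure.items():
        q = math.prod(len(states[p]) for p in parents)  # parent configurations
        total += (len(states[node]) - 1) * q
    return total

def dominates(a, b):
    """True if a = (log-likelihood, complexity) Pareto-dominates b.

    Log-likelihood is maximized and complexity is minimized.
    """
    no_worse = a[0] >= b[0] and a[1] <= b[1]
    better = a[0] > b[0] or a[1] < b[1]
    return no_worse and better

# Toy data in which B always copies A (hypothetical example).
data = [
    {"A": 0, "B": 0}, {"A": 0, "B": 0},
    {"A": 1, "B": 1}, {"A": 1, "B": 1},
]
empty = {"A": (), "B": ()}      # no edges
edge = {"A": (), "B": ("A",)}   # A -> B

objs_empty = (log_likelihood(data, empty), complexity(data, empty))
objs_edge = (log_likelihood(data, edge), complexity(data, edge))

# A single-objective score folds both terms together through a penalty
# weight; a BIC-style weight is shown here as one common choice.
bic_edge = objs_edge[0] - 0.5 * math.log(len(data)) * objs_edge[1]
```

In this toy case the structure with the edge fits better but has more parameters, so neither structure dominates the other; both lie on the Pareto front, and the choice between them is left to a final selection step rather than fixed in advance by the penalty weight.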
Learning Bayesian network structure is one of the most exciting challenges in machine learning. Discovering the correct skeleton of a directed acyclic graph (DAG) is the foundation of dependency analysis algorithms for this problem. Because high-order conditional independence (CI) tests are unreliable, the key steps in improving the efficiency of a dependency analysis algorithm are to use as few CI tests as possible and to keep the conditioning sets as small as possible. For these reasons, and inspired by the PC algorithm, we present an algorithm, named fast and efficient PC (FEPC), for learning the adjacent neighbourhood of every variable. FEPC carries out the CI tests in three kinds of orders, which significantly reduces the number of high-order CI tests. Compared with current algorithms, the experimental results show that FEPC achieves better accuracy with fewer CI tests and smaller conditioning sets. The highest reduction in the number of CI tests achieved by FEPC relative to the PC algorithm is 83.3%.
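For context, the skeleton phase of the standard PC algorithm, which FEPC takes as its starting point, can be sketched as below. This is the baseline procedure, not FEPC itself: the three test orders used by FEPC are not described in the abstract. The CI test is passed in as a callable, and the names `pc_skeleton`, `is_independent`, and `oracle` are illustrative assumptions.

```python
from itertools import combinations

def pc_skeleton(nodes, is_independent):
    """Skeleton phase of the standard PC algorithm (not FEPC).

    is_independent(x, y, cond) is a caller-supplied CI test returning True
    when x and y are judged independent given the tuple cond.
    Returns the adjacency sets, the separating sets, and the CI test count.
    """
    adj = {v: set(nodes) - {v} for v in nodes}  # start from the complete graph
    sepset = {}
    n_tests = 0
    order = 0  # current size of the conditioning sets
    # Raise the order until no node has enough neighbours to condition on.
    while any(len(adj[x]) - 1 >= order for x in nodes):
        for x in nodes:
            for y in sorted(adj[x]):
                others = sorted(adj[x] - {y})
                for cond in combinations(others, order):
                    n_tests += 1
                    if is_independent(x, y, cond):
                        # Remove the edge and remember what separated the pair.
                        adj[x].discard(y)
                        adj[y].discard(x)
                        sepset[frozenset((x, y))] = cond
                        break
        order += 1
    return adj, sepset, n_tests

# Hypothetical oracle for the chain A -> B -> C: the only independence
# is between A and C once B is in the conditioning set.
def oracle(x, y, cond):
    return {x, y} == {"A", "C"} and "B" in cond

adj, sepset, n_tests = pc_skeleton(["A", "B", "C"], oracle)
```

Returning the number of tests alongside the skeleton makes the cost that FEPC aims to reduce directly measurable: both the count and the conditioning-set size grow with the order, which is why low-order tests are preferred.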
Inferring gene regulatory networks (GRNs) is a challenging task in bioinformatics. In this paper, an algorithm, PCHMS, is introduced to infer GRNs. The method applies the path consistency (PC) algorithm based on the conditional mutual information test (PCA-CMI). In PC-based algorithms, a separator set is determined in order to detect the dependency between variables. The PCHMS algorithm attempts to select this set in a smart way. For this purpose, the edges of the resulting skeleton are directed based on the PC algorithm direction rule and the mutual information test (MIT) score. The separator set is then selected according to the directed network by considering a suitable sequential order of genes. The effectiveness of the method is benchmarked on several networks from the DREAM challenge and on the widely used SOS DNA repair network of Escherichia coli. Results show that applying the PCHMS algorithm improves the precision of learning the structure of GRNs in comparison with current popular approaches.
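As an illustration of a conditional mutual information test, the sketch below computes the CMI of two variables given a single conditioning variable under a joint Gaussian assumption, where it reduces to a function of the partial correlation. The Gaussian assumption, the restriction to one conditioning variable, and the names `pearson`, `gaussian_cmi`, and `threshold` are assumptions of this sketch; the abstract does not specify how PCHMS estimates CMI.

```python
import math

def pearson(a, b):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    va = sum((u - ma) ** 2 for u in a)
    vb = sum((v - mb) ** 2 for v in b)
    return cov / math.sqrt(va * vb)

def gaussian_cmi(r_xy, r_xz, r_yz):
    """CMI I(X;Y|Z) in nats for jointly Gaussian X, Y, Z (single Z).

    Uses the identity I(X;Y|Z) = -0.5 * ln(1 - rho^2), where rho is the
    partial correlation of X and Y given Z.
    """
    partial = (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz**2) * (1 - r_yz**2))
    return -0.5 * math.log(1 - partial**2)

# Hypothetical expression profiles for three genes.
x = [0.1, 0.9, 1.8, 3.2, 3.9, 5.1]
y = [1.2, 0.7, 2.5, 2.4, 4.6, 4.4]
z = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]

cmi = gaussian_cmi(pearson(x, y), pearson(x, z), pearson(y, z))

# An edge x - y is kept only if the CMI exceeds a user-chosen threshold;
# the value here is an arbitrary placeholder.
threshold = 0.05
keep_edge = cmi > threshold
```

When the correlation between x and y is fully explained by z, that is, when r_xy equals r_xz times r_yz, the partial correlation and hence the CMI are zero, and z acts as a separator for the pair.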
Funding (conditional kernel correlation study): Partially supported by the Knowledge Innovation Program of Hubei Province (No. 2019CFB810), NSFC (No. 12325110), and the CAS Project for Young Scientists in Basic Research (No. YSBR-034).
Funding (BBS study): Supported by the Fundamental Research Funds for the Central Universities, the Science and Technology Commission of Shanghai Municipality (No. 19511120601), the Scientific and Technological Innovation 2030 Major Projects (No. 2018AAA0100902), the CCF-AFSG Research Fund (No. CCF-AFSG RF20220205), and the "Chenguang Program" sponsored by the Shanghai Education Development Foundation and Shanghai Municipal Education Commission (No. 21CGA32).
Funding (FEPC study): Supported by the National Natural Science Foundation of China (61403290, 11301408, 11401454), the Foundation for Youths of Shaanxi Province (2014JQ1020), the Foundation of Baoji City (2013R7-3), and the Foundation of Baoji University of Arts and Sciences (ZK15081).