Funding: Supported by the National Natural Science Foundation of China (Nos. 61073133, 60973067, and 61175053) and the Fundamental Research Funds for the Central Universities of China (No. 2011ZD010).
Abstract: Numerous models have been proposed to reduce the classification error of Naive Bayes by weakening its attribute independence assumption, and some have demonstrated remarkable error performance. Considering that ensemble learning is an effective way to reduce a classifier's classification error, this paper proposes a double-layer Bayesian classifier ensembles (DLBCE) algorithm based on frequent itemsets. DLBCE constructs a double-layer Bayesian classifier (DLBC) for each frequent itemset contained in the new instance and then combines all the classifiers, weighting each one according to conditional mutual information. The experimental results show that the proposed algorithm outperforms other outstanding algorithms.
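The weighting quantity named in this abstract, conditional mutual information I(X; Y | Z), can be estimated from discrete data by substituting observed frequencies into its definition. The following is a minimal sketch of that plug-in estimate only; the function name is hypothetical, and how DLBCE turns these values into classifier weights is not stated in the abstract and is not reproduced here.

```python
from collections import Counter
from math import log


def conditional_mutual_information(xs, ys, zs):
    """Plug-in estimate of I(X; Y | Z) in nats from paired discrete samples.

    Hypothetical helper for illustration; it is not taken from the paper.
    """
    n = len(xs)
    n_xyz = Counter(zip(xs, ys, zs))  # joint counts of (x, y, z)
    n_xz = Counter(zip(xs, zs))       # counts of (x, z)
    n_yz = Counter(zip(ys, zs))       # counts of (y, z)
    n_z = Counter(zs)                 # counts of z
    cmi = 0.0
    for (x, y, z), c in n_xyz.items():
        # p(x,y,z) * log( p(z) p(x,y,z) / (p(x,z) p(y,z)) ), written with counts
        cmi += (c / n) * log(c * n_z[z] / (n_xz[(x, z)] * n_yz[(y, z)]))
    return cmi
```

When Y is a copy of a balanced binary X and Z is constant, the estimate equals ln 2; when X and Y are independent within every value of Z, it is 0.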
Funding: Supported by the National Natural Science Foundation of China under Grant Nos. 60972150, 10926197, and 61201323.
Abstract: Detection and clarification of cause-effect relationships among variables is an important problem in time series analysis. This paper provides a method that employs both mutual information and conditional mutual information to identify the causal structure of multivariate time series causal graphical models. A three-step procedure is developed to learn the contemporaneous and the lagged causal relationships of time series causal graphs. Unlike conventional constraint-based algorithms, the proposed algorithm is nonparametric and does not assume any particular distribution. These properties are especially appealing for inferring time series causal graphs when prior knowledge about the data model is not available. Simulations and a case analysis demonstrate the effectiveness of the method.
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 60375003 and 60972150) and the Science and Technology Innovation Foundation of Northwestern Polytechnical University (Grant No. 2007KJ01033).
Abstract: The general mutual information (GMI) and general conditional mutual information (GCMI) are considered as measures of lag dependence in nonlinear time series. Both measures are invariant under transformation. The statistics based on GMI and GCMI are estimated using the correlation integral. Under the hypothesis of independent series, the estimators have Gaussian asymptotic distributions. Simulations on generated nonlinear series demonstrate that the methods frequently identify the correct lags.
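As a rough illustration of estimating a mutual-information-type statistic from correlation integrals, the sketch below uses a well-known order-2 form, ln C_XY(r) - ln C_X(r) - ln C_Y(r), where C(r) is the fraction of sample pairs closer than r in the maximum norm. Whether this coincides with the paper's GMI statistic is an assumption; the function names and the choice of norm are hypothetical.

```python
from itertools import combinations
from math import log


def correlation_integral(points, r):
    """Fraction of distinct sample pairs closer than r in the maximum norm.

    `points` is a list of equal-length tuples.
    """
    pairs = list(combinations(points, 2))
    close = sum(
        1 for p, q in pairs
        if max(abs(a - b) for a, b in zip(p, q)) < r
    )
    return close / len(pairs)


def order2_mutual_information(xs, ys, r):
    """Order-2 mutual information estimate from three correlation integrals.

    Assumed form: ln C_XY(r) - ln C_X(r) - ln C_Y(r). All three integrals
    must be positive, so r must be large enough for some pairs to be close.
    """
    c_x = correlation_integral([(x,) for x in xs], r)
    c_y = correlation_integral([(y,) for y in ys], r)
    c_xy = correlation_integral(list(zip(xs, ys)), r)
    return log(c_xy) - log(c_x) - log(c_y)
```

To examine dependence at a lag k of a single series s, one would pass xs = s[:-k] and ys = s[k:].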
Funding: This work was supported by a Charles Center Honors Fellowship to ASG and Ferguson Fund Awards for Undergraduate Research to GMS and ASG. MAS recognizes the support of a Bullard Fellowship from Harvard Forest, Harvard University, current support from the U.S. National Science Foundation (DEB 15556707), and the H. Fenner Research Endowment of Wilkes University.
Abstract: The conditional mutualism between scatterhoarders and trees varies on a continuum from mutualism to antagonism and can change across time and space, and among species. We examined 4 tree species (red oak [Quercus rubra], white oak [Quercus alba], American chestnut [Castanea dentata], and hybrid chestnut [C. dentata × Castanea mollissima]) across 5 sites and 3 years to quantify the variability in this conditional mutualism. We used a published model to compare the rates of seed emergence with and without burial to the probability that seeds will be cached and left uneaten by scatterhoarders, in order to quantify variation in the conditional mutualism that can be explained by environmental variation among sites, years, species, and seed provenance within species. All species tested had increased emergence when buried. However, comparing the benefits of burial to the probability of caching by scatterhoarders indicated a mutualism in red oak, while white oak was nearly always antagonistic. Chestnut was variable around the boundary between mutualism and antagonism, indicating a high degree of context dependence in the relationship with scatterhoarders. We found that different seed provenances did not vary in their potential for mutualism. Temperature did not explain microsite differences in seed emergence in any of the species tested. In hybrid chestnut only, emergence on the surface declined with soil moisture in the fall. By quantifying the variation in the conditional mutualism that was not caused by changes in scatterhoarder behavior, we show that environmental conditions and seed traits are an important and underappreciated component of the variation in the relationship between trees and scatterhoarders.
Funding: Supported by the National Natural Science Foundation of China (Grant No. 60375003) and the Chinese Aviation Foundation (Grant No. 03153059).
Abstract: An information-theoretic method is proposed to test Granger causality and contemporaneous conditional independence in Granger causality graph models. In the graphs, the vertex set denotes the component series of the multivariate time series, the directed edges denote causal dependence, and the undirected edges reflect instantaneous dependence. The presence of an edge is measured by a statistic based on conditional mutual information and tested by a permutation procedure. Furthermore, for the relations found to exist, a statistic based on the difference between general conditional mutual information and linear conditional mutual information is proposed to test for nonlinearity. The significance of the nonlinearity test statistic is determined by a bootstrap method based on surrogate data. We investigate the finite-sample behavior of the procedure through simulated time series with different dependence structures, including linear and nonlinear relations.
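The idea of testing a conditional-mutual-information statistic by permutation can be sketched as follows for discrete data. This is an illustrative assumption, not the paper's procedure: the sketch shuffles X only within groups that share the same value of Z, so that the X-Z and Y-Z relations are preserved under the null hypothesis, and it uses a plug-in estimate. All names are hypothetical.

```python
import random
from collections import Counter, defaultdict
from math import log


def cmi(xs, ys, zs):
    """Plug-in estimate of I(X; Y | Z) in nats for discrete samples."""
    n = len(xs)
    n_xyz, n_xz = Counter(zip(xs, ys, zs)), Counter(zip(xs, zs))
    n_yz, n_z = Counter(zip(ys, zs)), Counter(zs)
    return sum(
        (c / n) * log(c * n_z[z] / (n_xz[x, z] * n_yz[y, z]))
        for (x, y, z), c in n_xyz.items()
    )


def permutation_p_value(xs, ys, zs, n_perm=200, seed=0):
    """Permutation p-value for the null hypothesis I(X; Y | Z) = 0."""
    rng = random.Random(seed)
    observed = cmi(xs, ys, zs)
    # Group sample indices by the value of the conditioning variable.
    strata = defaultdict(list)
    for i, z in enumerate(zs):
        strata[z].append(i)
    exceed = 0
    for _ in range(n_perm):
        permuted = list(xs)
        for idx in strata.values():
            values = [xs[i] for i in idx]
            rng.shuffle(values)  # shuffle X inside one stratum of Z
            for i, v in zip(idx, values):
                permuted[i] = v
        if cmi(permuted, ys, zs) >= observed:
            exceed += 1
    # Add-one correction keeps the p-value strictly positive.
    return (exceed + 1) / (n_perm + 1)
```

A small p-value suggests that the edge between X and Y should be kept given Z. For time series, an independent shuffle ignores serial dependence, so a real implementation would need a permutation scheme suited to the data.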
Funding: Supported by the National Natural Science Foundation of China under Grant Nos. 60972150 and 10926197.
Abstract: Detection and clarification of cause-effect relationships among variables is an important problem in time series analysis. Traditional causality inference methods have the salient limitation that the model must be linear with Gaussian noise. Although additive model regression can effectively infer the nonlinear causal relationships of additive nonlinear time series, it requires the contemporaneous causal relationships among variables to be linear, and it is not always valid for testing conditional independence relations. This paper provides a nonparametric method that employs both mutual information and conditional mutual information to identify the causal structure of a class of nonlinear time series models, extending additive nonlinear time series to nonlinear structural vector autoregressive models. An algorithm is developed to learn the contemporaneous and the lagged causal relationships of variables. Simulations demonstrate the effectiveness of the proposed method.
Abstract: Inferring gene regulatory networks (GRNs) is a challenging task in bioinformatics. In this paper, an algorithm, PCHMS, is introduced to infer GRNs. This method applies the path consistency (PC) algorithm based on the conditional mutual information test (PCA-CMI). In PC-based algorithms, a separator set is determined in order to detect the dependency between variables. The PCHMS algorithm attempts to select this set in a smart way. For this purpose, the edges of the resulting skeleton are directed based on the PC algorithm direction rule and the mutual information test (MIT) score. The separator set is then selected according to the directed network by considering a suitable sequential order of genes. The effectiveness of this method is benchmarked on several networks from the DREAM challenge and on the widely used SOS DNA repair network of Escherichia coli. Results show that applying the PCHMS algorithm improves the precision of learning the structure of GRNs in comparison with current popular approaches.
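The generic PC-style skeleton step that this abstract builds on can be sketched as follows: start from a complete graph and delete an edge whenever the (conditional) mutual information of its endpoints falls below a threshold. The sketch assumes jointly Gaussian data, for which I(X; Y | Z) = -0.5 ln(1 - rho^2) with rho the partial correlation, and it stops at separator sets of size one taken from the neighbours of either endpoint. It does not implement the separator selection of PCHMS; the function names and the threshold value are hypothetical.

```python
import random
from itertools import combinations
from math import log, sqrt


def pearson(a, b):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    va = sum((u - ma) ** 2 for u in a)
    vb = sum((v - mb) ** 2 for v in b)
    return cov / sqrt(va * vb)


def gaussian_cmi(x, y, z=None):
    """I(X; Y) or I(X; Y | Z) in nats under a Gaussian assumption."""
    r = pearson(x, y)
    if z is not None:
        # First-order partial correlation of x and y given z.
        rxz, ryz = pearson(x, z), pearson(y, z)
        r = (r - rxz * ryz) / sqrt((1 - rxz ** 2) * (1 - ryz ** 2))
    return -0.5 * log(1 - r ** 2)


def pc_skeleton(data, threshold=0.01):
    """Undirected skeleton from order-0 and order-1 independence tests."""
    names = sorted(data)
    edges = {frozenset(p) for p in combinations(names, 2)}
    # Order 0: drop edges with negligible mutual information.
    for a, b in combinations(names, 2):
        if gaussian_cmi(data[a], data[b]) < threshold:
            edges.discard(frozenset((a, b)))
    # Order 1: drop edges explained away by one neighbouring variable.
    for a, b in combinations(names, 2):
        if frozenset((a, b)) not in edges:
            continue
        for c in names:
            if c in (a, b):
                continue
            adjacent = frozenset((a, c)) in edges or frozenset((b, c)) in edges
            if adjacent and gaussian_cmi(data[a], data[b], data[c]) < threshold:
                edges.discard(frozenset((a, b)))
                break
    return edges


def simulate_chain(n=2000, seed=1):
    """Synthetic chain X -> Z -> Y with unit Gaussian noise (demo data only)."""
    rng = random.Random(seed)
    x = [rng.gauss(0, 1) for _ in range(n)]
    z = [xi + rng.gauss(0, 1) for xi in x]
    y = [zi + rng.gauss(0, 1) for zi in z]
    return {"X": x, "Y": y, "Z": z}
```

On the synthetic chain from simulate_chain(), X and Y are dependent marginally but nearly independent given Z, so pc_skeleton is expected to keep the X-Z and Z-Y edges and remove X-Y.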