Abstract: A new procedure for learning in Gaussian graphical models is proposed under the assumption that samples are possibly dependent. This assumption, which is pragmatically applied in various areas of multivariate analysis ranging from bioinformatics to finance, makes standard Gaussian graphical models (GGMs) unsuitable. We demonstrate that the advantage of modeling dependence among samples is that the true discovery rate and positive predictive value are improved substantially compared with applying standard GGMs and ignoring the dependence among samples. The new method, called matrix-variate Gaussian graphical models (MGGMs), involves simultaneously modeling variable and sample dependencies with the matrix-normal distribution. The computation is carried out using a Markov chain Monte Carlo (MCMC) sampling scheme for graphical model determination and parameter estimation. Simulation studies and two real-world examples in biology and finance further illustrate the benefits of the new models.
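The matrix-normal idea in the abstract above can be illustrated with a small sketch. It relies on the standard fact that if X is matrix-normal with row (sample) covariance U and column (variable) covariance V, then vec(X) has covariance V ⊗ U. The matrices `U` and `V` and the helper names `kron` and `cov_entries` are illustrative assumptions, not taken from the paper, and the sketch does not reproduce the paper's MCMC procedure.

```python
# Sketch: covariance of vec(X) for a matrix-normal X with row covariance U
# (samples) and column covariance V (variables). U and V are made-up values.

def kron(a, b):
    """Kronecker product of two matrices given as lists of lists."""
    rows_b, cols_b = len(b), len(b[0])
    return [
        [a[i // rows_b][j // cols_b] * b[i % rows_b][j % cols_b]
         for j in range(len(a[0]) * cols_b)]
        for i in range(len(a) * rows_b)
    ]

U = [[1.0, 0.5], [0.5, 1.0]]   # dependence among n = 2 samples (rows of X)
V = [[2.0, 0.3], [0.3, 1.0]]   # dependence among p = 2 variables (columns of X)

# vec(X) stacks the columns of X, so entry (i, j) sits at position j * n + i.
cov_vec = kron(V, U)

def cov_entries(i, j, k, l, n=2):
    """Cov(X[i][j], X[k][l]) read off the Kronecker covariance V (x) U."""
    return cov_vec[j * n + i][l * n + k]

# Different samples measured on different variables remain correlated
# whenever U has non-zero off-diagonal entries: Cov = U[0][1] * V[0][1].
print(cov_entries(0, 0, 1, 1))
```

Setting the off-diagonal entries of `U` to zero recovers the independent-samples case that a standard GGM assumes.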
Abstract: The main aim of this paper is to compare the stability, in terms of systemic risk, of conventional and Islamic banking systems. To this end, we propose correlation network models for stock market returns based on graphical Gaussian distributions, which allow us to capture the contagion effects that move across countries. We also consider Bayesian graphical models to account for model uncertainty in the measurement of the interconnectedness of financial systems. Our proposed model is applied to the banking sector of the Middle East and North Africa (MENA) region, characterized by the presence of both conventional and Islamic banks, for the period from 2007 to the beginning of 2014. Our empirical findings show that there are differences in the systemic risk and stability of the two banking systems during crisis times. In addition, the differences are subject to country-specific effects that are amplified during crisis periods.
Funding: The project "The Basic Research on Internet of Things Architecture" supported by the National Key Basic Research Program of China (No. 2011CB302704); the National Natural Science Foundation of China (No. 60802034); the Specialized Research Fund for the Doctoral Program of Higher Education (No. 20070013026); the Beijing Nova Program (No. 2008B50); and the "New generation broadband wireless mobile communication network" Key Projects for Science and Technology Development (No. 2011ZX03002-002-01).
Abstract: Sensors are ubiquitous in the Internet of Things for measuring and collecting data. Analyzing the data derived from sensors is an essential task and can reveal useful latent information beyond the data themselves. Since the Internet of Things contains many sorts of sensors, the measurement data collected by these sensors are multi-type data, sometimes containing temporal series information. If we deal with different sorts of data separately, we will miss useful information. This paper proposes a method to discover the correlation in multi-faceted data, which contain many types of data with temporal information, and our method can deal with multi-faceted data simultaneously. We transform high-dimensional multi-faceted data into lower-dimensional data, which are modeled as multivariate Gaussian graphical models, and then mine the correlation in the multi-faceted data by discovering the structure of the multivariate Gaussian graphical models. We verify our method on a real data set, and the experiment demonstrates that the proposed method can correctly find the correlation among multi-faceted measurement data.
Funding: This work was supported by the National Institutes of Health, National Institute of Neurological Disorders and Stroke grant R01NS093314, as well as Small Business Innovation Research grant 1R43NS103596-01.
Abstract: Magnetic resonance imaging (MRI) is a clinically relevant, real-time imaging modality that is frequently utilized to assess stroke type and severity. However, specific MRI biomarkers that can be used to predict long-term functional recovery are still critically needed. Consequently, the present study sought to examine the prognostic value of commonly utilized MRI parameters to predict functional outcomes in a porcine model of ischemic stroke. Stroke was induced via permanent middle cerebral artery occlusion. At 24 hours post-stroke, MRI analysis revealed focal ischemic lesions, decreased diffusivity, hemispheric swelling, and white matter degradation. Functional deficits, including behavioral abnormalities in open field and novel object exploration as well as spatiotemporal gait impairments, were observed at 4 weeks post-stroke. Gaussian graphical models identified specific MRI outputs and functional recovery variables, including white matter integrity and gait performance, that exhibited strong conditional dependencies. Canonical correlation analysis revealed a prognostic relationship between lesion volume and white matter integrity on the one hand and novel object exploration and gait performance on the other. Consequently, these analyses may also have the potential to predict patient recovery at chronic time points, as pigs and humans share many anatomical similarities (e.g., white matter composition) that have proven to be critical in ischemic stroke pathophysiology. The study was approved by the University of Georgia (UGA) Institutional Animal Care and Use Committee (IACUC; Protocol Numbers: A2014-07-021-Y3-A11 and 2018-01-029-Y1-A5) on November 22, 2017.
Funding: Supported by the National Natural Science Foundation of China (No. 11571080).
Abstract: In this paper, we consider the problem of estimating the high-dimensional precision matrix of a Gaussian graphical model. Taking advantage of the connection between multivariate linear regression and the entries of the precision matrix, we propose combining the Bayesian Lasso with a neighborhood regression estimate for the Gaussian graphical model. This method performs parameter estimation and model selection simultaneously. Moreover, the proposed method can provide symmetric confidence intervals for all entries of the precision matrix.
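The connection between regression and the precision matrix mentioned above can be checked numerically. For a Gaussian vector with precision matrix Ω, the population coefficient of X_j when X_i is regressed on all other variables equals -Ω_ij / Ω_ii. The sketch below verifies this identity on a made-up 3 × 3 precision matrix. It is not the Bayesian Lasso procedure of the paper, and the names `inverse`, `neighborhood_coefs`, and `omega` are illustrative assumptions.

```python
# Sketch: neighborhood regression coefficients versus precision-matrix entries.

def inverse(m):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(m)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def neighborhood_coefs(sigma, i):
    """Population regression of X_i on the remaining variables, from covariance."""
    others = [j for j in range(len(sigma)) if j != i]
    s_oo = [[sigma[a][b] for b in others] for a in others]
    s_oi = [sigma[a][i] for a in others]
    s_oo_inv = inverse(s_oo)
    beta = [sum(s_oo_inv[r][c] * s_oi[c] for c in range(len(others)))
            for r in range(len(others))]
    return dict(zip(others, beta))

# Made-up precision matrix: variables 0 and 2 are conditionally independent.
omega = [[2.0, -0.8, 0.0],
         [-0.8, 2.0, -0.5],
         [0.0, -0.5, 1.5]]
sigma = inverse(omega)

coefs = neighborhood_coefs(sigma, 0)
implied = {j: -omega[0][j] / omega[0][0] for j in (1, 2)}
print(coefs, implied)
```

A zero regression coefficient corresponds to a zero precision entry, which is why sparse regressions (such as Lasso-type estimates) can perform model selection for the graph.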
Funding: This work was supported in part by the National Natural Science Foundation of China under Grant Nos. 61502261, 61572457, and 61379132; the Key Research and Development Plan Project of Shandong Province under Grant No. 2016GGX101032; the Science and Technology Plan Project for Colleges and Universities of Shandong Province under Grant No. J14LN85; and the Natural Science Foundation of Shandong Province under Grant No. ZR2017PF013.
Abstract: The explosive growth of mobile data demand is becoming an increasing burden on current cellular networks. To address this issue, we propose an opportunistic data offloading solution for alleviating overloaded cellular traffic. The principle behind it is to select a few important users as seeds for data sharing. The three critical steps are as follows. We first explore the individual interests of users by constructing user profiles, on which an interest graph is built by Gaussian graphical modeling. We then apply extreme value theory to threshold the encounter duration of user pairs, so that a contact graph is generated to indicate the social relationships of users. Moreover, a contact-interest graph is developed on the basis of the social ties and individual interests of users. Corresponding to the different graphs, three strategies are finally proposed for seed selection with the aim of maximizing the offloaded cellular data. We evaluate the performance of our algorithms on real-world mobility trace data. The results demonstrate the effectiveness of the strategy that takes social relationships and individual interests into account.
Funding: Supported partially by the China National Key R&D Program under Grant Nos. 2019YFC1908502, 2022YFA1003703, 2022YFA1003802, and 2022YFA1003803, and the National Natural Science Foundation of China under Grant Nos. 11925106, 12231011, 11931001, and 11971247.
Abstract: This paper focuses on the support recovery of the Gaussian graphical model (GGM) with false discovery rate (FDR) control. The symmetrized data aggregation (SDA) technique, which involves sample splitting, data screening, and information pooling, is exploited in a node-based way. A matrix of test statistics with a symmetry property is constructed, and a data-driven threshold is chosen to control the FDR for the support recovery of the GGM. The proposed method is shown to control the FDR asymptotically under some mild conditions. Extensive simulation studies and a real-data example demonstrate that it yields better FDR control while offering reasonable power in most cases.
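The abstract above does not spell out the threshold rule, so the following is only a generic sketch of how a data-driven threshold can exploit symmetric test statistics. It assumes statistics W_j that are symmetric about zero for null edges and large and positive for true edges, and it picks the smallest t for which the ratio of the count of W_j ≤ -t to the count of W_j ≥ t does not exceed the target level. The function name `symmetric_fdr_threshold` and the toy statistics are hypothetical, and the paper's actual node-based construction may differ.

```python
# Sketch: data-driven threshold for statistics that are symmetric under the null.

def symmetric_fdr_threshold(stats, alpha):
    """Smallest t > 0 with #{W <= -t} / max(#{W >= t}, 1) <= alpha.

    The search runs over the observed absolute values; returns infinity
    when no candidate satisfies the bound (nothing is selected).
    """
    candidates = sorted({abs(w) for w in stats if w != 0})
    for t in candidates:
        neg = sum(1 for w in stats if w <= -t)   # estimates false discoveries
        pos = sum(1 for w in stats if w >= t)    # discoveries made at level t
        if neg / max(pos, 1) <= alpha:
            return t
    return float("inf")

# Toy statistics for ten candidate edges (made-up numbers).
stats = [5.1, 4.2, 3.9, 3.0, -0.4, 0.6, -0.9, 0.3, 2.8, -0.2]
threshold = symmetric_fdr_threshold(stats, alpha=0.2)
selected = [j for j, w in enumerate(stats) if w >= threshold]
print(threshold, selected)
```

The negative tail acts as a stand-in for the unknown number of false positives in the positive tail, which is what makes the threshold computable from the data alone.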
Funding: Supported by US National Science Foundation grants DBI-0723722 and DBI-1042344 to SPDK, and UC Davis funds to SPDK.
Abstract: Gene co-expression networks provide an important tool for systems biology studies. Using microarray data from the ArrayExpress database, we constructed an Arabidopsis gene co-expression network, termed AtGGM2014, based on the graphical Gaussian model, which contains 102,644 co-expression gene pairs among 18,068 genes. The network was grouped into 622 gene co-expression modules. These modules function in diverse housekeeping, cell cycle, development, hormone response, metabolism, and stress response pathways. We developed a tool to facilitate easy visualization of the expression patterns of these modules, either in a tissue context or in terms of their regulation under different treatment conditions. The results indicate that at least six modules with tissue-specific expression patterns failed to record modular regulation under various stress conditions. This discrepancy could be best explained by the fact that experiments to study plant stress responses focused mainly on leaves and less on roots, and thus failed to recover specific regulation patterns in other tissues. Overall, the modular structures revealed by our network provide extensive information to generate testable hypotheses about diverse plant signaling pathways. AtGGM2014 offers a constructive tool for plant systems biology studies.
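A graphical Gaussian model links two genes only when their partial correlation, given all other genes, is non-zero. The partial correlation follows from the precision matrix Ω as -Ω_ij / sqrt(Ω_ii Ω_jj). The sketch below applies this to a made-up three-gene covariance matrix in which the first and third genes are correlated only through the second. The gene labels, the cutoff of 0.1, and the helper names are illustrative assumptions, not the AtGGM2014 pipeline.

```python
# Sketch: partial correlations from a covariance matrix, then edge selection.
from math import sqrt

def inverse(m):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(m)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def partial_correlations(sigma):
    """Partial correlation matrix: -omega_ij / sqrt(omega_ii * omega_jj)."""
    omega = inverse(sigma)
    n = len(sigma)
    return [[1.0 if i == j else -omega[i][j] / sqrt(omega[i][i] * omega[j][j])
             for j in range(n)] for i in range(n)]

genes = ["geneA", "geneB", "geneC"]          # hypothetical labels
sigma = [[1.0, 0.6, 0.36],                   # geneA and geneC are marginally
         [0.6, 1.0, 0.6],                    # correlated (0.36), but only
         [0.36, 0.6, 1.0]]                   # through geneB

pcor = partial_correlations(sigma)
cutoff = 0.1                                 # illustrative cutoff
edges = [(genes[i], genes[j])
         for i in range(len(genes)) for j in range(i + 1, len(genes))
         if abs(pcor[i][j]) > cutoff]
print(edges)
```

A plain correlation network would connect all three genes here, whereas the partial-correlation network keeps only the two direct links.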
Funding: Supported by grants from the National Natural Science Foundation of China (31770268), the Strategic Priority Research Program of the Chinese Academy of Sciences (XDA24010303), the Fundamental Research Funds for the Central Universities (WK2070000091), and the University of Science and Technology of China (start-up fund to S.M.).
Abstract: Transcription factors (TFs) regulate cellular activities by controlling gene expression, but a predictive model describing how TFs quantitatively modulate human transcriptomes is lacking. We construct a universal human gene expression predictor named EXPLICIT-Human and utilize it to decode transcriptional regulation. Using the expression of 1613 TFs, the predictor reconstitutes highly accurate transcriptomes for samples derived from a wide range of tissues and conditions. The broad applicability of the predictor indicates that it recapitulates the quantitative relationships between TFs and target genes that are ubiquitous across tissues. Significant interacting TF-target gene pairs are extracted from the predictor and enable downstream inference of TF regulators for diverse pathways involved in development, immunity, metabolism, and stress response. A detailed analysis of the hematopoiesis process reveals an atlas of key TFs regulating the development of different hematopoietic cell lineages, and a portion of these TFs are conserved between humans and mice. The results demonstrate that our method is capable of delineating the TFs responsible for fate determination. Compared with other existing tools, EXPLICIT-Human shows better performance in recovering the correct TF regulators.
Funding: Supported by a grant from the University Grant Council of Hong Kong of China, the National Natural Science Foundation of China (Grant No. 11371013), and the Tian Yuan Foundation for Mathematics.
Abstract: Existing groupwise dimension reduction requires the given group structure to be non-overlapping, which confines its scope of application. We aim at groupwise dimension reduction with an overlapping or even unknown group structure. To this end, the existing concept of groupwise dimension reduction is extended to be compatible with overlapping group structures. Then, the envelope method is improved to deal with overlapping groupwise dimension reduction. As an application, a Gaussian graphical model is employed to estimate the structure among predictors when the group structure is not given, and the amended envelope method is used for groupwise dimension reduction with the graphical structure. Furthermore, the rationale of the proposed estimation procedure is explained at the population level, and the estimation consistency is proved at the sample level. Finally, the finite-sample performance of the proposed methods is examined via numerical simulations and a body fat data analysis.