Proper understanding of the global distribution of infectious diseases is an important part of disease management and policy making. However, data are subject to complexities caused by heterogeneities across host classes and space-time epidemic processes. This paper proposes Bayesian spatio-temporal models for modeling and mapping tuberculosis relative risks in space and time, as well as identifying risk factors associated with tuberculosis and the counties in Kenya with high tuberculosis relative risks. We used spatio-temporal Bayesian hierarchical models to study the pattern of tuberculosis relative risks in Kenya. Markov chain Monte Carlo methods, implemented via WinBUGS and R packages, were used for simulation and estimation of the model parameters. The best-fitting model was selected using the Deviance Information Criterion proposed by Spiegelhalter and colleagues. Among the spatio-temporal models considered, the Knorr-Held models with space-time interaction types III and IV fit the data well, with type IV performing better than type III. Variation in tuberculosis risk is observed among Kenyan counties, with clustering among counties with high tuberculosis relative risks. HIV prevalence is identified as a determinant of TB. We found clustering and heterogeneity of TB risk among high-rate counties, and the overall tuberculosis risk decreased slightly from 2002 to 2009. We propose that the Knorr-Held model with interaction type IV be used to model and map Kenyan tuberculosis relative risks. The interaction of TB relative risk in space and time increases among rural counties that share boundaries with urban counties with high tuberculosis risk. This reflects the ability of the models to borrow strength from neighboring counties, so that nearby counties have similar risks. Although the approaches are less than ideal, we hope that our study provides a useful stepping stone in the development of spatial and spatio-temporal methodology for the statistical analysis of tuberculosis risk in Kenya.
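For readers unfamiliar with the Knorr-Held construction, a common form of the decomposition is sketched below (a general sketch of the model family, not necessarily the exact specification fitted in the paper). The log relative risk for county i and year t is split into spatial, temporal and interaction terms:

```latex
\log \theta_{it} = \alpha + u_i + v_i + \gamma_t + \phi_t + \delta_{it}
```

Here u_i is a spatially structured (CAR) county effect, v_i an unstructured county effect, \gamma_t a structured (e.g., random-walk) time effect, \phi_t an unstructured time effect, and \delta_{it} the space-time interaction; interaction types I-IV correspond to priors on \delta_{it} that are unstructured, structured in time only, structured in space only, or structured in both space and time, respectively.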
This study developed a hierarchical Bayesian (HB) model for local and regional flood frequency analysis in the Dongting Lake Basin, China. The annual maximum daily flows from 15 streamflow-gauged sites in the study area were analyzed with the HB model. The generalized extreme value (GEV) distribution was selected as the extreme flood distribution, and the GEV location and scale parameters were spatially modeled through a regression approach with drainage area as a covariate. The Markov chain Monte Carlo (MCMC) method with Gibbs sampling was employed to compute the posterior distribution in the HB model. The results showed that the proposed HB model provided satisfactory Bayesian credible intervals for flood quantiles, while the traditional delta method could not provide reliable uncertainty estimates for large flood quantiles because its lower confidence bounds tended to decrease as the return period increased. Furthermore, the HB model for regional analysis allowed some restrictive assumptions of the traditional index flood method to be relaxed, such as the homogeneous region assumption and the scale invariance assumption. The HB model can also provide an uncertainty band for flood quantile prediction at a poorly gauged or ungauged site, which the index flood method with L-moments does not provide directly. Therefore, the HB model is an effective way to implement a flexible local and regional frequency analysis scheme and to quantify the associated predictive uncertainty.
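As a small illustration of how a fitted GEV translates into flood quantiles for a given return period T (the parameter values below are hypothetical placeholders, not estimates from the study):

```python
from scipy.stats import genextreme

# Hypothetical GEV parameters for one site (not values from the study).
# Note: scipy's shape parameter c equals -xi under the usual GEV convention,
# so xi > 0 (heavy upper tail) corresponds to c < 0 here.
xi, loc, scale = 0.1, 1200.0, 350.0   # shape, location (m^3/s), scale (m^3/s)
dist = genextreme(c=-xi, loc=loc, scale=scale)

for T in (10, 50, 100):               # return periods in years
    q = dist.ppf(1.0 - 1.0 / T)       # T-year flood quantile
    print(f"{T:>4}-year quantile: {q:8.1f} m^3/s")
```

In the hierarchical setting, such quantiles would be computed for every posterior draw of the parameters, giving the credible intervals discussed in the abstract.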
Small area estimation (SAE) tackles the problem of providing reliable estimates for small areas, i.e., subsets of the population for which the sample information is not sufficient to warrant the use of a direct estimator. The hierarchical Bayesian approach to SAE problems offers several advantages over traditional SAE models, including the ability to account appropriately for the type of surveyed variable. In this paper, a number of model specifications for estimating small area counts are discussed and their relative merits are illustrated. We conducted a simulation study that reproduced, in simplified form, the Italian Labour Force Survey, taking the Local Labor Markets as target areas. Simulated data were generated by assuming the population characteristics of interest as well as the survey sampling design to be known. In one set of experiments, numbers of employed/unemployed persons from census data were used; in others, population characteristics were varied. Results show persistent model failures for some standard Fay-Herriot specifications and for generalized linear Poisson models with a (log-)normal sampling stage, whereas models with either an unmatched or a nonnormal sampling stage perform best in terms of bias, accuracy and reliability. The study also found that every model improves noticeably when the sampling variances are treated as stochastic rather than assumed known, as is the general practice. Moreover, we address the issue of model determination to point out the limits and possible pitfalls of commonly used criteria for model selection and checking in the SAE context.
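For reference, the standard area-level Fay-Herriot specification that several of the evaluated models extend can be written as follows (a textbook sketch, not the paper's exact variants):

```latex
y_i \mid \theta_i \sim \mathcal{N}(\theta_i, \psi_i), \qquad
\theta_i = \mathbf{x}_i^{\top}\boldsymbol{\beta} + u_i, \qquad
u_i \sim \mathcal{N}(0, \sigma_u^2),
```

where y_i is the direct survey estimate for area i, \psi_i its sampling variance (treated as known in the classical model, which is the assumption the paper relaxes by modeling it stochastically), and u_i an area-level random effect.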
Ore sorting is a preconcentration technology that can dramatically reduce energy and water usage, improving the sustainability and profitability of a mining operation. In porphyry Cu deposits, Cu is the primary target, with ores usually containing secondary 'pay' metals such as Au and Mo and gangue elements such as Fe and As. Due to sensing technology limitations, secondary and deleterious materials vary in the type and strength of their correlation with Cu but cannot be detected simultaneously via magnetic resonance (MR) ore sorting. Inferring the relationships between Cu and other elemental abundances is therefore particularly critical for mineral processing. Variations in metal grade relationships occur with the transition into different geological domains, which raises two questions: how to define these geological domains, and how the metal grade relationships are influenced by them. In this paper, a linear relationship is assumed between the Cu grade and the other metal grades. We apply a Bayesian hierarchical (partial-pooling) model to quantify the linear relationships between Cu, Au, and Fe grades from geochemical bore core data, which record metal grades measured by laboratory assay together with the spatial coordinates of the sample locations. The hierarchical model was compared with two other models, a 'complete-pooling' model and a 'no-pooling' model, and mining blocks were split by spatial domain to construct the hierarchical model. Two case studies from different porphyry Cu deposits were used to evaluate the performance of the hierarchical model, and Markov chain Monte Carlo (MCMC) was used to sample the posterior parameters. Our results show that the Bayesian hierarchical model dramatically reduced the posterior predictive variance for metal grade regression compared to the no-pooling model. In addition, the posterior inference in the hierarchical model is insensitive to the choice of prior, and the data are well represented in the posterior, indicating a robust model. The results show that the spatial domain can be successfully utilised for metal grade regression. Uncertainty in estimating the relationships between pay metals and both secondary and gangue elements is quantified and shown to be reduced with partial pooling. Thus, the proposed Bayesian hierarchical model offers a reliable and stable way to monitor the relationships between metal grades for ore sorting and other mineral processing options.
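A minimal partial-pooling sketch of this type of model, written here with PyMC on entirely synthetic data (variable names such as cu, au and domain_idx are placeholders, not the study's data or code), might look like:

```python
import numpy as np
import pymc as pm

# Synthetic stand-in for bore-core assays: Cu grade as predictor, Au grade as
# response, and an integer label giving the spatial domain of each sample.
rng = np.random.default_rng(0)
n, n_domains = 200, 5
domain_idx = rng.integers(0, n_domains, size=n)
cu = rng.gamma(2.0, 0.3, size=n)
au = 0.1 + 0.5 * cu + rng.normal(0.0, 0.05, size=n)

with pm.Model() as hierarchical:
    # Population-level distribution shared by all domains.
    mu_a = pm.Normal("mu_a", 0.0, 1.0)
    sigma_a = pm.HalfNormal("sigma_a", 1.0)
    mu_b = pm.Normal("mu_b", 0.0, 1.0)
    sigma_b = pm.HalfNormal("sigma_b", 1.0)
    # Partial pooling: each domain's intercept/slope is shrunk toward the mean.
    a = pm.Normal("a", mu_a, sigma_a, shape=n_domains)
    b = pm.Normal("b", mu_b, sigma_b, shape=n_domains)
    sigma = pm.HalfNormal("sigma", 1.0)
    pm.Normal("au_obs", a[domain_idx] + b[domain_idx] * cu, sigma, observed=au)
    idata = pm.sample(1000, tune=1000, chains=2, random_seed=0)
```

The no-pooling and complete-pooling comparisons in the abstract correspond, respectively, to fitting a separate regression per domain and a single regression over all domains.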
The state of in situ stress is a crucial parameter in subsurface engineering, especially for critical projects such as nuclear waste repositories. As one of the two ISRM suggested methods, the overcoring (OC) method is widely used to estimate the full stress tensor in rock by independent regression analysis of the data from each OC test. However, such customary independent analysis of individual OC tests, known as no pooling, is liable to yield unreliable test-specific stress estimates due to the various sources of uncertainty involved in the OC method. To address this problem, a practical and no-cost solution is to incorporate into the OC data analysis the additional information implied by adjacent OC tests, which are usually available in OC measurement campaigns. Hence, this paper presents a Bayesian partial-pooling (hierarchical) model for the combined analysis of adjacent OC tests. We performed five case studies using OC test data collected at a nuclear waste repository research site in Sweden. The results demonstrate that partial pooling of adjacent OC tests indeed allows information to be borrowed across adjacent tests and yields improved stress tensor estimates with reduced uncertainties for all individual tests simultaneously, compared with analysing them independently as no pooling, particularly for the unreliable no-pooling stress estimates. A further model comparison shows that the partial-pooling model also gives better predictive performance, confirming that the information borrowed across adjacent OC tests is relevant and effective.
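In generic terms, the partial-pooling structure can be sketched as below, where the measurements of test j are linked to a test-specific stress vector whose components are drawn from a shared population distribution (a schematic sketch only; the actual likelihood linking OC strain readings to the stress tensor follows overcoring theory and the paper's own formulation):

```latex
\mathbf{y}_j \mid \boldsymbol{\sigma}_j \sim \mathcal{N}\!\big(\mathbf{X}_j \boldsymbol{\sigma}_j,\ s_j^2 \mathbf{I}\big), \qquad
\sigma_{jk} \sim \mathcal{N}(\mu_k, \tau_k^2), \quad k = 1,\dots,6,
```

with hyperpriors on the population means \mu_k and standard deviations \tau_k. No pooling corresponds to letting \tau_k grow without bound, and complete pooling to \tau_k \to 0.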
A concept map is a diagram depicting relationships among concepts, used as a knowledge representation tool in many knowledge domains. In this paper, we build on the modeling framework of Hui et al. (2008) to develop a concept map suitable for testing the empirical evidence for theories. We identify a theory with a set of core tenets, each asserting that one set of independent variables affects one dependent variable; moreover, every variable can have several operational definitions. Data consist of a selected sample of scientific articles from the empirical literature on the theory under investigation. Our "tenet map" features several additional complexities compared with the original version. First, the links are two-layer: first-layer links connect variables that are related in a test of the theory at issue, while second-layer links represent connections found to be statistically significant. Furthermore, each layer's matrix of link-formation probabilities is block-symmetric. In addition to a form of censoring that resembles the pruning step of Hui et al., observed maps are subject to a further censoring related to second-layer links. We also perform a full Bayesian analysis instead of adopting the empirical Bayes approach. Lastly, we develop a three-stage model that accounts for dependence in either the data or the parameters. The investigation of the empirical support for, and degree of consensus on, new economic theories of the firm motivated the proposed methodology. In this paper, the Transaction Cost Economics view is tested by a tenet map analysis. Both the two-stage and the multilevel models identify the same tenets as the most corroborated by empirical evidence, although the latter provides a more comprehensive and complex insight into the relationships between constructs.
This study investigates the variations associated with lane lateral location and day of the week in the stochastic and dynamic transition of traffic regimes (DTTR). In the proposed analysis, hierarchical regression models fitted in a Bayesian framework were used to calibrate the transition probabilities that describe the DTTR. Datasets from two sites on a freeway facility located in Jacksonville, Florida, were selected for the analysis. The traffic speed thresholds defining the traffic regimes were estimated using a Gaussian mixture model (GMM). The GMM revealed that two and three regimes were adequate numbers of mixture components for estimating the traffic speed distributions for the Site 1 and Site 2 datasets, respectively. The results of the hierarchical regression models show considerable evidence of heterogeneity in the DTTR associated with lateral lane location. In particular, the hierarchical regressions reveal that the breakdown process is more affected by these variations than the other evaluated transition processes, with an estimated intra-class correlation (ICC) of about 73%. The transition from the congestion on-set/dissolution (COD) regime to the congested regime is estimated with the highest ICC (49.4%) in the three-regime model, and the lowest ICC (1%) was observed for the transition from the congested regime to the COD regime. On the other hand, the day of the week was not found to contribute to the variations in the DTTR (the highest ICC was 1.44%). These findings can be used to develop effective congestion countermeasures, particularly in the application of intelligent transportation systems such as dynamic lane-management strategies.
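As a small illustration of the regime-identification step (the speed observations below are synthetic placeholders, not the Jacksonville detector data):

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Synthetic speed observations (mph) mimicking three traffic regimes.
rng = np.random.default_rng(1)
speeds = np.concatenate([rng.normal(20, 4, 500),    # congested
                         rng.normal(45, 6, 500),    # congestion on-set/dissolution
                         rng.normal(65, 5, 1000)])  # free flow
X = speeds.reshape(-1, 1)

# Choose the number of regimes (mixture components) by BIC, one plausible criterion.
fits = {k: GaussianMixture(n_components=k, random_state=0).fit(X) for k in (2, 3, 4)}
best_k = min(fits, key=lambda k: fits[k].bic(X))
print("components selected by BIC:", best_k)
print("regime mean speeds:", np.sort(fits[best_k].means_.ravel()))
```

The fitted component means and their crossing points give the speed thresholds that separate the regimes used in the transition analysis.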
Computations involved in the Bayesian approach to practical model selection problems are usually very difficult. Computational simplifications are sometimes possible, but are not generally applicable. A large literature is available on a methodology based on information theory called Minimum Description Length (MDL). It is described here how many of these techniques are either directly Bayesian in nature or are very good objective approximations to Bayesian solutions. First, connections between the Bayesian approach and MDL are explored theoretically; thereafter, a few illustrations are provided to describe how MDL can give useful computational simplifications.
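One standard theoretical connection, sketched here in its textbook form rather than as the paper's specific results, is that the two-part MDL code length for a model with k free parameters matches the Laplace (BIC-type) approximation to the negative log marginal likelihood:

```latex
-\log p(x \mid \hat{\theta}_k) + \frac{k}{2}\log n
\;\approx\; -\log \int p(x \mid \theta)\,\pi(\theta)\,d\theta + O(1),
```

so minimizing description length and maximizing the (approximate) posterior model probability select the same model up to lower-order terms.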
Piecewise linear regression models are very flexible models for fitting data. When piecewise linear regression models are fitted to data, the parameters are generally unknown. This paper studies the problem of parameter estimation for piecewise linear regression models using a Bayesian method. Because the Bayes estimator cannot be found analytically, the reversible jump MCMC (Markov chain Monte Carlo) algorithm is proposed. The reversible jump MCMC algorithm generates a Markov chain that converges to the posterior distribution of the parameters of the piecewise linear regression model, and the resulting chain is used to calculate the Bayes estimators of those parameters.
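A piecewise linear regression model of the kind considered here can be written as follows (the notation is illustrative and may differ from the paper's). Because the number of segments K changes the dimension of the parameter vector, ordinary MCMC does not apply directly and reversible jump moves are needed:

```latex
y_i = \beta_{0k} + \beta_{1k} x_i + \varepsilon_i
\quad \text{for } \gamma_{k-1} < x_i \le \gamma_k,\; k = 1,\dots,K, \qquad
\varepsilon_i \sim \mathcal{N}(0, \sigma^2),
```

with ordered breakpoints \gamma_0 < \gamma_1 < \dots < \gamma_K and, in general, K itself unknown.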
Security protection in most operating systems relies mainly on signature-based or rule-based methods, so most existing anomaly detection approaches have low accuracy. To address this, a Bayesian model is used to model peer groups of similar users, combined with temporal effects and a hierarchical principle, to provide higher-accuracy data for User and Entity Behavior Analytics (UEBA) research. The user behavior data actually recorded are then compared with the data inferred by the Bayesian hierarchical graphical model in order to lower the model's false alarm rate. The method consists of two stages: in the first stage, a data-driven approach forms clusters of user behavior and defines each user's personal authentication pattern; in the second stage, periodic effects and the hierarchical principle are considered jointly and modeled with a Poisson distribution. The study shows that the data-driven clustering approach achieves better results in reducing false alarms, lightens the burden of network security management, and further reduces the number of false positives.
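A minimal sketch of Poisson-based anomaly scoring with a conjugate gamma prior is shown below (illustrative only; the event counts and the alert threshold are hypothetical, and this is not the paper's hierarchical graph model):

```python
import numpy as np
from scipy.stats import nbinom

# Hypothetical daily authentication counts for one user (or peer group).
history = np.array([3, 5, 2, 4, 6, 3, 4, 5, 2, 3])
observed_today = 17

# Conjugate Gamma(alpha, beta) prior on the Poisson rate; the posterior is
# Gamma(alpha + sum(history), beta + len(history)).
alpha0, beta0 = 1.0, 1.0
alpha_post = alpha0 + history.sum()
beta_post = beta0 + len(history)

# The posterior predictive of a new count is negative binomial:
# NB(n = alpha_post, p = beta_post / (beta_post + 1)).
p_tail = nbinom.sf(observed_today - 1, alpha_post, beta_post / (beta_post + 1.0))
print(f"P(count >= {observed_today}) = {p_tail:.4f}")
if p_tail < 0.01:  # hypothetical alert threshold
    print("flag as anomalous")
```

In a peer-group setting, the prior would itself be informed by the behavior of similar users, which is the hierarchical element the abstract describes.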
As one of the top four commercially important species in China, the yellow croaker (Larimichthys polyactis), with two geographic subpopulations, has undergone profound changes during the last several decades. It is widely recognized that understanding its population dynamics is critically important for the sustainable management of this valuable fishery in China. The only two existing population dynamics models assessed the yellow croaker population using short time-series data, without considering geographical variation. In this study, Bayesian models with and without a hierarchical subpopulation structure were developed to explore the spatial heterogeneity of the population dynamics of yellow croaker from 1968 to 2015. Alternative hypotheses were constructed to test potential temporal patterns in the population dynamics. Substantial variation in population dynamics characteristics across space and time was found. The population growth rate was found to have increased since the late 1980s, and the catchability more than doubled from 1981 to 2015. The East China Sea subpopulation grows faster but suffers higher fishing pressure than that in the Bohai Sea and Yellow Sea. According to the MSY-based reference points, the global population and both subpopulations have had high risks of overfishing and of being overfished in recent years. More conservative management strategies that account for subpopulation structure are imperative for the management of the yellow croaker fishery in China. The methodology developed in this study could also be applied to the stock assessment and fishery management of other species, especially those with large spatial heterogeneity in their data.
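A surplus-production formulation of the kind typically embedded in such Bayesian assessments is sketched below (a generic Schaefer-type sketch; the paper's exact state-space model and priors may differ):

```latex
B_{t+1} = \Big[ B_t + r B_t \Big(1 - \tfrac{B_t}{K}\Big) - C_t \Big] e^{\eta_t}, \qquad
I_t = q B_t e^{\varepsilon_t},
```

where B_t is biomass, C_t catch, r the intrinsic growth rate, K the carrying capacity, q the catchability linking biomass to the abundance index I_t, and \eta_t, \varepsilon_t process and observation errors. MSY-based reference points then follow as MSY = rK/4 and B_{MSY} = K/2, against which the overfishing and overfished risks in the abstract are judged.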
Indirect approaches to the estimation of biomass factors are often applied to measure carbon flux in the forestry sector. An assumption underlying a country-level carbon stock estimate is the representativeness of these factors. Although intensive studies have been conducted to quantify biomass factors, each study typically covers a limited geographic area. The goal of this study was to employ a meta-analysis approach to develop regional biomass factors for Quercus mongolica forests in South Korea. The biomass factors of interest were the biomass conversion and expansion factor (BCEF), the biomass expansion factor (BEF) and the root-to-shoot ratio (RSR). Our objectives were to select the probability density functions (PDFs) that best fitted the three biomass factors and to quantify their means and uncertainties. A total of 12 scientific publications were selected as data sources based on a set of criteria. From these publications we chose 52 study sites spread across South Korea. The statistical model for the meta-analysis was a multilevel model with publication (data source) as the nesting factor, specified under the Bayesian framework. Gamma, log-normal and Weibull PDFs were evaluated. The log-normal PDF yielded the best quantitative and qualitative fit for the three biomass factors, although a poor fit to the long right tail of the observed BEF and RSR distributions was apparent. The median posterior estimates of the means and the 95% credible intervals for BCEF, BEF and RSR across all 12 publications were 1.016 (0.800-1.299), 1.414 (1.304-1.560) and 0.260 (0.200-0.335), respectively. The log-normal PDF proved useful for estimating the carbon stock of Q. mongolica forests on a regional scale and for uncertainty analysis based on Monte Carlo simulation.
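One common way such factors enter a stand-level carbon stock calculation is sketched below. The lognormal parameters, stand volume and carbon fraction are placeholders (only the medians loosely match the reported posterior medians); in practice the fitted posterior distributions themselves would be sampled:

```python
import numpy as np

rng = np.random.default_rng(42)
n_sim = 100_000

# Rough lognormal placeholders loosely centred on the reported posterior medians
# (illustrative only; draw from the fitted posteriors in a real analysis).
bcef = rng.lognormal(mean=np.log(1.016), sigma=0.12, size=n_sim)  # t/m^3
rsr = rng.lognormal(mean=np.log(0.260), sigma=0.13, size=n_sim)

growing_stock = 150.0    # hypothetical stand volume, m^3/ha
carbon_fraction = 0.47   # commonly used default carbon fraction of dry biomass

# Above-ground biomass = volume * BCEF; total biomass = AGB * (1 + RSR).
carbon = growing_stock * bcef * (1.0 + rsr) * carbon_fraction  # tC/ha
lo, med, hi = np.percentile(carbon, [2.5, 50, 97.5])
print(f"carbon stock ~ {med:.1f} tC/ha (95% interval {lo:.1f}-{hi:.1f})")
```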
In genetic association studies of complex diseases, endo-phenotypes such as expression profiles, epigenetic data, or clinical intermediate phenotypes provide insight into the underlying biological path of the disease. In such situations, in order to establish the path from the gene to the disease, we have to decide whether the gene acts on the disease phenotype primarily through a specific endo-phenotype or whether the gene influences the disease through an unidentified path characterized by different intermediate phenotypes. Here, we address the question of whether a genetic locus, given its effect on an endo-phenotype, influences the trait of interest primarily through the path of the endo-phenotype. We propose a Bayesian approach that can evaluate the genetic association between the genetic locus and the phenotype of interest in the presence of the genetic effect on the endo-phenotype. Using simulation studies, we verify that our approach has the desired properties and compare it with a mediation approach. The proposed Bayesian approach is illustrated by an application to a genome-wide association study of childhood asthma (CAMP) that contains expression profiles.
This paper presents a hierarchical Bayesian approach to the estimation of component reliability (survival) using a Weibull model for each component. The proposed method can be used for estimation with general survival censored data, because the estimation of a component's reliability in a series (parallel) system is equivalent to the estimation of its survival function with right- (left-) censored data. Besides the Weibull parametric model for the reliability data, independent gamma distributions are considered at the first hierarchical level for the Weibull parameters, and independent uniform distributions over the real line are used as priors for the parameters of the gammas. In order to evaluate the model, an example and a simulation study are discussed.
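In outline, the first two levels of such a hierarchy can be written as follows (the parameterization is illustrative and may differ from the paper's):

```latex
f(t \mid k, \lambda) = \frac{k}{\lambda}\Big(\frac{t}{\lambda}\Big)^{k-1} e^{-(t/\lambda)^k}, \qquad
S(t \mid k, \lambda) = e^{-(t/\lambda)^k}, \qquad
k \sim \mathrm{Gamma}(a_1, b_1), \quad \lambda \sim \mathrm{Gamma}(a_2, b_2),
```

where a right-censored observation contributes S(t) to the likelihood and a left-censored one contributes 1 - S(t), and the gamma hyperparameters themselves receive the diffuse priors described above.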