This paper focuses on a method to solve structural optimization problems using particle swarm optimization (PSO), surrogate models and Bayesian statistics. PSO is a stochastic search algorithm designed to find the global optimum. However, PSO requires many more function evaluations than gradient-based optimization, which increases the analysis cost of structural optimization. One way to reduce computing costs in stochastic optimization is to use approximation techniques. In this work, surrogate models are used, including the response surface method (RSM) and Kriging. When surrogate models are used, there are errors between the exact and approximated values. These errors decrease the reliability of the optimum values and undermine the practical benefit of using surrogate models. In this paper, Bayesian statistics is used to obtain more reliable results. To verify the efficiency of the proposed method for stochastic structural optimization, two numerical examples are optimized, and the optimization of a hub sleeve is demonstrated as a practical problem.
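As a concrete reference for the search mechanics, the following is a minimal sketch of a standard global-best PSO loop in Python; it is a generic illustration, not the paper's surrogate-assisted variant, and the Rosenbrock test function is only a stand-in for an expensive structural analysis.

```python
import numpy as np

def pso(f, bounds, n_particles=30, iters=200, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimise f over a box using standard global-best PSO."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    dim = lo.size
    x = rng.uniform(lo, hi, (n_particles, dim))           # particle positions
    v = np.zeros_like(x)                                  # particle velocities
    pbest, pbest_f = x.copy(), np.apply_along_axis(f, 1, x)
    g = pbest[pbest_f.argmin()].copy()                    # global best position
    for _ in range(iters):
        r1, r2 = rng.random((2, n_particles, dim))
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (g - x)
        x = np.clip(x + v, lo, hi)
        fx = np.apply_along_axis(f, 1, x)                 # the expensive evaluations
        better = fx < pbest_f
        pbest[better], pbest_f[better] = x[better], fx[better]
        g = pbest[pbest_f.argmin()].copy()
    return g, pbest_f.min()

# usage: minimise the Rosenbrock function on [-2, 2]^2
rosen = lambda z: (1 - z[0])**2 + 100 * (z[1] - z[0]**2)**2
x_best, f_best = pso(rosen, (np.full(2, -2.0), np.full(2, 2.0)))
```

The many calls to `f` inside the loop are exactly where a surrogate such as RSM or Kriging would be substituted to cut analysis cost.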
Accurate nitrogen (N) nutrition diagnosis is essential for improving N use efficiency in crop production. The widely used critical N (Nc) dilution curve traditionally depends solely on agronomic variables, neglecting crop water status. Using three-year field experiments with winter wheat, encompassing two irrigation levels (rainfed, and irrigation at jointing and anthesis) and three N levels (0, 180, and 270 kg ha^-1), this study aims to establish a novel approach for determining the Nc dilution curve based on crop cumulative transpiration (T), providing a comprehensive analysis of the interaction between N and water availability. The Nc curves derived from both crop dry matter (DM) and T demonstrated N concentration dilution under different conditions with different parameters. The equation Nc = 6.43T^(-0.24) established a consistent relationship across varying irrigation regimes. Independent test results indicated that the nitrogen nutrition index (NNI), calculated from this curve, effectively identifies and quantifies the two sources of N deficiency: insufficient N supply in the soil, and insufficient soil water content leading to decreased N availability for root absorption. Additionally, the NNI calculated from the Nc-DM and Nc-T curves exhibited a strong negative correlation with the accumulated N deficit (Nand) and a positive correlation with relative grain yield (RGY). The NNI derived from the Nc-T curve outperformed the NNI derived from the Nc-DM curve in its relationship with Nand and RGY, as indicated by larger R^2 values and smaller AIC. The novel Nc curve based on T serves as an effective diagnostic tool for assessing winter wheat N status, predicting grain yield, and optimizing N fertilizer management across varying irrigation conditions. These findings provide new insights and methods to improve the simulation of water-N interactions in crop growth models.
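To make the diagnostic concrete, here is a small Python sketch of how an NNI would be computed from the reported curve; the exponent is read as negative (a dilution curve), and the transpiration and N-concentration inputs are made-up illustrative values, not data from the study.

```python
# Critical N curve from the abstract, read as a dilution power law:
# Nc(%) = 6.43 * T^(-0.24), with T the cumulative transpiration.
a, b = 6.43, 0.24

def nni(measured_n_percent, T):
    """Nitrogen nutrition index: measured N concentration over the critical one.
    NNI < 1 indicates N deficiency; NNI > 1 indicates luxury uptake."""
    nc = a * T ** (-b)
    return measured_n_percent / nc

# hypothetical sample: 120 mm cumulative transpiration, 1.9% shoot N
print(round(nni(1.9, 120.0), 3))
```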
This paper presents one of many possible applications of Bayesian inference in the predictive context of planned tests. We are particularly interested in the use of the predictive Bayesian approach in clinical trials whose objective is to establish evidence of an effect of interest. We propose a procedure based on the notion of a satisfaction index, which is a function of the p-value, and we look forward, given the available data, to forecast future satisfaction as the predictive Bayesian expectation of this index conditional on previous observations. To illustrate the proposed procedure, several models are studied, choosing prior distributions justified by the objectivity or neutrality that underlies the analysis of experimental data.
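The following Python sketch instantiates that idea in the simplest possible setting: a normal mean with known variance, a flat prior, and the crude satisfaction index phi(p) = 1{p < 0.05}. The interim data, sample sizes, and index choice are all hypothetical, chosen only to show how the predictive expectation is computed.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Interim data: n1 observations, testing H0: mu = 0, sigma known = 1.
x1 = rng.normal(0.4, 1.0, size=20)
n1, n2, sigma = x1.size, 30, 1.0

# Flat prior on mu -> posterior mu | x1 ~ N(mean(x1), sigma^2 / n1).
post_mu = stats.norm(x1.mean(), sigma / np.sqrt(n1))

def satisfaction(p, alpha=0.05):
    # simplest possible index: 1 if the final test is significant
    return float(p < alpha)

sims = []
for _ in range(5000):
    mu = post_mu.rvs()                           # draw a plausible true effect
    x2 = rng.normal(mu, sigma, size=n2)          # simulate the future data
    xbar = (x1.sum() + x2.sum()) / (n1 + n2)
    z = xbar * np.sqrt(n1 + n2) / sigma          # final z statistic
    p = 2 * stats.norm.sf(abs(z))                # two-sided p-value
    sims.append(satisfaction(p))

print("predicted satisfaction:", np.mean(sims))  # predictive Bayesian expectation
```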
Change point detection is becoming increasingly important because it can support data analysis by providing labels to the data in an unsupervised manner. In the context of process data analytics, change points in the time series of process variables may be important indicators of the process operation. For example, in a batch process, the change points can correspond to the operations and phases defined by the batch recipe; hence identifying change points can assist in labelling the time series data. Various unsupervised algorithms have been developed for change point detection, including the optimisation approach, which minimises a cost function with certain penalties to search for the change points. The Bayesian approach is another, which uses Bayesian statistics to calculate the posterior probability of a specific sample being a change point. This paper investigates how the two approaches for change point detection can be applied to process data analytics. In addition, a new type of cost function using Tikhonov regularisation is proposed for the optimisation approach to reduce irrelevant change points caused by randomness in the data. The novelty lies in using regularisation-based cost functions to handle ill-posed problems with noisy data. The results demonstrate that change point detection is useful for process data analytics because change points can produce data segments corresponding to different operating modes or varying conditions, which will be useful for other machine learning tasks.
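For orientation, the penalised-cost (optimisation) approach can be exercised with the off-the-shelf ruptures package, as in the sketch below; this uses the standard l2 cost with a PELT search on synthetic data, not the Tikhonov-regularised cost proposed in the paper.

```python
import numpy as np
import ruptures as rpt   # off-the-shelf penalised-cost change point detector

rng = np.random.default_rng(0)
# synthetic process variable: three operating modes plus measurement noise
signal = np.concatenate([rng.normal(m, 0.5, 200) for m in (0.0, 3.0, 1.5)])

# PELT minimises the sum of per-segment l2 costs plus a per-change-point
# penalty; raising `pen` suppresses spurious change points from noise, which
# is the role the paper assigns to its Tikhonov regularisation term.
algo = rpt.Pelt(model="l2", min_size=20).fit(signal)
change_points = algo.predict(pen=np.log(signal.size) * signal.var())
print(change_points)   # segment end indices, e.g. [200, 400, 600]
```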
The high-resolution nonlinear simultaneous inversion of petrophysical parameters is based on Bayesian statistics and combines petrophysics with geostatistical a priori information. We used the fast Fourier transform-moving average (FFT-MA) and gradual deformation method (GDM) to obtain a reasonable variogram by using structural analysis and geostatistical a priori information of the petrophysical parameters. Subsequently, we constructed the likelihood function according to the statistical petrophysical model. Finally, we used the Metropolis algorithm to sample the posterior probability density and complete the inversion of the petrophysical parameters. We used the proposed method to process data from an oil field in China and found a good match between the inversion results and real data at high resolution. In addition, the direct inversion of petrophysical parameters avoids error accumulation, decreases uncertainty, and increases computational efficiency.
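The Metropolis step at the heart of such a workflow is compact; below is a minimal random-walk Metropolis sampler in Python with a toy Gaussian prior-times-likelihood target, standing in for the paper's petrophysical posterior.

```python
import numpy as np

def metropolis(log_post, x0, n_samples=10000, step=0.1, seed=0):
    """Random-walk Metropolis: sample a posterior known up to a constant."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    lp = log_post(x)
    chain = np.empty((n_samples, x.size))
    for i in range(n_samples):
        prop = x + step * rng.standard_normal(x.size)     # symmetric proposal
        lp_prop = log_post(prop)
        if np.log(rng.random()) < lp_prop - lp:           # accept/reject
            x, lp = prop, lp_prop
        chain[i] = x
    return chain

# toy target: Gaussian prior times Gaussian likelihood (log density)
log_post = lambda m: -0.5 * (m**2).sum() - 0.5 * ((m - 1.2)**2).sum() / 0.3**2
samples = metropolis(log_post, x0=np.zeros(2))
print(samples[2000:].mean(axis=0))   # discard burn-in, inspect posterior mean
```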
Conventional joint PP-PS inversion is based on approximations of the Zoeppritz equations and assumes a constant VP/VS; therefore, the inversion precision and stability cannot satisfy current exploration requirements. We propose a joint PP-PS inversion method based on the exact Zoeppritz equations that combines Bayesian statistics and generalized linear inversion. A forward model based on the exact Zoeppritz equations is built to minimize the error of the approximations in the large-angle data, the prior distribution of the model parameters is added as a regularization item to decrease the ill-posedness of the inversion, low-frequency constraints are introduced to stabilize the low-frequency data and improve robustness, and a fast algorithm is used to solve the objective function while minimizing the computational load. The proposed method has superior noise resistance and reproduces real data well.
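The "prior as regularization item" idea can be written down generically. Below is a hedged Python sketch of a Gauss-Newton update for a generalized linear inversion with a Gaussian prior penalty; the forward model is a toy placeholder, not the exact Zoeppritz equations, and the paper's fast solver is replaced by a plain dense solve.

```python
import numpy as np

def gli_update(forward, m, d_obs, m_prior, Cm_inv, n_iter=10, eps=1e-4):
    """Gauss-Newton iterations for a regularised generalised linear inversion:
    minimise ||d_obs - f(m)||^2 + (m - m_prior)^T Cm_inv (m - m_prior)."""
    m = m.astype(float).copy()
    for _ in range(n_iter):
        r = d_obs - forward(m)
        # numerical Jacobian of the forward model (exact Zoeppritz in the paper)
        J = np.empty((d_obs.size, m.size))
        for j in range(m.size):
            dm = np.zeros_like(m); dm[j] = eps
            J[:, j] = (forward(m + dm) - forward(m - dm)) / (2 * eps)
        lhs = J.T @ J + Cm_inv                 # data term + prior regularisation
        rhs = J.T @ r - Cm_inv @ (m - m_prior)
        m += np.linalg.solve(lhs, rhs)
    return m

# toy usage with a stand-in forward model
f = lambda m: np.array([m[0] * m[1], m[0] + m[1]])
m_hat = gli_update(f, np.array([0.5, 0.5]), np.array([0.6, 1.7]),
                   m_prior=np.array([1.0, 1.0]), Cm_inv=0.1 * np.eye(2))
```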
A key problem in seismic inversion is the identification of the reservoir fluids. Elastic parameters, such as seismic wave velocity and formation density, do not have sufficient sensitivity; thus, the conventional amplitude-versus-offset (AVO) method is not applicable. The frequency-dependent AVO (FDAVO) method considers the dependency of the seismic amplitude on frequency and uses this dependency to obtain information regarding the fluids in the reservoir fractures. We propose an improved Bayesian inversion method based on the parameterization of the Chapman model. The proposed method is based on 1) inelastic attribute inversion by the FDAVO method and 2) Bayesian statistics for fluid identification. First, we invert the inelastic fracture parameters by formulating an error function, which is used to match observations and model data. Second, we identify fluid types by using a Markov random field a priori model that incorporates data from various sources, such as prestack inversion and well logs. We consider the inelastic parameters to take advantage of the viscosity differences among the possible fluids. Finally, we use the maximum a posteriori probability to obtain the best lithology/fluid identification results.
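As a schematic of MAP labelling under a Markov random field prior, the Python sketch below runs iterated conditional modes (ICM) on a synthetic 1-D attribute trace with two fluid classes; the class means, coupling strength, and data are all invented for illustration, and ICM is only one simple way to approximate the MAP solution.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# hypothetical inverted attribute trace along depth; two fluid classes
attr = np.concatenate([rng.normal(0.0, 0.3, 60), rng.normal(1.0, 0.3, 60)])
means, sds, beta = [0.0, 1.0], [0.3, 0.3], 1.5   # class models + Potts coupling

# log-likelihood of each depth sample under each fluid class
loglik = np.stack([stats.norm.logpdf(attr, m, s) for m, s in zip(means, sds)], 1)

labels = loglik.argmax(1)                 # start from the pointwise ML labels
for _ in range(10):                       # ICM: coordinate-wise MAP updates
    for i in range(len(attr)):
        prior = np.zeros(2)
        for j in (i - 1, i + 1):          # 1-D MRF neighbours along depth
            if 0 <= j < len(attr):
                prior += beta * (np.arange(2) == labels[j])
        labels[i] = (loglik[i] + prior).argmax()
```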
Many black box functions and datasets have regions of different variability. Models such as the Gaussian process may fall short in giving the best representation of these complex functions. One successful approach for modeling this type of nonstationarity is the Treed Gaussian process [1], which extended the Gaussian process by dividing the input space into different regions using a binary tree algorithm. Each region became its own Gaussian process. This iterative inference process formed many different trees and thus many different Gaussian processes. In the end these were combined to get a posterior predictive distribution at each point. The idea was that when the iterations were combined, smoothing would take place for the surface of the predicted points near tree boundaries. We introduce the Improved Treed Gaussian process, which divides the input space into a single main binary tree where the different tree regions have different variability. The parameters for the Gaussian process for each tree region are then determined. These parameters are then smoothed at the region boundaries. This smoothing leads to a set of parameters for each point in the input space that specify the covariance matrix used to predict the point. The advantage is that the prediction and actual errors are estimated better, since the standard deviation and range parameters of each point are related to the variation of the region it is in. Further, smoothing between regions is better since each point prediction uses its parameters over the whole input space. Examples are given in this paper which show these advantages for lower-dimensional problems.
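To ground what "parameters that specify the covariance matrix used to predict the point" means, here is a minimal stationary GP predictor in Python with a squared-exponential kernel; in the Improved Treed GP the standard deviation and range (`sd` and `length` below) would vary smoothly from region to region rather than being global constants.

```python
import numpy as np

def gp_predict(X, y, X_new, sd=1.0, length=0.3, noise=1e-6):
    """Gaussian-process prediction with a squared-exponential kernel."""
    k = lambda A, B: sd**2 * np.exp(
        -0.5 * ((A[:, None, :] - B[None, :, :])**2).sum(-1) / length**2)
    K = k(X, X) + noise * np.eye(len(X))          # training covariance
    Ks = k(X_new, X)                              # cross covariance
    mean = Ks @ np.linalg.solve(K, y)
    var = sd**2 - np.einsum('ij,ji->i', Ks, np.linalg.solve(K, Ks.T))
    return mean, var

# toy 1-D example
X = np.linspace(0, 1, 10)[:, None]
y = np.sin(6 * X[:, 0])
mu, var = gp_predict(X, y, np.array([[0.35], [0.85]]))
```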
The black box functions found in computer experiments often result in multimodal optimization problems. Optimization that focuses on a single best optimum may not achieve the goal of getting the best answer for the purposes of the experiment. This paper builds upon an algorithm introduced in [1] that is successful at finding multiple optima within the input space of the objective function. Here we introduce an alternative algorithm for finding these optima that makes use of clustering. The cluster search algorithm has several advantages over the earlier algorithm. It gives a forward view of the optima that are present in the input space, so the user has a preview of what to expect as the optimization process continues. It starts pattern search, in many instances, closer to the minimum's location in input space, saving on simulator point computations. At termination, this algorithm does not need additional verification that a minimum is a duplicate of a previously found minimum, which also saves on simulator point computations. Finally, it finds minima that can be "hidden" by close larger minima.
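A generic cluster-search skeleton looks like the Python sketch below: probe the space, cluster the low-valued samples to preview the basins, then run a local derivative-free search from each cluster centre. Everything here (the test function, sample sizes, cluster count, and the use of Nelder-Mead in place of pattern search) is an illustrative assumption, not the paper's algorithm.

```python
import numpy as np
from scipy.optimize import minimize
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
f = lambda z: np.sin(3 * z[0]) * np.cos(3 * z[1]) + 0.1 * (z**2).sum()  # multimodal

# 1. probe the input space and keep the best fraction of the samples
X = rng.uniform(-2, 2, size=(400, 2))
fx = np.apply_along_axis(f, 1, X)
low = X[fx < np.quantile(fx, 0.15)]

# 2. cluster the low-value points: each cluster previews one basin/optimum
centers = KMeans(n_clusters=4, n_init=10, random_state=0).fit(low).cluster_centers_

# 3. start a local derivative-free search from each cluster centre
minima = [minimize(f, c, method="Nelder-Mead").x for c in centers]
print(np.round(minima, 3))
```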
We introduce a framework for statistical inference of the closure coefficients using machine learning methods. The objective of this framework is to quantify the epistemic uncertainty associated with the closure model by using experimental data via Bayesian statistics. The framework is tailored towards cases for which a limited amount of experimental data is available. It consists of two components. First, by treating all latent variables (non-observed variables) in the model as stochastic variables, all sources of uncertainty of the probabilistic closure model are quantified by a fully Bayesian approach. The probabilistic model is defined to consist of the closure coefficients as parameters and other parameters incorporating noise. Then, the uncertainty associated with the closure coefficients is extracted from the overall uncertainty by considering the noise to be zero. The overall uncertainty is rigorously evaluated by using Markov chain Monte Carlo sampling assisted by surrogate models. We apply the framework to the Spalart-Allmaras one-equation turbulence model. Two test cases are considered, including an industrially relevant full aircraft model at transonic flow conditions, the Airbus XRF1. We demonstrate that epistemic uncertainties in the closure coefficients result in uncertainties in flow quantities of interest which are prominent around, and downstream of, the shock occurring over the XRF1 wing. This data-driven approach could help to enhance the predictive capabilities of CFD in terms of reliable turbulence modeling at the extremes of the flight envelope if measured data are available, which is important in the context of robust design and virtual aircraft certification. The plentiful information about the uncertainties could also assist in estimating the influence of the measured data on the inferred model coefficients. Finally, the developed framework is flexible and can be applied to different test cases and to various turbulence models.
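The surrogate-assisted MCMC component can be sketched with the emcee ensemble sampler, as below; the quadratic "surrogate", the single closure coefficient, the measurement value, and the noise parameterization are all stand-in assumptions used only to show the structure (coefficient plus latent noise term, with the coefficient posterior extracted afterwards).

```python
import numpy as np
import emcee  # affine-invariant ensemble MCMC

# Stand-in "surrogate": quadratic response of a flow quantity to one closure
# coefficient kappa (in practice fitted to expensive CFD runs).
surrogate = lambda kappa: 1.0 + 0.8 * kappa - 0.3 * kappa**2
data, data_sigma = 1.35, 0.05          # hypothetical measurement and its noise

def log_prob(theta):
    kappa, log_noise = theta
    if not (0.0 < kappa < 2.0):        # flat prior on the coefficient
        return -np.inf
    s2 = data_sigma**2 + np.exp(log_noise)**2   # measurement + latent model noise
    return -0.5 * ((data - surrogate(kappa))**2 / s2 + np.log(2 * np.pi * s2))

ndim, nwalkers = 2, 16
p0 = np.column_stack([np.random.uniform(0.2, 1.8, nwalkers),
                      np.random.uniform(-5, -2, nwalkers)])
sampler = emcee.EnsembleSampler(nwalkers, ndim, log_prob)
sampler.run_mcmc(p0, 2000)
kappa_post = sampler.get_chain(discard=500, flat=True)[:, 0]
print(kappa_post.mean(), kappa_post.std())   # closure-coefficient uncertainty
```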
The calculation of the Bayesian confidence upper limit for a Poisson variable including both signal and background, with and without systematic uncertainties, has been formulated. A Fortran 77 routine, BPULE, has been developed to implement the calculation. The routine can account for systematic uncertainties in the background expectation and signal efficiency. The systematic uncertainties may be separately parameterized by a Gaussian, Log-Gaussian or flat probability density function (pdf). Some technical details of BPULE are discussed.
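The no-systematics core of such a calculation fits in a few lines; the Python sketch below computes a flat-prior Bayesian upper limit on the Poisson signal by normalising the likelihood over s >= 0 and inverting the posterior CDF. This mirrors the textbook construction, not BPULE's actual Fortran internals; adding systematics would amount to smearing the likelihood over the nuisance pdfs.

```python
import numpy as np
from scipy import integrate, optimize

def bayes_upper_limit(n_obs, b, cl=0.90):
    """Bayesian upper limit on a Poisson signal s with known background b,
    flat prior on s >= 0 (the no-systematics case)."""
    # Poisson likelihood; the n! factor cancels in the posterior ratio
    like = lambda s: np.exp(-(s + b)) * (s + b)**n_obs
    norm, _ = integrate.quad(like, 0, np.inf)
    cdf = lambda s_up: integrate.quad(like, 0, s_up)[0] / norm - cl
    return optimize.brentq(cdf, 0, 10 * (n_obs + 10))

print(round(bayes_upper_limit(n_obs=3, b=1.2), 3))   # 90% CL upper limit
```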
Background: The prevalence of schistosomiasis japonica has decreased significantly, and the response has shifted from control to elimination in Jiangsu Province, P.R. China. How to estimate the change in the prevalence of schistosomiasis using only serological data is therefore important and useful. Methods: We collected serum samples from 2011 to 2015 to build a serum bank from Dantu County of Jiangsu, China. Serum samples were tested by enzyme-linked immunosorbent assay (ELISA), and the positive rate and optical density (OD) values were obtained. A Bayesian model including prior information on the sensitivity and specificity of ELISA was established, and estimated infection rates were obtained for different years, genders and age groups. Results: There was no significant difference in the mean OD between years or genders, but there was a significant difference between age groups. There were statistically significant differences in the positive rate between years and age groups, but no significant difference between genders. The estimated infection rates for the five years were 1.288, 1.456, 1.032, 1.485 and 1.358%, respectively, with no significant difference between years or genders, but a significant difference between age groups. Conclusions: The risk of schistosomiasis transmission in this area still exists, and risk monitoring of schistosomiasis should be strengthened.
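The core of such a model is correcting the apparent (test-positive) rate for imperfect test accuracy; the Python sketch below does this with a grid posterior for the true infection rate, averaging over Beta priors on ELISA sensitivity and specificity. The counts and the Beta parameters are hypothetical, chosen only to match the order of magnitude of the reported rates.

```python
import numpy as np
from scipy import stats

k, n = 38, 2800              # hypothetical ELISA positives out of samples tested
pi = np.linspace(0, 0.05, 2001)           # grid for the true infection rate

rng = np.random.default_rng(0)
# prior knowledge of the assay, e.g. Beta distributions fitted to validation data
se = rng.beta(900, 100, 4000)             # sensitivity ~ 0.90
sp = rng.beta(995, 5, 4000)               # specificity ~ 0.995

post = np.zeros_like(pi)
for s, c in zip(se, sp):
    p_apparent = pi * s + (1 - pi) * (1 - c)   # Rogan-Gladen relation
    post += stats.binom.pmf(k, n, p_apparent)  # average over Se/Sp uncertainty
post /= post.sum() * (pi[1] - pi[0])           # normalise the grid posterior

print("posterior mean infection rate: %.3f%%" % (100 * np.trapz(post * pi, pi)))
```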
This paper deals with the statistical modeling of latent topic hierarchies in text corpora. The height of the topic tree is assumed to be fixed, while the number of topics on each level is unknown a priori and must be inferred from the data. Taking a nonparametric Bayesian approach to this problem, we propose a new probabilistic generative model based on the nested hierarchical Dirichlet process (nHDP) and present a Markov chain Monte Carlo sampling algorithm for the inference of the topic tree structure as well as the word distribution of each topic and the topic distribution of each document. Our theoretical analysis and experimental results show that this model produces a more compact hierarchical topic structure and captures more fine-grained topic relationships than the hierarchical latent Dirichlet allocation model.