This paper presents a method for solving structural optimization problems using particle swarm optimization (PSO), surrogate models, and Bayesian statistics. PSO is a stochastic search algorithm designed to find the global optimum. However, PSO requires many more function evaluations than gradient-based optimization, which increases the analysis cost of structural optimization. One way to reduce the computing cost of stochastic optimization is to use approximation techniques; in this work, surrogate models are used, including the response surface method (RSM) and Kriging. When surrogate models are used, errors arise between exact and approximated values. These errors reduce the reliability of the computed optima and undermine the practical value of the surrogate models. In this paper, Bayesian statistics is used to obtain more reliable results. To verify the efficiency of the proposed method, two numerical examples are optimized, and the optimization of a hub sleeve is demonstrated as a practical problem.
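As a reference point for the search strategy described above, a minimal global-best PSO can be sketched as follows. This is an illustrative sketch only, not the paper's implementation; the test function, swarm size, and inertia/acceleration constants are assumed values.

```python
import numpy as np

def pso(f, bounds, n_particles=30, iters=200, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimise f over a box with a basic global-best particle swarm."""
    rng = np.random.default_rng(seed)
    lo, hi = np.array(bounds, dtype=float).T
    dim = len(lo)
    x = rng.uniform(lo, hi, (n_particles, dim))        # particle positions
    v = np.zeros_like(x)                               # particle velocities
    pbest = x.copy()                                   # personal bests
    pbest_f = np.apply_along_axis(f, 1, x)
    g = pbest[np.argmin(pbest_f)]                      # global best
    for _ in range(iters):
        r1, r2 = rng.random((2, n_particles, dim))
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (g - x)
        x = np.clip(x + v, lo, hi)
        fx = np.apply_along_axis(f, 1, x)
        better = fx < pbest_f
        pbest[better], pbest_f[better] = x[better], fx[better]
        g = pbest[np.argmin(pbest_f)]
    return g, float(pbest_f.min())

# Toy objective: sphere function, global optimum at the origin
best_x, best_f = pso(lambda z: float(np.sum(z ** 2)), bounds=[(-5.0, 5.0)] * 3)
```

Each PSO iteration evaluates the objective once per particle, which is exactly why surrogate models become attractive when a single evaluation is an expensive structural analysis.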
Change point detection is becoming increasingly important because it can support data analysis by providing labels for the data in an unsupervised manner. In the context of process data analytics, change points in the time series of process variables can be important indicators of process operation. For example, in a batch process, change points can correspond to the operations and phases defined by the batch recipe, so identifying them can assist in labelling the time series data. Various unsupervised algorithms have been developed for change point detection, including the optimisation approach, which minimises a cost function with certain penalties to search for the change points. The Bayesian approach is another, which uses Bayesian statistics to calculate the posterior probability of a specific sample being a change point. This paper investigates how the two approaches can be applied to process data analytics. In addition, a new type of cost function using Tikhonov regularisation is proposed for the optimisation approach to reduce irrelevant change points caused by randomness in the data; the novelty lies in using regularisation-based cost functions to handle the ill-posed problem of noisy data. The results demonstrate that change point detection is useful for process data analytics, because change points produce data segments corresponding to different operating modes or varying conditions, which is useful for other machine learning tasks.
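The optimisation approach mentioned above, which minimises per-segment costs plus a penalty for each change point, can be sketched with a plain squared-error cost and the optimal-partitioning dynamic program. This is a generic illustration of the penalised-cost idea, not the paper's Tikhonov-regularised cost function; the data and penalty value are invented.

```python
import numpy as np

def detect_change_points(y, penalty):
    """Optimal partitioning: minimise the sum of per-segment squared-error
    costs plus a fixed penalty for every change point introduced."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    s1 = np.concatenate(([0.0], np.cumsum(y)))        # prefix sums
    s2 = np.concatenate(([0.0], np.cumsum(y ** 2)))   # prefix sums of squares

    def seg_cost(a, b):  # squared error of y[a:b] around its own mean
        return s2[b] - s2[a] - (s1[b] - s1[a]) ** 2 / (b - a)

    F = np.full(n + 1, np.inf)
    F[0] = -penalty                 # so the first segment carries no penalty
    prev = np.zeros(n + 1, dtype=int)
    for t in range(1, n + 1):
        cands = [F[s] + seg_cost(s, t) + penalty for s in range(t)]
        prev[t] = int(np.argmin(cands))
        F[t] = cands[prev[t]]
    cps, t = [], int(prev[n])       # backtrack the optimal segmentation
    while t > 0:
        cps.append(t)
        t = int(prev[t])
    return sorted(cps)

# Synthetic process variable with two step changes, at samples 60 and 120
rng = np.random.default_rng(0)
y = np.concatenate([np.zeros(60), 4.0 * np.ones(60), 1.0 * np.ones(60)])
y += 0.2 * rng.normal(size=180)
cps = detect_change_points(y, penalty=5.0)
```

A larger penalty suppresses spurious change points caused by noise, which is the same trade-off the proposed regularised cost function addresses in a more principled way.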
Many black box functions and datasets have regions of different variability. Models such as the Gaussian process may fall short of giving the best representation of these complex functions. One successful approach for modeling this type of nonstationarity is the Treed Gaussian process [1], which extended the Gaussian process by dividing the input space into different regions using a binary tree algorithm, with each region becoming its own Gaussian process. This iterative inference process formed many different trees and thus many different Gaussian processes, which were combined in the end to obtain a posterior predictive distribution at each point. The idea was that combining the iterations would smooth the surface of the predicted points near tree boundaries. We introduce the Improved Treed Gaussian process, which divides the input space into a single main binary tree whose regions have different variability. The Gaussian process parameters for each tree region are determined and then smoothed at the region boundaries. This smoothing yields a set of parameters for each point in the input space that specify the covariance matrix used to predict that point. The advantage is that the predicted and actual errors are estimated better, since the standard deviation and range parameters of each point reflect the variation of the region it lies in. Further, smoothing between regions improves because each point prediction uses its own parameters over the whole input space. Examples in this paper show these advantages for lower-dimensional problems.
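For reference, the stationary Gaussian process baseline that the treed models generalise uses a single length scale and signal variance over the whole input space; the treed approaches instead fit such parameters per region and smooth them across boundaries. A minimal 1-D sketch of the stationary baseline follows (the kernel choice and hyperparameter values are assumptions, not taken from the paper):

```python
import numpy as np

def gp_predict(X, y, Xs, length_scale=1.0, sigma_f=1.0, sigma_n=1e-3):
    """Stationary GP regression in 1-D with a squared-exponential kernel:
    one length scale and one signal variance for the whole input space."""
    def k(A, B):
        return sigma_f ** 2 * np.exp(
            -0.5 * (A[:, None] - B[None, :]) ** 2 / length_scale ** 2)
    K = k(X, X) + sigma_n ** 2 * np.eye(len(X))   # training covariance + noise
    Ks = k(Xs, X)                                  # test/train cross-covariance
    mean = Ks @ np.linalg.solve(K, y)              # posterior predictive mean
    var = sigma_f ** 2 - np.diag(Ks @ np.linalg.solve(K, Ks.T))
    return mean, var

X = np.linspace(0.0, 5.0, 25)
y = np.sin(X)
Xs = np.array([0.5, 2.5, 4.5])
mean, var = gp_predict(X, y, Xs)
```

Because `length_scale` and `sigma_f` are global here, a function whose variability differs between regions would force a compromise fit, which is precisely the failure mode the treed partitioning is designed to avoid.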
The black box functions found in computer experiments often result in multimodal optimization programs. Optimization that focuses on a single best optimum may not achieve the goal of getting the best answer for the purposes of the experiment. This paper builds upon an algorithm introduced in [1] that successfully finds multiple optima within the input space of the objective function. Here we introduce an alternative cluster search algorithm for finding these optima, making use of clustering. The cluster search algorithm has several advantages over the earlier algorithm. It gives a forward view of the optima present in the input space, so the user has a preview of what to expect as the optimization process continues. In many instances it starts pattern search closer to a minimum's location in input space, saving simulator point computations. At termination, the algorithm needs no additional verification that a minimum duplicates a previously found minimum, which also saves simulator point computations. Finally, it finds minima that can be "hidden" by close larger minima.
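The idea of using clustering to locate candidate basins before starting local pattern searches can be sketched as below. This is a generic illustration, not the paper's algorithm: the sampling budget, the kept fraction, and the use of plain k-means are all stand-in choices.

```python
import numpy as np

def cluster_minima_seeds(f, bounds, n_samples=400, keep_frac=0.1,
                         n_clusters=2, seed=0):
    """Sample the input space, keep the lowest-valued fraction of points,
    and cluster them (plain k-means) so each cluster centre can seed a
    local pattern search in its own basin."""
    rng = np.random.default_rng(seed)
    lo, hi = np.array(bounds, dtype=float).T
    X = rng.uniform(lo, hi, (n_samples, len(lo)))
    fx = np.apply_along_axis(f, 1, X)
    keep = X[np.argsort(fx)[: int(keep_frac * n_samples)]]
    # tiny k-means (Lloyd's algorithm) on the retained low-value points
    centers = keep[rng.choice(len(keep), n_clusters, replace=False)]
    for _ in range(50):
        d2 = ((keep[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = np.argmin(d2, axis=1)
        centers = np.array([keep[labels == j].mean(0)
                            if np.any(labels == j) else centers[j]
                            for j in range(n_clusters)])
    return centers

# Bimodal toy objective with basins at x = -2 and x = +2
f = lambda x: float(min((x[0] - 2.0) ** 2, (x[0] + 2.0) ** 2))
seeds = np.sort(cluster_minima_seeds(f, bounds=[(-5.0, 5.0)]).ravel())
```

The cluster centres give the "forward view" of the optima the abstract refers to: before any local search runs, the user already sees roughly how many minima exist and where they sit.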
We introduce a framework for statistical inference of closure coefficients using machine learning methods. The objective is to quantify the epistemic uncertainty associated with the closure model by using experimental data via Bayesian statistics. The framework is tailored to cases for which only a limited amount of experimental data is available, and it consists of two components. First, by treating all latent (non-observed) variables in the model as stochastic variables, all sources of uncertainty in the probabilistic closure model are quantified with a fully Bayesian approach; the probabilistic model consists of the closure coefficients as parameters, together with other parameters incorporating noise. Then, the uncertainty associated with the closure coefficients is extracted from the overall uncertainty by setting the noise to zero. The overall uncertainty is rigorously evaluated by Markov chain Monte Carlo sampling assisted by surrogate models. We apply the framework to the Spalart-Allmaras one-equation turbulence model. Two test cases are considered, including an industrially relevant full aircraft model at transonic flow conditions, the Airbus XRF1. We demonstrate that epistemic uncertainties in the closure coefficients result in uncertainties in flow quantities of interest that are prominent around, and downstream of, the shock occurring over the XRF1 wing. This data-driven approach could help enhance the predictive capabilities of CFD in terms of reliable turbulence modeling at the extremes of the flight envelope when measured data are available, which is important in the context of robust design and virtual aircraft certification. The wealth of information about the uncertainties could also assist in estimating the influence of the measured data on the inferred model coefficients. Finally, the developed framework is flexible and can be applied to different test cases and to various turbulence models.
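The Bayesian machinery at the core of such a framework can be illustrated with a toy random-walk Metropolis sampler inferring a single coefficient through a cheap "surrogate". Everything below is an invented stand-in, including the linear surrogate g(c, u) = c·u, the noise level, and the true value 0.41; it is not the Spalart-Allmaras calibration itself.

```python
import numpy as np

def metropolis(log_post, x0, n_steps=20000, step=0.02, seed=0):
    """Random-walk Metropolis sampler for a scalar log posterior density."""
    rng = np.random.default_rng(seed)
    x, lp = float(x0), log_post(x0)
    chain = np.empty(n_steps)
    for i in range(n_steps):
        prop = x + step * rng.normal()          # symmetric proposal
        lp_prop = log_post(prop)
        if np.log(rng.random()) < lp_prop - lp:  # accept/reject
            x, lp = prop, lp_prop
        chain[i] = x
    return chain

# Toy "experimental data": noisy observations of the surrogate c * u
rng = np.random.default_rng(1)
u = np.linspace(1.0, 2.0, 20)
sigma = 0.05
data = 0.41 * u + sigma * rng.normal(size=20)    # true coefficient 0.41
# Gaussian likelihood with a flat prior on the coefficient c
log_post = lambda c: -0.5 * float(np.sum((data - c * u) ** 2)) / sigma ** 2
chain = metropolis(log_post, x0=1.0)
c_mean = float(chain[10000:].mean())             # discard burn-in
```

In the actual framework the expensive CFD solve inside the likelihood is what the surrogate models replace, making this kind of long MCMC chain affordable.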
The calculation of the Bayesian confidence upper limit for a Poisson variable, including both signal and background, with and without systematic uncertainties, has been formulated. A Fortran 77 routine, BPULE, has been developed to implement the calculation. The routine can account for systematic uncertainties in the background expectation and signal efficiency; these uncertainties may be separately parameterized by a Gaussian, log-Gaussian, or flat probability density function (pdf). Some technical details of BPULE are discussed.
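The quantity BPULE computes can be illustrated numerically in the simplest setting: a flat prior on the signal mean s ≥ 0, a known background b, and no systematic uncertainties. This grid integration is a sketch of the underlying definition, not the BPULE routine itself.

```python
import numpy as np

def bayes_upper_limit(n_obs, b, cl=0.90, s_max=50.0, grid=20001):
    """Bayesian upper limit at credibility `cl` on the Poisson signal mean s,
    given n_obs observed events, known background b, and a flat prior s >= 0."""
    s = np.linspace(1e-12, s_max, grid)
    # Poisson log-likelihood for n_obs given mean s + b (n! dropped, constant in s)
    log_like = n_obs * np.log(s + b) - (s + b)
    post = np.exp(log_like - log_like.max())     # flat prior: posterior ∝ likelihood
    cdf = np.cumsum(post)
    cdf /= cdf[-1]                               # normalise the posterior CDF
    return float(s[np.searchsorted(cdf, cl)])    # smallest s with CDF >= cl

# n = 0 observed, no background: the 90% CL limit is -ln(0.1) ≈ 2.30
ul0 = bayes_upper_limit(0, 0.0)
# n = 3 observed, no background: the well-known 90% CL value ≈ 6.68
ul3 = bayes_upper_limit(3, 0.0)
```

Adding systematic uncertainties, as BPULE does, amounts to integrating this posterior over the assumed pdf (Gaussian, log-Gaussian, or flat) of the background expectation and signal efficiency.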
Background: The prevalence of schistosomiasis japonica has decreased significantly, and the response in Jiangsu Province, P.R. China has shifted from control to elimination. Estimating the change in the prevalence of schistosomiasis using only serological data is therefore important and useful. Methods: We collected serum samples from 2011 to 2015 to build a serum bank from Dantu County of Jiangsu, China. Serum samples were tested by enzyme-linked immunosorbent assay (ELISA), and the positive rate and optical density (OD) values were obtained. A Bayesian model incorporating prior information on the sensitivity and specificity of ELISA was established, and estimated infection rates were obtained for different years, genders, and age groups. Results: There was no significant difference in mean OD between years or between genders, but there was a significant difference between age groups. There were statistically significant differences in the positive rate between years and between age groups, but not between genders. The estimated infection rates for the five years were 1.288%, 1.456%, 1.032%, 1.485%, and 1.358%, respectively, with no significant difference between years or between genders but a significant difference between age groups. Conclusions: The risk of schistosomiasis transmission in this area still exists, and risk monitoring of schistosomiasis should be strengthened.
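The kind of Bayesian model described in the Methods, which corrects the apparent ELISA positive rate for imperfect test sensitivity and specificity, can be sketched as a grid posterior over the true prevalence. The Beta prior parameters and the counts below are illustrative assumptions, not the paper's actual priors or data.

```python
import numpy as np

def true_prevalence_posterior(k, n, se_ab=(90.0, 10.0), sp_ab=(95.0, 5.0),
                              grid=801, draws=1000, seed=0):
    """Grid posterior for the true prevalence pi given k test-positives out
    of n, marginalising Beta-prior uncertainty in the test's sensitivity
    (Se) and specificity (Sp); the prior on pi itself is uniform."""
    rng = np.random.default_rng(seed)
    pi = np.linspace(0.0, 1.0, grid)
    se = rng.beta(*se_ab, size=draws)             # prior draws for sensitivity
    sp = rng.beta(*sp_ab, size=draws)             # prior draws for specificity
    # probability of a positive test result for each (pi, prior draw) pair
    p = pi[:, None] * se[None, :] + (1.0 - pi[:, None]) * (1.0 - sp[None, :])
    # binomial log-likelihood of k positives, then marginalise over Se, Sp
    log_like = k * np.log(p) + (n - k) * np.log(1.0 - p)
    like = np.exp(log_like - log_like.max()).mean(axis=1)
    post = like / like.sum()
    return pi, post

# Hypothetical survey: 80 ELISA-positives among 1000 serum samples
pi, post = true_prevalence_posterior(k=80, n=1000)
pi_mean = float((pi * post).sum())
```

Because some apparent positives are false positives, the posterior mean of the true prevalence sits well below the raw 8% positive rate; this correction is what allows infection rates to be estimated from serological data alone.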
Funding (change point detection paper): supported by the Federal Ministry for Economic Affairs and Climate Action of Germany (BMWK) within the Innovation Platform "KEEN-Artificial Intelligence Incubator Laboratory in the Process Industry" (Grant No. 01MK20014T). The research of L.B. is supported by the Swedish Research Council, Grant VR 2018-03661.
Funding (schistosomiasis paper): This project was supported by the National S&T Major Program (Grant No. 2012ZX10004220), the National Natural Science Foundation of China (No. 81101275), and the Capacity Improvement Project of Jiangsu Public Welfare Institute (No. BM2015024).