Funding: Supported by the Beijing Natural Science Foundation (Grant No. 3222037), the PetroChina Innovation Foundation (Grant No. 2020D-5007-0203), and the Science Foundation of China University of Petroleum, Beijing (Nos. 2462021YXZZ010, 2462018QZDX13, and 2462020YXZZ028).
Abstract: Statistical prediction is often required in reservoir simulation to quantify production uncertainty or assess potential risks. Most existing uncertainty quantification procedures aim to decompose the input random field into independent random variables, and may suffer from the curse of dimensionality if the correlation scale is small compared to the domain size. In this work, we develop and test a new approach, K-means clustering assisted empirical modeling, for efficiently estimating waterflooding performance for multiple geological realizations. The method performs single-phase flow simulations in a large number of realizations and uses K-means clustering to select only a few representatives, on which the two-phase flow simulations are run. Empirical models are then fitted on these representatives to describe the relation between the single-phase solutions and the two-phase solutions. Finally, the two-phase solutions in all realizations can be readily predicted using the empirical models. The method is applied to both 2D and 3D synthetic models and is shown to perform well in predicting the P10, P50 and P90 of production rates, as well as the probability distributions as illustrated by cumulative distribution functions. It is able to capture the ensemble statistics of Monte Carlo simulation results with a large number of realizations, and the computational cost is significantly reduced.
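The workflow in this abstract (cheap runs everywhere, clustering, expensive runs on a few representatives, then an empirical model for the rest) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and not the authors' implementation: the single-phase "feature" is one synthetic number per realization, the two-phase response is a made-up linear function with noise, the empirical model is a straight-line fit, and the P10/P50/P90 values are plain 10th/50th/90th percentiles (some petroleum conventions reverse P10 and P90).

```python
import numpy as np

def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means on the rows of X; returns the k centroids."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(n_iter):
        dist = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:          # leave an empty cluster's centroid unchanged
                centroids[j] = members.mean(axis=0)
    return centroids

def nearest_to_centroids(X, centroids):
    """Indices of the realizations closest to each centroid (the representatives)."""
    dist = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return np.unique(dist.argmin(axis=0))

rng = np.random.default_rng(42)
n_real = 500

# Hypothetical single-phase result: one scalar feature per realization.
single_phase = rng.lognormal(mean=0.0, sigma=0.5, size=(n_real, 1))

# Hypothetical "expensive" two-phase response. In practice it would only be
# computed for the representatives; the full vector is kept here as a reference.
true_two_phase = 0.6 * single_phase[:, 0] + 2.0 + 0.01 * rng.standard_normal(n_real)

# 1) Cluster the cheap results and pick a few representatives.
reps = nearest_to_centroids(single_phase, kmeans(single_phase, k=8))

# 2) Fit the empirical model (here: a straight line) on the representatives only.
slope, intercept = np.polyfit(single_phase[reps, 0], true_two_phase[reps], 1)

# 3) Predict the two-phase response for every realization and summarize it.
predicted = slope * single_phase[:, 0] + intercept
pred_p = np.percentile(predicted, [10, 50, 90])
true_p = np.percentile(true_two_phase, [10, 50, 90])
print("predicted P10/P50/P90:", pred_p)
print("reference P10/P50/P90:", true_p)
```

In this toy setting only the representatives (at most 8 of 500 realizations) need the expensive run, which is where the cost saving described in the abstract would come from.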
Abstract: The main problem of modern climatology is to assess present and future climate change. Two approaches are commonly used for this purpose: physical-mathematical modeling based on GCMs, and palaeoclimatic analogues. A third approach, based on empirical-statistical methodology, is developed in this paper. It addresses two main problems: providing a realistic assessment of climate change from observed data, for climate monitoring and for extrapolating the observed climate tendencies into the near future (10-15 years); and providing an empirical basis for the further development of physical-mathematical models. The basic theory and methodology of the empirical-statistical approach have been developed, together with a common model describing space-time climate variations that takes into account processes of different time scales. To reduce present and future uncertainty, the paper suggests extracting the long-term climate change components from individual time series and spatially generalizing the same climate tendencies within the resulting homogeneous regions. Algorithms and methods for implementing the empirical-statistical methodology have been developed, including methods for generalizing intra-annual fluctuations; methods for extracting homogeneous components of different time scales (interannual, decadal, centennial); methods for spatial generalization and modeling; and methods for extrapolation based on two main kinds of time models, stochastic and deterministic-stochastic. Applications of the methodology are given for the longest time series of temperature and precipitation worldwide and for spatial generalization over the European area.
Abstract: The global sustainability plan for future development relies on solar radiation, which is the main source of renewable energy. This work therefore studies the performance of six models for estimating global solar radiation on a horizontal surface at the Abeche site in Chad. The data used in this work were collected at the General Directorate of National Meteorology of Chad. The reliability and accuracy of the models were assessed with statistical indicators to identify the most accurate one. The results show that, among all the models, the Sabbagh model performs best in estimating global solar radiation. The average is 6.354 kWh/m² with an average of -3.704%. This model is validated against NASA data, which are widely used.
Abstract: Many problems in rock engineering are limited by our imperfect knowledge of the material properties and failure mechanics of rock masses. Mining problems are somewhat unique, however, in that plenty of real-world experience is generally available and can be turned into valuable experimental data. Every pillar that is developed, or stope that is mined, represents a full-scale test of a rock mechanics design. By harvesting these data, and then using the appropriate statistical techniques to interpret them, mining engineers have developed powerful design techniques that are widely used around the world. Successful empirical methods are readily accepted because they are simple, transparent, practical, and firmly tethered to reality. The author has been intimately associated with empirical design for his entire career, but his previous publications have described the application of individual techniques to specific problems. The focus of this paper is the process used to develop a successful empirical method. A six-stage process is described: identification of the problem, and of the end users of the final product; development of a conceptual rock mechanics model, and identification of the key parameters in that model; identification of measures for each of the key parameters, and the development of new measures (such as rating scales) where necessary; data sources and data collection; statistical analysis; and packaging of the final product. Each of these stages has its own potential rewards and pitfalls, which will be illustrated by incidents from the author's own experience. The ultimate goal of this paper is to provide a new and deeper appreciation for empirical techniques, as well as some guidelines and opportunities for future developers.
Funding: Supported by the National Forestry Public Welfare Professional Scientific Research Project (No. 201404213) and the National Key Research and Development Program of China (No. 2016YFD0600205).
Abstract: Leaf area index (LAI) is a key factor that determines a forest ecosystem's net primary production and the energy exchange between the atmosphere and land surfaces. LAI can be measured in many ways, but there has been little research comparing LAI estimated by different methods. In this study, we compared the LAI results from two different approaches, the dimidiate pixel model (DPM) and an empirical statistic model (ESM), using ZY-3 high-accuracy satellite images validated by field data. We explored the relationship between the LAI of Larix principis-rupprechtii Mayr plantations and topographic conditions. The results show that DPM improves the simulation of LAI (r = 0.86, RMSE = 0.57) compared with ESM (r = 0.62, RMSE = 0.79). We further concluded that elevation and slope significantly affect the distribution of LAI. The maximum peak of LAI appeared on east and southeast aspects at an elevation of 1700–2000 m. Our results suggest that ZY-3 can satisfy the needs of quantitative monitoring of leaf area indices in small-scale catchment areas. DPM provides a simple and accurate method for obtaining forest vegetation parameters where no ground measurement points are available.
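The two indicators quoted in this abstract, the correlation coefficient r and the root-mean-square error (RMSE), are assumed here to follow their standard definitions. The sketch below shows how they would be computed for a set of field-measured and model-estimated LAI values; the four sample values are invented for illustration and are not from the study.

```python
import numpy as np

def pearson_r(observed, estimated):
    """Pearson correlation coefficient between observed and estimated values."""
    return float(np.corrcoef(observed, estimated)[0, 1])

def rmse(observed, estimated):
    """Root-mean-square error of the estimates against the observations."""
    observed = np.asarray(observed, dtype=float)
    estimated = np.asarray(estimated, dtype=float)
    return float(np.sqrt(np.mean((estimated - observed) ** 2)))

# Hypothetical field-measured LAI and model-estimated LAI at four plots.
lai_field = np.array([2.0, 3.0, 4.0, 5.0])
lai_model = np.array([2.5, 2.5, 4.5, 4.5])

r_value = pearson_r(lai_field, lai_model)
rmse_value = rmse(lai_field, lai_model)
print(f"r = {r_value:.3f}, RMSE = {rmse_value:.3f}")  # r = 0.894, RMSE = 0.500
```

A higher r together with a lower RMSE, as reported for DPM relative to ESM, indicates estimates that both track the field values more closely and deviate less from them in absolute terms.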
Funding: Supported by the UK Natural Environment Research Council (Grant No. NE/J005606/1), the UK Engineering and Physical Sciences Research Council (Grant No. EP/C005392/1), and the Ensemble Estimation of Flood Risk in a Changing Climate (EFRaCC) project funded by the British Council under its Global Innovation Initiative.
Abstract: In this study the medium-term response of beach profiles was investigated at two sites: a gently sloping sandy beach and a steeper mixed sand and gravel beach. The former is the Duck site in North Carolina, on the east coast of the USA, which is exposed to Atlantic Ocean swells and storm waves, and the latter is the Milford-on-Sea site at Christchurch Bay, on the south coast of England, which is partially sheltered from Atlantic swells but has a directionally bimodal wave exposure. The data sets comprise detailed bathymetric surveys of beach profiles covering a period of more than 25 years for the Duck site and over 18 years for the Milford-on-Sea site. The structure of the data sets and the data-driven methods are described. Canonical correlation analysis (CCA) was used to find linkages between the wave characteristics and beach profiles. The sensitivity of the linkages was investigated by applying a wave height threshold to filter out the smaller waves incrementally. The results of the analysis indicate that, for the gently sloping sandy beach, waves of all heights are important to the morphological response. For the mixed sand and gravel beach, filtering out the smaller waves improves the statistical fit, which suggests that low-height waves do not play a primary role in the medium-term morphological response; that response is primarily driven by the intermittent larger storm waves.
Funding: Supported by the National Defense Basic Research Program of China (Grant No. 51327010101).
Abstract: A novel empirical large-signal direct current (DC) I-V model is presented that accounts for the high saturation voltage, high pinch-off voltage, and wide operational range of drain voltage of 4H-SiC MESFETs. The presented model is compared with the Statz, Materka, Curtice-Cubic, and recently reported 4H-SiC MESFET large-signal I-V models, using the Levenberg-Marquardt method for nonlinear regression fitting. The results show that the new model offers higher accuracy, easier selection of initial values, and greater robustness than the other models. The improved channel modulation and saturation voltage coefficients give more accurate results when the device is operated in the sub-threshold and near-pinch-off regions. In addition, the new model can be implemented directly in CAD tools for the design of 4H-SiC MESFET-based RF and microwave circuits, particularly MMICs (monolithic microwave integrated circuits).
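The fitting step mentioned in this abstract, Levenberg-Marquardt nonlinear regression of an empirical I-V model, can be sketched as follows. The abstract does not give the new model's equations, so the example substitutes the classic Curtice quadratic form, I_ds = β (V_gs − V_to)² (1 + λ V_ds) tanh(α V_ds), purely as a stand-in. The parameter values, the bias grid, and the noiseless synthetic "measurements" are all assumptions; `scipy.optimize.curve_fit` with `method="lm"` is used as the Levenberg-Marquardt solver.

```python
import numpy as np
from scipy.optimize import curve_fit

def curtice_quadratic(V, beta, vto, lam, alpha):
    """Curtice quadratic drain current (stand-in model, not the paper's).

    V is a (2, M) array holding gate-source and drain-source voltages.
    """
    vgs, vds = V
    return beta * (vgs - vto) ** 2 * (1.0 + lam * vds) * np.tanh(alpha * vds)

# Hypothetical bias grid; every V_gs stays above pinch-off (V_to = -10 V).
vgs_grid, vds_grid = np.meshgrid(np.linspace(-8.0, 0.0, 5),
                                 np.linspace(0.0, 40.0, 21))
V = np.vstack([vgs_grid.ravel(), vds_grid.ravel()])

# Synthetic, noiseless "measured" currents generated from assumed parameters.
true_params = (0.02, -10.0, 0.005, 0.3)   # beta, V_to, lambda, alpha
i_measured = curtice_quadratic(V, *true_params)

# Levenberg-Marquardt fit starting from a deliberately offset initial guess.
initial_guess = (0.01, -12.0, 0.01, 0.5)
fitted_params, _ = curve_fit(curtice_quadratic, V, i_measured,
                             p0=initial_guess, method="lm")
i_fitted = curtice_quadratic(V, *fitted_params)
print("fitted parameters:", fitted_params)
```

Levenberg-Marquardt is a local method, so the result depends on the initial guess; this is why the abstract lists easy selection of initial values as an advantage of a model.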
Funding: Supported by the National Natural Science Foundation of China (12061017, 12361055) and the Research Fund of Guangxi Key Lab of Multi-source Information Mining & Security (22-A-01-01).
Abstract: The existing blockwise empirical likelihood (BEL) method blocks the observations or their analogues, which has proven useful under some dependent data settings. In this paper, we introduce a new BEL (NBEL) method that blocks the scoring functions in high-dimensional cases. Using the NBEL method, we study the construction of confidence regions for the parameters of spatial autoregressive models with spatial autoregressive disturbances (SARAR models) when the number of parameters is high. It is shown that the NBEL ratio statistics are asymptotically χ²-type distributed, and these statistics are used to obtain NBEL-based confidence regions for the parameters in SARAR models. A simulation study is conducted to compare the performance of the NBEL method and the usual EL method.
Funding: Supported by the 11.5 Natural Scientific Plan (Grant No. 2006BAD09A04) and the Nanjing University Start Fund (Grant No. 020822410110).
Abstract: We propose a semiparametric Wald statistic to test the validity of logistic regression models based on case-control data. The test statistic is constructed using a semiparametric ROC curve estimator and a nonparametric ROC curve estimator. The statistic has an asymptotic chi-squared distribution and is an alternative to the Kolmogorov-Smirnov-type statistic proposed by Qin and Zhang in 1997, the chi-squared-type statistic proposed by Zhang in 1999, and the information matrix test statistic proposed by Zhang in 2001. The statistic is easy to compute in the sense that it requires none of the following: using a bootstrap method to find its critical values, partitioning the sample data, or inverting a high-dimensional matrix. We present some simulation results and an analysis of two real examples. Moreover, we discuss how to extend our statistic to a family of statistics and how to construct its Kolmogorov-Smirnov counterpart.
Funding: Supported by the National Natural Science Foundation of China (40375019) and the Tropical Marine and Meteorological Science Foundation (200609).
Abstract: Based on the 500-hPa geopotential height field series of T106 numerical forecast products, empirical orthogonal function (EOF) time-space separation was applied and, on the hypothesis that the EOF spatial modes are stable, the EOF time coefficient series were taken as the variables of a dynamical-statistical model. The idea of dynamic system reconstruction and a genetic algorithm were introduced to optimize the dynamical model parameters, and a nonlinear dynamical-statistical model of the EOF time coefficient series was established. By integrating the model in time and reconstructing the field from the EOFs, a medium/long-range forecast of the subtropical high was carried out. The results show that the dynamical model forecast and the T106 numerical forecast were approximately similar in the short-range forecast (≤5 days), but in the medium/long-range forecast (≥5 days) the forecast results of the dynamical model were superior to those of the T106 numerical products. The work presents a new method and idea for diagnosing and forecasting complicated weather systems such as the subtropical high, and shows good prospects for application.
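The EOF time-space separation and reconstruction steps described in this abstract can be sketched with a singular value decomposition. This is a generic illustration only: the synthetic "height field" (two fixed spatial patterns with oscillating amplitudes), the grid sizes, and the function names are assumptions, and the paper's nonlinear dynamical model for the time coefficients, with its genetic-algorithm parameter search, is not reproduced here.

```python
import numpy as np

def eof_decompose(field, n_modes):
    """Split a (time, space) field into spatial EOFs and time coefficients.

    Returns the time mean, the leading EOF patterns (n_modes, space),
    the time coefficient series (time, n_modes), and the explained variance.
    """
    mean = field.mean(axis=0)
    anomalies = field - mean
    u, s, vt = np.linalg.svd(anomalies, full_matrices=False)
    eofs = vt[:n_modes]                       # spatial patterns
    coeffs = u[:, :n_modes] * s[:n_modes]     # time coefficient series
    explained = (s ** 2 / np.sum(s ** 2))[:n_modes]
    return mean, eofs, coeffs, explained

def eof_reconstruct(mean, eofs, coeffs):
    """Rebuild the field from the time coefficients and the fixed EOFs."""
    return mean + coeffs @ eofs

# Synthetic 500-hPa-like field: 60 time steps on 40 grid points,
# built from two spatial patterns with oscillating amplitudes.
t = np.arange(60)
x = np.linspace(0.0, 2.0 * np.pi, 40)
field = (5500.0
         + 30.0 * np.outer(np.sin(2.0 * np.pi * t / 12.0), np.sin(x))
         + 10.0 * np.outer(np.cos(2.0 * np.pi * t / 7.0), np.cos(2.0 * x)))

mean, eofs, coeffs, explained = eof_decompose(field, n_modes=2)
reconstructed = eof_reconstruct(mean, eofs, coeffs)
print("explained variance of the two leading modes:", explained)
```

In a forecasting setting of the kind the abstract describes, the spatial patterns would be held fixed, only the time coefficients would be advanced by the fitted model, and `eof_reconstruct` would turn the predicted coefficients back into a height field.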
Abstract: Based on the statistical model and taking into account the Q-value dependence and odd-even effects, we propose a new empirical formula to reproduce the cross sections of (n, p) reactions at 14.7 MeV neutron energy for target mass numbers 14 ≤ A ≤ 198 for even A and 29 ≤ A ≤ 205 for odd A. All results calculated from the proposed empirical formula were compared with the experimental data as well as with the available semi-empirical formulae obtained by other authors. A high level of agreement has been found between the collected experimental data and most of the semi-empirical formulae obtained by others.
Abstract: In this paper, we evaluate semiempirical methods (AM1, PM3, and ZINDO), HF, and DFT (B3LYP) with different basis sets to determine which method best describes the sign and magnitude of the geometrical parameters of artemisinin in the region of the endoperoxide ring, compared with crystallographic data. We also classify these methods using statistical analysis. The results of PCA were based on three main components, explaining 98.0539% of the total variance, for the geometrical parameters C3O13, O1O2C3, O13C12C12a, and O2C3O13C12. The DFT method (B3LYP) corresponded well with the experimental data in the hierarchical cluster analysis (HCA). The experimental and theoretical angles were analyzed by simple linear regression, and statistical parameters (correlation coefficients, significance, and predictability) were evaluated to determine the accuracy of the calculations. The statistical analysis exhibited a good correlation and high predictive power for the DFT (B3LYP) method with the 6-31G** basis set.