A powerful investigative tool in biology is to consider not a single mathematical model but a collection of models designed to explore different working hypotheses, and to select the best model in that collection. In these lecture notes, the usual workflow for using mathematical models to investigate a biological problem is described, and the use of a collection of models is motivated. Models depend on parameters that must be estimated from observations, and when a collection of models is considered, the best model must then be identified based on the available observations. Hence model calibration and model selection, which are intrinsically linked, are essential steps of the workflow. Here, some procedures for model calibration and one criterion for model selection based on experimental data, the Akaike Information Criterion, are described. A rough derivation of this criterion, practical techniques for computing it, and its use are detailed.
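As a minimal sketch of how the Akaike Information Criterion can be computed in practice, the following assumes each candidate model was fitted by least squares with independent Gaussian errors; this error model is an assumption of the sketch, not something stated in the abstract. The function name `gaussian_aic` and the residual values are hypothetical and serve only to illustrate AIC = 2k - 2 ln L.

```python
import math

def gaussian_aic(residuals, n_params):
    """AIC of a least-squares fit, assuming i.i.d. Gaussian errors.

    n_params counts the fitted model parameters; the error variance is
    counted as one additional estimated parameter.
    """
    n = len(residuals)
    rss = sum(r * r for r in residuals)
    # Maximised log-likelihood with the variance set to its MLE, rss / n.
    log_lik = -0.5 * n * (math.log(2 * math.pi) + math.log(rss / n) + 1)
    k = n_params + 1
    return 2 * k - 2 * log_lik

# Hypothetical residuals of two candidate models fitted to the same data.
residuals_simple = [0.9, -1.1, 1.0, -0.8, 1.2, -1.0]    # 1 fitted parameter
residuals_complex = [0.8, -1.0, 0.9, -0.8, 1.1, -0.9]   # 3 fitted parameters

aic_simple = gaussian_aic(residuals_simple, n_params=1)
aic_complex = gaussian_aic(residuals_complex, n_params=3)

# The model with the smaller AIC is preferred.
best = "simple" if aic_simple < aic_complex else "complex"
print(best, round(aic_simple, 3), round(aic_complex, 3))
```

In this made-up example the more complex model reduces the residuals only slightly, so the penalty term 2k outweighs the gain in fit and the simpler model is selected.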
Modeling of network traffic is a fundamental building block of computer science. Measurements of network traffic demonstrate that self-similarity is one of the basic properties that network traffic possesses at large time scales. This paper investigates the change of non-stationary self-similarity of network traffic over time and proposes a method that combines the discrete wavelet transform (DWT) and the Schwarz information criterion (SIC) to detect change points of self-similarity in network traffic. The traffic is segmented at the change points into pieces with a homogeneous Hurst parameter, named the local Hurst parameter, and each piece of network traffic is then modeled using a fractional Gaussian noise (FGN) model with that local Hurst parameter. Experimental results on a data set from the Internet Traffic Archive (ITA) demonstrate that the method is more accurate in describing the non-stationary self-similarity of network traffic.
In this paper, the estimators of the scale parameter of the exponential distribution obtained by applying four methods, using complete data, are critically examined and compared. These methods are the Maximum Likelihood Estimator (MLE) and the Bayesian estimators under the Square-Error Loss Function (BSE), the Entropy Loss Function (BEN) and the Composite LINEX Loss Function (BCL). The performance of these four methods was compared based on three criteria: the Mean Square Error (MSE), the Akaike Information Criterion (AIC), and the Bayesian Information Criterion (BIC). Using Monte Carlo simulation based on relevant samples, the comparisons in this study suggest that the Bayesian methods estimate the parameter better than the maximum likelihood estimator, offering the smallest values of MSE, AIC, and BIC. Confidence intervals were then assessed to test the performance of the methods by comparing the 95% CI and average lengths (AL) for all estimation methods, showing that the Bayesian methods still offer the best performance in terms of generating the smallest ALs.
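The maximum likelihood part of this comparison can be sketched in a few lines. The example below covers only the MLE of the exponential scale parameter (the sample mean) and the resulting AIC and BIC; the Bayesian estimators depend on priors and loss-function constants that the abstract does not give, so they are left out. The function name `exponential_mle_criteria` and the sample are hypothetical.

```python
import math

def exponential_mle_criteria(sample):
    """MLE of the exponential scale parameter, with AIC and BIC.

    Density: f(x; theta) = exp(-x / theta) / theta, for x >= 0.
    """
    n = len(sample)
    total = sum(sample)
    theta_hat = total / n                 # the MLE of the scale is the sample mean
    log_lik = -n * math.log(theta_hat) - total / theta_hat
    k = 1                                 # one estimated parameter
    aic = 2 * k - 2 * log_lik
    bic = k * math.log(n) - 2 * log_lik
    return theta_hat, aic, bic

# Hypothetical complete sample, for illustration only.
theta_hat, aic, bic = exponential_mle_criteria([1.0, 2.0, 3.0, 2.0])
print(theta_hat, round(aic, 4), round(bic, 4))
```

To compare estimators as the abstract describes, the same log-likelihood could be evaluated at each competing estimate of the scale parameter; how the paper does this in detail is not stated in the abstract.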
To make the quantitative results of nuclear magnetic resonance (NMR) transverse relaxation (T2) spectra reflect the type and pore structure of a reservoir more directly, an unsupervised clustering method based on the Gaussian mixture model (GMM) was developed to obtain quantitative pore structure information from NMR T2 spectra. First, principal component analysis was conducted on the T2 spectra in order to reduce the dimension of the data and the dependence among the original variables. Second, the dimension-reduced data were fitted using the GMM probability density function, and the model parameters and the optimal number of clusters were obtained according to the expectation-maximization algorithm and the change of the Akaike information criterion. Finally, the T2 spectrum features and pore structure types of the different clustering groups were analyzed and compared with the T2 geometric mean and the T2 arithmetic mean. The effectiveness of the algorithm has been verified by numerical simulation and field NMR logging data. The research shows that the clustering results based on the GMM method correlate well with the shape and distribution of the T2 spectrum, pore structure, and petroleum productivity, providing a new means for quantitative identification of pore structure, reservoir grading, and oil and gas productivity evaluation.
Based on the Bayesian information criterion, this paper proposes an improved local linear prediction method to predict chaotic time series. This method uses spatial correlation and temporal correlation simultaneously. Simulation results show that the improved local linear prediction method can effectively make multi-step and one-step predictions of chaotic time series, and that its multi-step prediction performance and one-step prediction accuracy are superior to those of the traditional local linear prediction method.
This study investigates the volatility in daily stock returns for Total Nigeria Plc using nine variants of GARCH models: sGARCH, gjrGARCH, eGARCH, iGARCH, aGARCH, TGARCH, NGARCH, NAGARCH, and AVGARCH, along with value at risk (VaR) estimation and backtesting. We use daily data for Total Nigeria Plc returns for the period January 2, 2001 to May 8, 2017, and conclude that eGARCH and sGARCH perform better for normal innovations while NGARCH performs better for Student's t innovations. This investigation of the volatility, VaR, and backtesting of the daily stock price of Total Nigeria Plc is important because most previous studies covering the Nigerian stock market have not paid much attention to the application of backtesting as a primary approach. We found from the results of the estimations that the persistence of the GARCH models is stable except in a few cases, in which iGARCH and eGARCH were unstable. Additionally, for Student's t innovations, the sGARCH and gjrGARCH models failed to converge; the mean-reverting number of days for returns differed from model to model. From the analysis of VaR and its backtesting, this study recommends that shareholders and investors continue their business with Total Nigeria Plc because possible losses may be overcome in the future by improvements in stock prices. Furthermore, risk was reflected by significant up and down movement in the stock price at a 99% confidence level, suggesting that high risk brings a high return.
Time series analysis has two goals: modeling random mechanisms and predicting future series using historical data. In the present work, a univariate time series autoregressive integrated moving average (ARIMA) model has been developed for (a) simulating and forecasting mean rainfall, obtained using Thiessen weights, over the Mahanadi River Basin in India, and (b) simulating and forecasting mean rainfall at 38 rain-gauge stations in district towns across the basin. For the analysis, monthly rainfall data of each district town for the years 1901-2002 (102 years) were used. Thiessen weights were obtained over the basin and mean monthly rainfall was estimated. The trend and seasonality observed in ACF and PACF plots of rainfall data were removed using a power transformation (a=0.5) and first-order seasonal differencing prior to the development of the ARIMA model. Interestingly, the ARIMA model (1,0,0)(0,1,1)_12 developed here was found to be most suitable for simulating and forecasting mean rainfall over the Mahanadi River Basin and for all 38 district town rain-gauge stations, separately. The Akaike Information Criterion (AIC), goodness of fit (Chi-square), R² (coefficient of determination), MSE (mean square error) and MAE (mean absolute error) were used to test the validity and applicability of the developed ARIMA model at different stages. This model is considered appropriate to forecast the monthly rainfall for the upcoming 12 years in each district town to assist decision makers and policy makers in establishing priorities for water demand, storage, distribution, and disaster management.
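The preprocessing described here, a power transformation with a=0.5 followed by first-order seasonal differencing at lag 12, can be sketched as follows. Only these two steps are illustrated, not the ARIMA fit itself; the function names and the synthetic rainfall values are hypothetical.

```python
def power_transform(series, a=0.5):
    """Apply the power transformation x -> x**a (a=0.5 is the square root)."""
    return [x ** a for x in series]

def seasonal_difference(series, period=12):
    """First-order seasonal differencing: y[t] = x[t] - x[t - period]."""
    return [series[t] - series[t - period] for t in range(period, len(series))]

# Synthetic monthly rainfall (mm): the same 12-month pattern repeated for 3 years.
one_year = [4, 9, 16, 25, 100, 400, 900, 625, 225, 49, 9, 1]
rainfall = one_year * 3

transformed = power_transform(rainfall, a=0.5)
stationary = seasonal_difference(transformed, period=12)

# A purely periodic series is reduced to zeros by seasonal differencing.
print(len(stationary), stationary[:3])
```

With real data the differenced series would not be exactly zero; the remaining structure is what the non-seasonal AR(1) and seasonal MA(1) terms of the (1,0,0)(0,1,1)_12 model are meant to capture.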
The global spread of infectious disease threatens the health of humans, domestic animals, and wildlife. A proper understanding of the global distribution of these diseases is an important part of disease management and policy making. However, the data are complicated by heterogeneity across host classes. Frequentist methods are common in biostatistics and epidemiology and are therefore extensively used to answer varied research questions. In this paper, we applied the hierarchical Bayesian approach to study the spatial distribution of tuberculosis (TB) in Kenya. The focus was to identify the best-fitting model for TB relative risk in Kenya. The Markov Chain Monte Carlo (MCMC) method via WinBUGS and R packages was used for simulations. The Deviance Information Criterion (DIC) proposed by [1] was used for model comparison and selection. Among the models considered, the unstructured heterogeneity model performs better in terms of modeling and mapping TB relative risk (RR) in Kenya. Variation in TB risk is observed among Kenyan counties, with clustering among counties with high TB RR. HIV prevalence is identified as the dominant determinant of TB. We find clustering and heterogeneity of risk among high-rate counties. Although the approaches are less than ideal, we hope that our formulations provide a useful stepping stone in the development of spatial methodology for the statistical analysis of risk from TB in Kenya.
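For reference, the Deviance Information Criterion is commonly defined as below, up to an additive constant in the deviance that cancels when models are compared on the same data. This is the standard textbook form; whether the paper uses exactly this variant is an assumption, since the abstract does not spell it out. Here y denotes the data, θ the model parameters, and θ̄ their posterior mean.

```latex
D(\theta) = -2 \log p(y \mid \theta), \qquad
\bar{D} = \mathbb{E}_{\theta \mid y}\bigl[ D(\theta) \bigr], \qquad
p_D = \bar{D} - D(\bar{\theta}), \qquad
\mathrm{DIC} = \bar{D} + p_D .
```

In an MCMC setting, the posterior mean deviance is estimated by averaging D(θ) over the sampled parameter values, and the model with the smallest DIC is preferred.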
Shallow earthquakes usually show obvious spatio-temporal clustering patterns. In this study, several spatio-temporal point process models are applied to investigate the clustering characteristics of the well-known Tangshan sequence, based on classical empirical laws and a few assumptions. The relative fit of the competing models is compared using the Akaike Information Criterion. The spatial clustering pattern is well characterized by the model that gives the best fit to the data. A simulated aftershock sequence is generated by the thinning algorithm and compared with the real seismicity.
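The thinning idea can be illustrated with a deliberately simplified sketch. The models in the paper are spatio-temporal and their details are not given in the abstract; the example below is time-only and assumes a fixed, decaying aftershock rate of modified-Omori form K / (t + c)^p. The functions `simulate_by_thinning` and `omori_rate` and all parameter values are hypothetical.

```python
import random

def simulate_by_thinning(intensity, intensity_max, t_end, rng):
    """Simulate event times on [0, t_end] by thinning (acceptance-rejection).

    Candidates come from a homogeneous Poisson process with rate
    intensity_max; a candidate at time t is kept with probability
    intensity(t) / intensity_max. Requires 0 <= intensity(t) <= intensity_max.
    """
    events = []
    t = 0.0
    while True:
        t += rng.expovariate(intensity_max)   # waiting time to the next candidate
        if t > t_end:
            return events
        if rng.random() * intensity_max < intensity(t):
            events.append(t)

def omori_rate(t, k=50.0, c=1.0, p=1.1):
    """Hypothetical aftershock rate (events per day) decaying with time t."""
    return k / (t + c) ** p

rng = random.Random(0)
# The rate is decreasing, so its value at t = 0 bounds it on the whole interval.
times = simulate_by_thinning(omori_rate, omori_rate(0.0), t_end=100.0, rng=rng)
print(len(times))
```

For history-dependent models, the same scheme applies as long as the bound on the intensity is updated after each accepted event.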
Statistical distributions play a prominent role in applied sciences, particularly in biomedical sciences. Medical data sets are generally skewed to the right, and skewed distributions can be used quite effectively to model such data sets. In the present study, therefore, we propose a new family of distributions suitable for modeling right-skewed medical data sets. The proposed family may be called a new generalized-X family. A special sub-model of the proposed family, called a new generalized-Weibull distribution, is discussed in detail. The maximum likelihood estimators of the model parameters are obtained. A brief Monte Carlo simulation study is conducted to evaluate the performance of these estimators. Finally, the proposed model is applied to data on the remission times of stomach cancer patients. The goodness of fit of the proposed model is compared with that of competing models such as the Weibull, Kumaraswamy Weibull, and exponentiated Weibull distributions. Analytical measures such as the Akaike information criterion, the Bayesian information criterion, the Anderson-Darling statistic, and the Kolmogorov-Smirnov test statistic are considered to show which distribution provides the best fit to the data. Based on these measures, it is shown that the proposed distribution is a reasonable candidate for modeling data in medical sciences and other related fields.
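One of the goodness-of-fit measures named above, the Kolmogorov-Smirnov statistic, can be sketched for an ordinary two-parameter Weibull distribution as follows. The proposed generalized-Weibull distribution is not defined in the abstract, so it is not reproduced here; the remission times and the parameter values are hypothetical, not the data or estimates of the paper.

```python
import math

def weibull_cdf(x, shape, scale):
    """CDF of the two-parameter Weibull distribution for x >= 0."""
    return 1.0 - math.exp(-((x / scale) ** shape))

def ks_statistic(sample, cdf):
    """Largest distance between the empirical CDF of sample and cdf."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        # The empirical CDF jumps from (i - 1) / n to i / n at x.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d

# Hypothetical remission times in months, for illustration only.
times = [0.5, 1.2, 2.3, 3.1, 4.8, 6.0, 9.4, 15.2]
d_weibull = ks_statistic(times, lambda x: weibull_cdf(x, shape=1.0, scale=5.0))
print(round(d_weibull, 4))
```

A smaller statistic indicates a closer fit, so computing it for each fitted candidate distribution gives one of the rankings the abstract refers to.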
To avoid the negative effects of disturbances on satellites, the characteristics of micro-vibration in flywheels are studied. Considering rotor imbalance, bearing imperfections and structural elasticity, an extended model of micro-vibration is established. In the feature extraction of micro-vibration, singular value decomposition combined with the improved Akaike Information Criterion (AIC-SVD) is applied for denoising. More robust and self-adaptive than peak-threshold denoising, AIC-SVD can effectively remove the noise components. Subsequently, the effective harmonic coefficients are extracted by the binning algorithm. The results show that the harmonic coefficients are clearly identifiable in the frequency domain. Apart from the fundamental frequency, which is caused by rotor imbalance, the harmonics are caused by the coupling of imperfections in the bearing components.
In supervised learning, the number of values of a response variable can be very high. Grouping these values into a few clusters can be useful for performing accurate supervised classification analyses. On the other hand, selecting relevant covariates is a crucial step in building robust and efficient prediction models. In this paper we propose an algorithm that simultaneously groups the values of a response variable into a limited number of clusters and selects, stepwise, the covariates that best discriminate this clustering. These objectives are achieved by alternating optimization of a user-defined model selection criterion. This process extends a former version of the algorithm to a more general framework. Possible further developments are also discussed in detail.
Mixture regression is a regression problem with mixed data: among the observations, some data come from one model while others come from other models. EM or other algorithms can be used to solve this problem only after assuming that the number of models is given. In this paper we propose an information criterion for the mixture regression model. Data simulations show that, compared with ordinary information criteria, our criterion performs better at choosing the correct number of models.
Multimodel inference makes statistical inferences from a set of plausible models rather than from a single model. In this paper, we focus on multimodel inference based on the smoothed information criteria proposed in seminal monographs (see Buckland et al. (1997) and Burnham and Anderson (2003)), which are termed the smoothed Akaike information criterion (SAIC) and smoothed Bayesian information criterion (SBIC) methods. Owing to their simplicity and applicability, these methods are very widely used in many fields. By using an illustrative example and deriving limiting properties of the weights in linear regression, we find that the existing variance estimation for SAIC is not applicable because of a restrictive condition, whereas for SBIC it is applicable. In particular, we propose a simulation-based inference for SAIC based on the limiting properties. Both the simulation study and the real data example show the promising performance of the proposed simulation-based inference.
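The smoothed weights behind SAIC and SBIC have a simple closed form: each model receives a weight proportional to exp(-IC/2), where IC is its AIC (for SAIC) or BIC (for SBIC). The sketch below computes these weights and a model-averaged estimate; the function names, criterion values and per-model estimates are hypothetical, and the variance estimation discussed in the abstract is not covered.

```python
import math

def smoothed_weights(criteria):
    """Weights proportional to exp(-IC / 2) for a list of AIC or BIC values."""
    best = min(criteria)
    # Subtracting the minimum leaves the weights unchanged and avoids underflow.
    raw = [math.exp(-0.5 * (c - best)) for c in criteria]
    total = sum(raw)
    return [r / total for r in raw]

def model_average(estimates, weights):
    """Weighted average of the per-model estimates of the same quantity."""
    return sum(w * e for w, e in zip(weights, estimates))

# Hypothetical AIC values and estimates from three candidate models.
aic_values = [100.0, 102.0, 110.0]
estimates = [1.50, 1.80, 2.40]

weights = smoothed_weights(aic_values)
averaged = model_average(estimates, weights)
print([round(w, 4) for w in weights], round(averaged, 4))
```

Passing BIC values instead of AIC values to the same function gives the SBIC weights.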
Herein, a typhoon hazard assessment method at the site-specific scale is proposed. This method integrates a nonlinear three-dimensional wind field model and the probability density evolution method. At the site-specific scale, the track of a typhoon near the engineering site is approximated by a straight line. The wind field model is used to calculate the wind speed at the surface given the gradient wind field at the top of the boundary layer. A comparison between the simulated and observed wind histories for Typhoon Hagupit, which made landfall in Guangdong Province, demonstrates the fidelity of the wind field model. The probability density evolution method is used to calculate the propagation of randomness from the basic random variables to the extreme values of the typhoon surface wind. To model the probability distribution of the basic random variables, several candidate distributions are fitted to the observations. The Akaike information criterion and the Anderson-Darling distance are used to select the preferred probability distribution model. The adequacy of the probability density evolution method in assessing typhoon hazards is verified by comparing the results with those generated by Monte Carlo simulations. The typhoon wind hazards estimated in the present study are compared with those proposed by other studies and the design code, and the differences are analyzed and discussed. The results of the proposed method provide a reasonable probabilistic model for the assessment of structural reliability and the improvement of community resilience in typhoon-prone areas.
Multiple change-point estimation for functional time series is studied in this paper. The change-point problem is first transformed into a high-dimensional sparse estimation problem via basis functions. The group least absolute shrinkage and selection operator (LASSO) is then applied to estimate the number and the locations of possible change points. However, the group LASSO (GLASSO) always overestimates the number of true change points. To circumvent this problem, a further Information Criterion (IC) is applied to eliminate the redundant estimated points. It is shown that the proposed two-step procedure estimates the number and the locations of the change points consistently. Simulations and two temperature data examples are also provided to illustrate the finite-sample performance of the proposed method.
In order to measure the uncertainty of financial asset returns in the stock market, this paper presents a new model, called the SV-dtC model: a stochastic volatility (SV) model that assumes the stock return follows a doubly truncated Cauchy distribution, which accounts for the high peak and the fat tails of the empirical distribution simultaneously. Under the Bayesian framework, a prior and posterior analysis of the parameters is carried out, and Markov Chain Monte Carlo (MCMC) is used to compute the posterior estimates of the model parameters and to forecast in an empirical application to the Shanghai Stock Exchange Composite Index (SSECI), for the proposed SV-dtC model and for two classic models, SV-N (SV model with Normal distribution) and SV-T (SV model with Student-t distribution). The empirical analysis shows that the proposed SV-dtC model performs better under model checking, including an independence test (projection correlation test), the Kolmogorov-Smirnov test (K-S test) and Q-Q plots. Additionally, the deviance information criterion (DIC) also shows that the proposed model has a significant improvement in model fit over the others.
We propose a novel polynomial network autoregressive model that incorporates higher-order connected relationships to simultaneously model the effects of both direct and indirect connections. A quasi-maximum likelihood estimation method is proposed to estimate the unknown influence parameters, and we demonstrate its consistency and asymptotic normality without imposing any distributional assumption. Moreover, an extended Bayesian information criterion is set for order selection with a divergent upper order. The application of the proposed polynomial network autoregressive model is demonstrated through both simulation and real data analysis.
Funding (lecture notes on model calibration and selection in biology): SP is supported by a Discovery Grant of the Natural Sciences and Engineering Research Council of Canada (RGOIN-2018-04967).
Funding (network traffic self-similarity change-point detection): the National High Technology Research and Development Program (863) of China (Nos. 2005AA145110 and 2006AA01Z436), the Natural Science Foundation of Shanghai of China (No. 05ZR14083), and the Pudong New Area Technology Innovation Public Service Platform of China (No. PDPT2005-04).
Funding (GMM clustering of NMR T2 spectra): supported by the National Natural Science Foundation of China (42174142); the National Science and Technology Major Project (2017ZX05039-002); the Operation Fund of China National Petroleum Corporation Logging Key Laboratory (2021DQ20210107-11); the Fundamental Research Funds for Central Universities (19CX02006A); and the Major Science and Technology Project of China National Petroleum Corporation (ZD2019-183-006).
Funding: Supported by the National Natural Science Foundation of China (No. 10871026).
Abstract: Shallow earthquakes usually show obvious spatio-temporal clustering patterns. In this study, several spatio-temporal point process models are applied to investigate the clustering characteristics of the well-known Tangshan sequence based on classical empirical laws and a few assumptions. The relative fit of competing models is compared using the Akaike Information Criterion. The spatial clustering pattern is well characterized by the model that gives the best fit to the data. A simulated aftershock sequence is generated by a thinning algorithm and compared with the real seismicity.
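As a rough illustration of simulation by thinning, the sketch below handles only the simplest case: a temporal point process with a known, bounded intensity. The Omori-like decay and all parameter values are assumptions chosen for illustration; the spatio-temporal models fitted in the paper are not reproduced here.

```python
import random


def omori_like_intensity(t, k=5.0, c=0.1, p=1.1):
    """Hypothetical decaying aftershock rate k / (t + c)^p."""
    return k / (t + c) ** p


def simulate_by_thinning(intensity, upper_bound, horizon, rng):
    """Simulate event times on (0, horizon] by thinning.

    Candidates come from a homogeneous Poisson process with rate
    upper_bound; each is kept with probability intensity(t) / upper_bound.
    upper_bound must dominate intensity(t) on the whole interval.
    """
    events = []
    t = 0.0
    while True:
        t += rng.expovariate(upper_bound)  # waiting time to next candidate
        if t > horizon:
            break
        if rng.random() * upper_bound <= intensity(t):
            events.append(t)
    return events


rng = random.Random(0)  # fixed seed so the run is reproducible
# The intensity is decreasing, so its value at t = 0 bounds it from above.
bound = omori_like_intensity(0.0)
events = simulate_by_thinning(omori_like_intensity, bound, 10.0, rng)
```

For a self-exciting model the upper bound would have to be updated after every accepted event, since each event raises the conditional intensity; that extension is omitted here.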
Funding: School of Statistics, Shanxi University of Finance and Economics, Taiyuan, China. (i) The National Social Science Fund of China (17BTJ010) and (ii) the Fund for Shanxi "1331 Project" Key Innovative Research Team.
Abstract: Statistical distributions play a prominent role in applied sciences, particularly in biomedical sciences. Medical data sets are generally skewed to the right, and skewed distributions can be used quite effectively to model such data sets. In the present study, therefore, we propose a new family of distributions suitable for modeling right-skewed medical data sets. The proposed family may be called a new generalized-X family. A special sub-model of the proposed family, called a new generalized-Weibull distribution, is discussed in detail. The maximum likelihood estimators of the model parameters are obtained. A brief Monte Carlo simulation study is conducted to evaluate the performance of these estimators. Finally, the proposed model is applied to data on the remission times of stomach cancer patients. The goodness-of-fit results of the proposed model are compared with those of competing models such as the Weibull, Kumaraswamy Weibull, and exponentiated Weibull distributions. Certain analytical measures such as the Akaike information criterion, Bayesian information criterion, Anderson-Darling statistic, and Kolmogorov-Smirnov test statistic are considered to show which distribution provides the best fit to the data. Based on these measures, it is shown that the proposed distribution is a reasonable candidate for modeling data in medical sciences and other related fields.
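Comparing candidate distributions by information criteria can be sketched as follows, using the standard definitions AIC = 2k - 2 log L and BIC = k log n - 2 log L. The model names, log-likelihood values, and sample size below are hypothetical placeholders, not results from the paper.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 log L."""
    return 2.0 * n_params - 2.0 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: k log(n) - 2 log L."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


# Hypothetical fits: (name, maximized log-likelihood, number of parameters).
candidates = [
    ("model_a", -120.0, 2),
    ("model_b", -118.5, 3),
    ("model_c", -118.4, 4),
]
n_obs = 50  # hypothetical sample size

scores = {name: (aic(ll, k), bic(ll, k, n_obs)) for name, ll, k in candidates}
best_by_aic = min(scores, key=lambda name: scores[name][0])
best_by_bic = min(scores, key=lambda name: scores[name][1])
```

With these placeholder numbers the two criteria disagree: AIC prefers the three-parameter model, while BIC, which penalizes each parameter by log(50) rather than 2, prefers the two-parameter one.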
Funding: National Natural Science Foundation of China (No. U1831123); Fundamental Research Funds for the Central Universities, China (No. 2232017A3-04).
Abstract: To avoid the negative effects of disturbances on satellites, the characteristics of micro-vibration in flywheels are studied. Considering rotor imbalance, bearing imperfections and structural elasticity, an extended model of micro-vibration is established. In the feature extraction of micro-vibration, singular value decomposition combined with the improved Akaike Information Criterion (AIC-SVD) is applied for denoising. More robust and self-adaptive than peak threshold denoising, AIC-SVD can effectively remove the noise components. Subsequently, the effective harmonic coefficients are extracted by the binning algorithm. The results show that the harmonic coefficients are clearly identifiable in the frequency domain. Besides the fundamental frequency caused by rotor imbalance, harmonics are also caused by the coupling of imperfections in bearing components.
Abstract: In supervised learning, the number of values of a response variable can be very high. Grouping these values into a few clusters can be useful for performing accurate supervised classification analyses. On the other hand, selecting relevant covariates is a crucial step in building robust and efficient prediction models. We propose in this paper an algorithm that simultaneously groups the values of a response variable into a limited number of clusters and selects stepwise the best covariates that discriminate this clustering. These objectives are achieved by alternating optimization of a user-defined model selection criterion. This process extends a former version of the algorithm to a more general framework. Moreover, possible further developments are discussed in detail.
Abstract: Mixture regression is a regression problem with mixed data. Specifically, some of the observations come from one model, while others come from other models. EM or other algorithms can be used to solve this problem only once the number of models is assumed to be given. We propose an information criterion for the mixture regression model in this paper. In data simulations comparing it with ordinary information criteria, our criterion performs better at choosing the correct number of models.
Funding: Supported by the National Key R&D Program of China (Grant No. 2020AAA0105200); National Natural Science Foundation of China (Grant Nos. 12001559, 71925007, 71988101 and 72042019); Ministry of Education of China (Grant No. 17YJC910011); the Youth Innovation Promotion Association of the Chinese Academy of Sciences; the Beijing Academy of Artificial Intelligence; Academy for Multidisciplinary Studies, Capital Normal University.
Abstract: Multimodel inference makes statistical inferences from a set of plausible models rather than from a single model. In this paper, we focus on multimodel inference based on the smoothed information criteria proposed in seminal monographs (see Buckland et al. (1997) and Burnham and Anderson (2003)), which are termed the smoothed Akaike information criterion (SAIC) and smoothed Bayesian information criterion (SBIC) methods. Due to their simplicity and applicability, these methods are very widely used in many fields. By using an illustrative example and deriving limiting properties for the weights in linear regression, we find that the existing variance estimation for SAIC is not applicable because of a restrictive condition, but for SBIC it is applicable. In particular, we propose a simulation-based inference for SAIC based on the limiting properties. Both the simulation study and the real data example show the promising performance of the proposed simulation-based inference.
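For orientation, the smoothed weights are commonly written as w_k proportional to exp(-IC_k / 2), with IC standing for AIC or BIC. The sketch below computes such weights under that standard form; the input values are hypothetical, and the paper's variance estimation and simulation-based inference are not reproduced.

```python
import math


def smoothed_ic_weights(ic_values):
    """Model weights proportional to exp(-IC_k / 2).

    Subtracting the minimum first leaves the weights unchanged but
    avoids underflow when the criterion values are large.
    """
    best = min(ic_values)
    raw = [math.exp(-(value - best) / 2.0) for value in ic_values]
    total = sum(raw)
    return [r / total for r in raw]


# Hypothetical AIC values for three candidate models.
aic_values = [243.0, 244.0, 244.8]
weights = smoothed_ic_weights(aic_values)

# A model-averaged estimate is then the weighted sum of per-model estimates.
estimates = [1.10, 1.25, 1.30]  # hypothetical per-model estimates
averaged = sum(w * e for w, e in zip(weights, estimates))
```

Passing BIC values instead of AIC values to the same function gives the SBIC-style weights.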
Funding: Supported by the National Natural Science Foundation of China (Grant No. 51538010).
Abstract: Herein, a typhoon hazard assessment method at the site-specific scale is proposed. This method integrates the nonlinear three-dimensional wind field model and the probability density evolution method. At the site-specific scale, the track of a typhoon near the engineering site is approximated by a straight line. The wind field model is used to calculate the wind speed at the surface given the gradient wind field at the top of the boundary layer. A comparison between the simulated and observed wind histories for Typhoon Hagupit, which made landfall in Guangdong Province, demonstrates the fidelity of the wind field model. The probability density evolution method is used to calculate the propagation of randomness from the basic random variables to the extreme values of the typhoon surface wind. To model the probability distribution of the basic random variables, several candidate distributions are fitted to the observations. The Akaike information criterion and the Anderson-Darling distance are used to select the preferred probability distribution model. The adequacy of the probability density evolution method in assessing typhoon hazards is verified by comparing the results with those generated by Monte Carlo simulations. The typhoon wind hazards estimated in the present study are compared with those proposed by other studies and the design code, and the differences are analyzed and discussed. The results of the proposed method provide a reasonable probabilistic model for assessing structural reliability and improving community resilience in typhoon-prone areas.
Funding: NSFC (Grant No. 12171427/U21A20426/11771390); Zhejiang Provincial Natural Science Foundation (Grant No. LZ21A010002); the Fundamental Research Funds for the Central Universities (Grant No. 2021XZZX002).
Abstract: Multiple change-point estimation for functional time series is studied in this paper. The change-point problem is first transformed into a high-dimensional sparse estimation problem via basis functions. The group least absolute shrinkage and selection operator (LASSO) is then applied to estimate the number and the locations of possible change points. However, the group LASSO (GLASSO) always overestimates the number of true change points. To circumvent this problem, a further Information Criterion (IC) is applied to eliminate the redundant estimated points. It is shown that the proposed two-step procedure estimates the number and the locations of the change points consistently. Simulations and two temperature data examples are also provided to illustrate the finite-sample performance of the proposed method.
Funding: Supported by the Open Fund of State Key Laboratory of New Metal Materials, Beijing University of Science and Technology (No. 2022Z-18).
Abstract: To measure the uncertainty of financial asset returns in the stock market, this paper presents a new model, called the SV-dtC model: a stochastic volatility (SV) model assuming that the stock return has a doubly truncated Cauchy distribution, which accounts for the high peak and the fat tail of the empirical distribution simultaneously. Under the Bayesian framework, a prior and posterior analysis of the parameters is carried out, and Markov Chain Monte Carlo (MCMC) is used to compute the posterior estimates of the model parameters and to forecast in an empirical application to the Shanghai Stock Exchange Composite Index (SSECI), for the proposed SV-dtC model and two classic models, SV-N (SV model with Normal distribution) and SV-T (SV model with Student-t distribution). The empirical analysis shows that the proposed SV-dtC model performs better under model checking, including an independence test (projection correlation test), the Kolmogorov-Smirnov test (K-S test) and the Q-Q plot. Additionally, the deviance information criterion (DIC) also shows that the proposed model has a significant improvement in model fit over the others.
Funding: Supported by the Fundamental Research Funds for the Central Universities (Grant No. JBK2207075). The second author was supported by the National Natural Science Foundation of China (Grant Nos. 71991472, 12171395, 11931014 and 71532001), the Joint Lab of Data Science and Business Intelligence at Southwestern University of Finance and Economics, and the Fundamental Research Funds for the Central Universities (Grant No. JBK1806002). The fourth author was supported by the Humanity and Social Science Youth Foundation of Ministry of Education of China (Grant No. 19YJC790204).
Abstract: We propose a novel polynomial network autoregressive model that incorporates higher-order connected relationships to simultaneously model the effects of both direct and indirect connections. A quasi-maximum likelihood estimation method is proposed to estimate the unknown influence parameters, and we demonstrate its consistency and asymptotic normality without imposing any distributional assumption. Moreover, an extended Bayesian information criterion is proposed for order selection with a divergent upper order. The application of the proposed polynomial network autoregressive model is demonstrated through both simulation and real data analysis.