Funding: Supported by the National Natural Science Foundation of China (10901162), the Fundamental Research Funds for the Central Universities and the Research Funds of Renmin University of China (10XNF073), and the China Postdoctoral Science Foundation (2014M550799).
Abstract: In this paper, we investigate the variable selection problem for generalized regression models. To estimate the regression parameter, we develop a procedure that combines the rank correlation method with the adaptive lasso technique and prove that it has the oracle properties. We propose a modified iterative marginal optimization (IMO) algorithm that directly maximizes the penalized rank correlation function. Simulation studies illustrate the performance of the estimation procedure.
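The abstract does not spell out the penalized rank correlation function, so the following is only a minimal sketch of what such an objective might look like. It assumes a Han-type rank correlation (the share of ordered pairs whose responses and linear indices agree in order) and a weighted L1 penalty in the adaptive lasso style. The names `rank_correlation`, `penalized_objective`, `lam`, and `weights` are hypothetical, and the toy data are made up for illustration.

```python
def linear_index(beta, row):
    """Linear index x'beta for one observation."""
    return sum(b * x for b, x in zip(beta, row))


def rank_correlation(beta, X, y):
    """Share of ordered pairs (i, j) with y_i > y_j and x_i'beta > x_j'beta.

    Assumed form of the rank correlation criterion; it is not taken
    from the paper. With no ties its maximum value is 0.5.
    """
    n = len(y)
    index = [linear_index(beta, row) for row in X]
    concordant = sum(
        1
        for i in range(n)
        for j in range(n)
        if y[i] > y[j] and index[i] > index[j]
    )
    return concordant / (n * (n - 1))


def penalized_objective(beta, X, y, lam, weights):
    """Rank correlation minus a weighted L1 penalty (to be maximized).

    `weights` plays the role of adaptive lasso weights, which are usually
    derived from an initial estimate; here they are supplied by hand.
    """
    penalty = lam * sum(w * abs(b) for w, b in zip(weights, beta))
    return rank_correlation(beta, X, y) - penalty


# Toy data: the response increases with the first covariate only.
X = [[1.0, 0.3], [2.0, -0.1], [3.0, 0.8], [4.0, 0.2]]
y = [1.0, 2.0, 3.0, 4.0]
weights = [1.0, 5.0]  # hypothetical: heavier penalty on the second coefficient

sparse = penalized_objective([1.0, 0.0], X, y, lam=0.1, weights=weights)
dense = penalized_objective([1.0, 0.5], X, y, lam=0.1, weights=weights)
print(sparse, dense)
```

Both candidate vectors order the observations identically, so the penalty alone separates them and the sparse one scores higher. Because the rank correlation is unchanged when `beta` is multiplied by a positive constant, a practical implementation would also need a normalization such as fixing one coefficient.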
Funding: The National Natural Science Foundation of China (No. 11171065) and the Natural Science Foundation of Jiangsu Province (No. BK2011058).
Abstract: To check whether data conform to a given model, statistical diagnostics are needed. This paper considers the diagnostic problem in generalized nonlinear models based on maximum Lq-likelihood estimation. Three diagnostic statistics are used to detect whether outliers exist in the data set. Simulation results show that when the sample size is small, the values of the diagnostic statistics based on maximum Lq-likelihood estimation are greater than those based on maximum likelihood estimation. As the sample size increases, the difference between the values obtained from the two estimation methods gradually diminishes. This means that outliers can be distinguished more easily with the maximum Lq-likelihood method than with the maximum likelihood method.
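The abstract does not define the Lq-likelihood. The sketch below assumes the definition commonly used in the Lq-likelihood literature, in which the logarithm is replaced by Lq(u) = (u^(1-q) - 1) / (1 - q) for q ≠ 1, which reduces to log(u) as q approaches 1. The function names and the value q = 0.9 are illustrative assumptions, not values from the paper.

```python
import math


def lq(u, q):
    """Lq transform of a positive density value u.

    Assumed definition: (u**(1 - q) - 1) / (1 - q) for q != 1,
    and the natural logarithm for q == 1.
    """
    if u <= 0:
        raise ValueError("u must be positive")
    if q == 1.0:
        return math.log(u)
    return (u ** (1.0 - q) - 1.0) / (1.0 - q)


def lq_likelihood(densities, q):
    """Sum of Lq-transformed density values over a sample."""
    return sum(lq(f, q) for f in densities)


# A very small density value, as an outlier would produce under the model.
outlier_density = 1e-6
print(lq(outlier_density, 0.9), math.log(outlier_density))
```

With q < 1 the transform is bounded below by -1 / (1 - q), so a single observation with a tiny density pulls the criterion down far less than it pulls down the log-likelihood.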
Funding: This research was funded by the National Natural Science Foundation of China (grant no. 32271881).
Abstract: Forest fires are natural disasters that can occur suddenly and be very damaging, burning thousands of square kilometers. Prevention is better than suppression, and prediction models of forest fire occurrence were developed using the logistic regression model, the geographically weighted logistic regression model, the Lasso regression model, the random forest model, and the support vector machine model, based on historical forest fire data from 2000 to 2019 in Jilin Province. The models, along with a distribution map, are presented in this paper to provide a theoretical basis for forest fire management in this area. Existing studies show that the prediction accuracies of the two machine learning models are higher than those of the three generalized linear regression models. The accuracies of the random forest model, the support vector machine model, the geographically weighted logistic regression model, the Lasso regression model, and the logistic regression model were 88.7%, 87.7%, 86.0%, 85.0%, and 84.6%, respectively. Weather is the main factor affecting forest fires, while the impacts of topographic factors and of human and socioeconomic factors on fire occurrence were similar.
Funding: Supported by the National Natural Science Foundation of China (project numbers: 40905069, 41110104015) and the Chinese Center for Disease Control and Prevention Science Foundation for Youth (project number: 2011A206).
Abstract: Objective: To obtain the exposure-response relationship for temperature and mortality, and to assess the risk of heat-related premature death. Methods: A statistical model was developed using a Poisson generalized linear regression model with Beijing mortality and temperature data from October 1, 2006 to September 30, 2008. We calculated the exposure-response relationship for temperature and mortality in the central city and in the inner suburban and outer suburban regions. Based on this relationship, a health risk model was used to assess the risk of heat-related premature death in the summer (June to August) of 2009. Results: The population in the outer suburbs had the highest temperature-related mortality risk. People in the central city had a mid-range risk, while people in the inner suburbs had the lowest risk. Risk assessment predicted that the number of heat-related premature deaths in the summer of 2009 was 1581. The city areas of Chaoyang and Haidian districts had the highest number of premature deaths. The number of premature deaths in the southern areas of Beijing (Fangshan, Fengtai, Daxing, and Tongzhou districts) was in the mid-range. Conclusion: Ambient temperature significantly affects human mortality in Beijing. People in the city and outer suburban areas have a higher temperature-related mortality risk than people in the inner suburban area. This may be explained by a temperature-related vulnerability. Key words: Temperature; Mortality; Premature death; Health risk; Generalized linear regression model; Climate change
Abstract: Background: Diabetes mellitus is a chronic metabolic disease that is a risk factor for epidemic pathologies. Under hyperglycemic conditions, the enzyme aldose reductase catalyzes the formation of sorbitol in the metabolism of glucose via polyols, leading to the development of diabetic complications. Therefore, inhibitors of this enzyme are therapeutic candidates for the prophylaxis and treatment of these conditions. Methods: In this study, a generalized linear regression model was developed to analyze flavonoids (obtained from a database) that have been tested as inhibitors of aldose reductase. The molecular descriptors implemented in DRAGON and MATLAB software were used to determine the correlation between the chemical structure of the inhibitors and their pharmacological activity. The model was validated according to the Organisation for Economic Co-operation and Development standards and subsequently used for the virtual screening of the flavonoids identified in Jatropha gossypiifolia L. Results: The proposed model showed a good fit for its statistical parameters (R² = 0.95). In addition, it showed good predictive power (R²_ext = 0.94) and robustness (Q²_LOO = 0.92). The experimental chemical space wherein the predictions were reliable (domain of application) was also defined. Finally, the model was used to identify 10 flavonoids from Jatropha gossypiifolia L. as candidates for natural drugs. Compounds with a low probability of oral absorption were identified, among which the ellagic acid biflavonoid showed the greatest promise (predicted pIC50 = 9.75). Conclusion: The species Jatropha gossypiifolia L. harbors flavonoids with high potential as inhibitors of the aldose reductase enzyme, among which the biflavonoid ellagic acid was the most promising, suggesting its possible use in the treatment of the late complications of diabetes mellitus.
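The fit and robustness statistics quoted in this abstract (R² and the leave-one-out Q²) can be illustrated with a small sketch. It assumes an ordinary least-squares line with a single descriptor and the common convention Q²_LOO = 1 - PRESS / SS_tot, with SS_tot computed around the full-sample mean. The study's actual model uses several molecular descriptors, and the data below are made-up toy values, not results from the paper.

```python
def fit_line(xs, ys):
    """Ordinary least-squares intercept and slope for one descriptor."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def total_sum_of_squares(ys):
    """Sum of squared deviations of ys from their mean."""
    mean_y = sum(ys) / len(ys)
    return sum((y - mean_y) ** 2 for y in ys)


def r_squared(xs, ys):
    """Coefficient of determination of the line fitted to all points."""
    intercept, slope = fit_line(xs, ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return 1.0 - ss_res / total_sum_of_squares(ys)


def q_squared_loo(xs, ys):
    """Leave-one-out Q2: refit without each point, then predict that point."""
    press = 0.0
    for i in range(len(xs)):
        intercept, slope = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (intercept + slope * xs[i])) ** 2
    return 1.0 - press / total_sum_of_squares(ys)


# Toy descriptor values and activities (illustrative only).
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0]
activity = [1.1, 1.9, 3.2, 3.9, 5.1]
print(r_squared(descriptor, activity), q_squared_loo(descriptor, activity))
```

Under this convention Q²_LOO never exceeds R² for a least-squares fit, because each left-out prediction error is at least as large in magnitude as the corresponding in-sample residual. A small gap between the two is usually read as a sign of robustness.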