Climate change is one of the critical determinants affecting the life cycles and transmission of most infectious agents, including malaria, cholera, dengue fever, hand, foot, and mouth disease (HFMD), and the recent coronavirus pandemic. HFMD has been associated with a growing number of outbreaks resulting in fatal complications since the late 1990s. The outbreaks may result from a combination of rapid population growth, climate change, socioeconomic changes, and other lifestyle changes. However, the modeling of climate variability and HFMD remains unclear, particularly in statistical theory development. The statistical relationship between HFMD and climate factors has been widely studied using generalized linear and additive modeling. When dealing with time-series data with clustered variables, such as HFMD cases clustered by state, the independence assumption of both modeling approaches may be violated. Thus, a Generalized Additive Mixed Model (GAMM) is used to investigate the relationship between HFMD and climate factors in Malaysia. The model is improved by adding a first-order autoregressive term and treating all Malaysian states as a random effect. This method is preferred because it allows states to be modeled as random effects and accounts for autocorrelation in the time-series data. The findings indicate that climate variables such as rainfall and wind speed affect HFMD cases in Malaysia. The risk of HFMD increased in the subsequent two weeks with rainfall below 60 mm and decreased with rainfall exceeding 60 mm. In addition, wind speeds between 2 and 5 m/s at a two-week lag reduced the likelihood of HFMD. The results also show that HFMD cases rose in Malaysia during the inter-monsoon and southwest monsoon seasons but fell during the northeast monsoon. The study's outcomes can be used by public health officials and the general public to raise awareness and thus implement effective preventive measures.
A model of deformation resistance during hot strip rolling was established based on the generalized additive model. First, a data modeling method based on the generalized additive model was given. It covered the selection of the dependent and independent variables of the model, the link function of the dependent variable and the smoothing functional form of each independent variable, the estimation of the link function and smooth functions, and the final model modification. Then, a practical modeling test was carried out on a large amount of hot rolling process data. An integrated variable was proposed to reflect the effects of different chemical components such as carbon, silicon, manganese, nickel, chromium, and niobium. The integrated chemical composition, strain, strain rate, and rolling temperature were selected as independent variables, with the cubic spline as the smooth function for each. The deformation resistance model was built in SAS, and the influence curves of the independent variables on deformation resistance were obtained by the local scoring algorithm. Some interesting phenomena were found; for example, there is a critical value of strain rate, and the deformation resistance increases before this value and decreases after it. The results confirm that the new model has higher prediction accuracy than traditional ones and is suitable for carbon steel, microalloyed steel, alloyed steel, and other steel grades.
Funding: Supported by the National Natural Science Foundation of China (61273131), the 111 Project (B12018), the Innovation Project of Graduate in Jiangsu Province (CXZZ12_0741), and the Fundamental Research Funds for the Central Universities (JUDCF12034)
Abstract: Fault monitoring of a bioprocess is important for ensuring reactor safety and maintaining high product quality. Because it is difficult to build an accurate mechanistic model of a bioprocess, fault monitoring based on rich historical or online data is an effective alternative. The bootstrap method resamples a data set stochastically, which improves the generalization capability of the model. In this paper, an online fault monitoring method that combines generalized additive models (GAMs) with the bootstrap is proposed for the glutamate fermentation process. GAMs and the bootstrap are first used to determine confidence intervals from online and off-line data sampled under normal conditions in glutamate fermentation experiments. GAMs are then used for online fault monitoring of time, dissolved oxygen, oxygen uptake rate, and carbon dioxide evolution rate. The method can raise accurate fault alarms online and provides useful information for removing faults and abnormal phenomena in the fermentation.
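The bootstrap step described in this abstract can be illustrated with a minimal sketch. It assumes a deliberately simplified setting: a percentile bootstrap confidence interval for the mean of readings taken at one time point across normal batches, with a new reading flagged when it falls outside that interval. The function names, the sample values, and the choice of 2000 resamples are hypothetical and are not taken from the paper, which fits GAMs rather than a plain mean.

```python
import random
from statistics import mean


def bootstrap_ci(values, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of `values`."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(values)
    # Resample with replacement and record the mean of each resample.
    means = sorted(mean(rng.choices(values, k=n)) for _ in range(n_boot))
    lo = means[int((alpha / 2) * n_boot)]
    hi = means[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi


def is_fault(reading, lo, hi):
    """Flag a reading that falls outside the interval [lo, hi]."""
    return reading < lo or reading > hi


# Hypothetical dissolved-oxygen readings from normal batches at one time point.
normal_do = [4.8, 5.1, 5.0, 4.9, 5.2, 5.0, 4.7, 5.3]
lo, hi = bootstrap_ci(normal_do)
print(f"interval: [{lo:.3f}, {hi:.3f}]")
print("reading 6.0 flagged:", is_fault(6.0, lo, hi))
```

An interval for the mean is narrower than the spread of individual readings, so a production monitor would more plausibly bootstrap the fitted curve or its residuals; the sketch only shows the resampling mechanics.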
Abstract: In nonparametric regression, the GAM procedure is the most versatile of several new procedures. The methodology behind this procedure is more flexible than traditional parametric modeling tools. It relaxes the usual assumptions of parametric models and enables us to uncover structure, which may not be obvious otherwise, in the relationship between the independent variables and a dependent variable from the exponential family. In this paper, we discuss two methods of fitting the generalized additive logistic regression model, one based on the Newton-Raphson method and the other on the iterative weighted least squares method, for first- and second-order Taylor series expansions. The GAM procedure with the specified set of weights, using the local scoring algorithm, was applied to real-life data sets. The cubic spline smoother is applied to the independent variables. Based on nonparametric regression and smoothing techniques, this procedure provides powerful tools for data analysis.
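For orientation, the local scoring iteration for an additive logistic model is usually written as follows. This is the standard textbook form, given here as a sketch; the notation (the intercept alpha, the smooth functions f_j, the working response z_i, and the weights w_i) is an assumption and may differ from the notation used in the paper.

```latex
% Additive logistic model: one smooth function f_j per predictor.
\eta_i = \alpha + \sum_{j=1}^{p} f_j(x_{ij}), \qquad
p_i = \frac{e^{\eta_i}}{1 + e^{\eta_i}}

% One local scoring step: working response and weights from the current fit.
z_i = \eta_i + \frac{y_i - p_i}{p_i (1 - p_i)}, \qquad
w_i = p_i (1 - p_i)

% The f_j are then re-estimated by weighted backfitting of z_i on the
% predictors with weights w_i, and the two steps repeat until convergence.
```

With linear terms in place of the smooth functions, the same working response and weights give the iteratively reweighted least squares fit of ordinary logistic regression.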
Funding: Under the auspices of the National Natural Science Foundation of China (No. 41001363)
Abstract: This study aims to provide a predictive vegetation mapping approach based on spectral data, a digital elevation model (DEM), and Generalized Additive Models (GAMs). GAMs were used as a prediction tool to describe the relationship between vegetation and environmental variables as well as spectral variables. Based on the fitted GAMs, a probability map of species occurrence was generated, and the vegetation type of each grid cell was then defined according to the probability of species occurrence. Deviance analysis was employed to test the goodness of curve fitting, and drop contribution calculation was used to evaluate the contribution of each predictor in the fitted GAMs. The area under the curve (AUC) of the Receiver Operating Characteristic (ROC) curve was employed to assess the resulting probability maps. The results showed that: 1) the AUC values of the fitted GAMs are very high, which indicates that integrating spectral data and environmental variables through GAMs is a feasible way to map vegetation; 2) prediction accuracy varies with plant community, and communities with dense cover are better predicted than sparse ones; 3) both spectral and environmental variables play an important role in mapping the vegetation, although the contribution of the same predictor differs among the GAMs for different plant communities; 4) insufficient resolution of the spectral and environmental data, together with confounding effects of land use and other variables that are not closely related to environmental conditions, are the major causes of imprecision.
Funding: Project (51774219) supported by the National Natural Science Foundation of China
Abstract: This research develops a new mathematical modeling method that combines industrial big data and process mechanism analysis under the framework of generalized additive models (GAM) to generate a practical model with both generalization and precision. Specifically, the proposed modeling method includes the following steps. First, the influence factors are screened using mechanism knowledge and data-mining methods. Second, the unary GAM without interactions is built, which includes cleaning the data, building the sub-models, and verifying the sub-models. Subsequently, the interactions between the various factors are explored, and the binary GAM with interactions is constructed. The relationships among the sub-models are analyzed, and the integrated model is built. Finally, based on the proposed modeling method, two prediction models, one of mechanical properties and one of deformation resistance for hot-rolled strips, are established. Verification on actual industrial data demonstrates that the new models have good prediction precision: the mean absolute percentage errors of tensile strength, yield strength, and deformation resistance are 2.54%, 3.34%, and 6.53%, respectively. The experimental results suggest that the proposed method offers a new approach to industrial process modeling.
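The accuracy figures in this abstract are mean absolute percentage errors (MAPE). As a reminder of how that metric is computed, here is a short sketch; the function name and the sample values are hypothetical and are not data from the study.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Assumes every actual value is non-zero, since each error is
    divided by the corresponding actual value.
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Hypothetical measured vs. predicted tensile strengths in MPa.
measured = [100.0, 200.0]
predicted = [110.0, 190.0]
print(mape(measured, predicted))
```

Because each error is scaled by the measured value, MAPE allows quantities with different units, such as strength and deformation resistance, to be compared on one percentage scale.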
Funding: Under the auspices of the Project of National Natural Science Foundation of China (No. 41001363); Autonomous Project of State Key Laboratory of Resources and Environmental Information System, Geo-information Tupu Theory and Virtual Geoscience
Abstract: The Yellow River Delta contains typical littoral wetland ecosystems. To study the relationships between Tamarix chinensis and environmental variables and to predict the potential distribution of T. chinensis in the Yellow River Delta, 641 vegetation samples and 964 soil samples were collected in the area in October of 2004, 2005, 2006, and 2007. The contents of soil organic matter, total phosphorus, salt, and soluble potassium were determined. The analyzed data were then interpolated into spatial raster data by the Kriging interpolation method. The digital elevation model, soil type map, and landform unit map of the Yellow River Delta were also collected. Generalized Additive Models (GAMs) were employed to build a species-environment model and then simulate the potential distribution of T. chinensis. The results indicated that the distribution of T. chinensis was mainly limited by soil salt content, total soil phosphorus content, soluble potassium content, soil type, landform unit, and elevation. The distribution probability of T. chinensis was produced with a lookup table generated by the Grasp Module (based on GAMs) in ArcView GIS 3.2. The AUC (Area Under Curve) values of the ROC (Receiver Operating Characteristic) curve for validation and cross-validation were both higher than 0.8, which suggested that the established model had high precision for predicting species distribution.
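The AUC values quoted here have a simple pairwise interpretation: the probability that a randomly chosen presence site receives a higher predicted probability than a randomly chosen absence site. The sketch below computes AUC directly from that definition; the function name and the toy labels and scores are hypothetical and do not come from the study.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    `labels` holds 1 for presence and 0 for absence; `scores` holds the
    predicted occurrence probabilities. Ties count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one presence and one absence")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical validation sites: two presences, two absences.
print(auc([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.1]))
```

The double loop is quadratic in the number of sites, which is fine for illustration; a rank-based formula gives the same value more efficiently on large validation sets.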
Funding: The National Natural Science Foundation of China under contract No. 41506136; the Scientific Research Foundation of Third Institute of Oceanography, SOA under contract No. 2015005
Abstract: In this study, the horizontal and vertical distribution of primary production (PP) and its monthly variations were described based on field data collected from Daya Bay from January to December 2016. The relationships between PP and environmental factors were analyzed using a generalized additive model (GAM). Significant seasonal differences were observed in the horizontal distribution of PP, while the vertical distribution showed a relatively consistent unimodal pattern. The monthly average PP (calculated as carbon) ranged from 48.03 to 390.56 mg/(m²·h), with an annual average of 182.77 mg/(m²·h). The highest PP was observed in May and the lowest in November. The overall seasonal trend in PP was spring > summer > winter > autumn, and spring PP was approximately three times autumn PP. GAM analysis revealed that temperature, bottom salinity, phytoplankton, and photosynthetically active radiation (PAR) had no significant relationship with PP, while longitude, depth, surface salinity, chlorophyll a (Chl a), and transparency were significantly correlated with PP. Overall, the results indicate that monsoonal changes and terrestrial and offshore water systems have crucial effects on the environmental factors associated with PP changes.
Funding: Funded by the National Science Centre, Poland, under the project "Assessment of the impact of weather conditions on forest health status and forest disturbances at regional and national scale based on the integration of ground and space-based remote sensing datasets" (project no. 2021/41/B/ST10/). Data collection and research were also supported by project no. EZ.271.3.19.2021, "Modele ryzyka zamierania drzewostanow glownych gatunkow lasotworczych Polski", funded by the General Directorate of State Forests in Poland.
Abstract: Over the past decade, the presence of mistletoe (Viscum album ssp. austriacum) in Scots pine stands has increased in many European countries. Understanding the factors that influence the occurrence of mistletoe in stands is key to making appropriate forest management decisions to limit damage and prevent the spread of mistletoe in the future. Therefore, the main objective of this study was to determine the probability of mistletoe occurrence in Scots pine stands in relation to stand-related endogenous factors such as age, top height, and stand density, as well as topographic and edaphic factors. We used unmanned aerial vehicle (UAV) imagery from 2,247 stands to detect mistletoe in Scots pine stands, while the majority of stand and site characteristics were calculated from airborne laser scanning (ALS) data. Information on stand age and site type from the State Forest database was also used. We found that mistletoe infestation in Scots pine stands is influenced by stand and site characteristics. We documented that the densest, tallest, and oldest stands were more susceptible to mistletoe infestation. Site type and specific microsite conditions associated with topography were also important factors driving mistletoe occurrence. In addition, climatic water balance was a significant factor in increasing the probability of mistletoe occurrence, which is important in the context of predicted temperature increases associated with climate change. Our results are important for better understanding patterns of mistletoe infestation and ecosystem functioning under climate change. In an era of climate change and technological development, the use of remote sensing methods to determine the risk of mistletoe infestation can be a very useful tool for managing forest ecosystems to maintain forest sustainability and prevent forest disturbance.
Abstract: Count data that exhibit overdispersion (the variance of the counts is larger than the mean) are commonly analyzed using discrete distributions such as the negative binomial, the Poisson inverse Gaussian, and other models. The Poisson distribution is characterized by equality of mean and variance, whereas the negative binomial and the Poisson inverse Gaussian have variance larger than the mean and are therefore more appropriate for modeling overdispersed count data. As an alternative to these two models, we use the generalized Poisson distribution for group comparisons in the presence of multiple covariates. This problem is known as analysis of covariance (ANCOVA) and has been solved for continuous data. Our objectives were to develop ANCOVA using the generalized Poisson distribution and to compare its goodness of fit to that of nonparametric Generalized Additive Models. We used real-life data to show that the model performs quite satisfactorily when compared to the nonparametric Generalized Additive Models.
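The definition of overdispersion given in this abstract (variance larger than the mean) suggests a quick descriptive check before choosing a count model: the ratio of sample variance to sample mean, which is close to 1 for Poisson-like data. The sketch below is only that informal check, not a formal test and not the method of the paper; the function name and the sample counts are hypothetical.

```python
from statistics import mean, variance


def dispersion_index(counts):
    """Ratio of sample variance to sample mean for a list of counts.

    Values well above 1 suggest overdispersion relative to the Poisson,
    for which the variance equals the mean.
    """
    m = mean(counts)
    if m == 0:
        raise ValueError("mean of counts is zero; index is undefined")
    return variance(counts) / m


# Hypothetical counts with many zeros and a few large values.
counts = [0, 1, 0, 5, 10, 0, 2, 8]
print(round(dispersion_index(counts), 3))
```

This ratio ignores covariates, so it can overstate overdispersion when the mean varies across groups; a model-based check would use residual deviance instead.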
Abstract: This paper covers predicting high-resolution electricity peak demand features given lower-resolution data. This is a relevant setup, as it answers whether limited higher-resolution monitoring helps to estimate future high-resolution peak loads when the high-resolution data is no longer available. That question is particularly interesting for network operators considering replacing high-resolution monitoring with predictive models for economic reasons. We propose models to predict half-hourly minima and maxima of high-resolution (every minute) electricity load data while the model inputs are of a lower resolution (30 min). We combine predictions of generalized additive models (GAM) and deep artificial neural networks (DNN), which are popular in load forecasting. We extensively analyze the prediction models, including the importance of the input parameters, focusing on load, weather, and seasonal effects. The proposed method won a data competition organized by Western Power Distribution, a British distribution network operator. In addition, we provide a rigorous evaluation study that goes beyond the competition frame to analyze the models' robustness. The results show that the proposed methods are superior to the competition benchmark in terms of the out-of-sample root mean squared error (RMSE). This holds for both the competition month and the supplementary evaluation study, which covers an additional eleven months. Overall, our proposed model combination reduces the out-of-sample RMSE by 57.4% compared to the benchmark.
Funding: Supported by the National Basic Research and Development (973) Program of China (2010CB428402) and the China Meteorological Administration Special Public Welfare Research Fund (GYHY200706001).
Abstract: A probabilistic precipitation forecasting model using generalized additive models (GAMs) and Bayesian model averaging (BMA) is proposed in this paper. GAMs were used to fit spatial-temporal precipitation models to individual ensemble member forecasts. The distributions of precipitation occurrence and cumulative precipitation amount were represented simultaneously by a single Tweedie distribution. BMA was then used as a post-processing method to combine the individual models into a more skillful probabilistic forecasting model. The mixing weights were estimated using the expectation-maximization algorithm. Residual diagnostics were used to examine whether the fitted BMA forecasting model had fully captured the spatial and temporal variations of precipitation. The proposed method was applied to daily observations in the Yishusi River basin for July 2007 using the National Centers for Environmental Prediction ensemble forecasts. The BMA forecasts were verified with scoring rules and performed better than the empirical probabilistic ensemble forecasts, particularly for extreme precipitation. Finally, possible improvements and the application of this method to the downscaling of climate change scenarios are discussed.
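The step "mixing weights were estimated using the expectation-maximization algorithm" can be sketched as below. This is a simplified illustration, not the authors' implementation: it assumes Gaussian component densities with a fixed, known spread `sigma` in place of the Tweedie distribution used in the paper, and the function names and toy data are hypothetical.

```python
import math

def normal_pdf(y, mu, sigma):
    """Density of a normal distribution (stand-in for the paper's Tweedie)."""
    z = (y - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def em_bma_weights(obs, member_forecasts, sigma=1.0, n_iter=200, tol=1e-10):
    """Estimate BMA mixing weights by EM with fixed component densities.

    obs: list of n observations.
    member_forecasts: list of K lists, each holding n forecasts.
    """
    n, k = len(obs), len(member_forecasts)
    weights = [1.0 / k] * k
    # Component densities do not change across iterations, so compute them once.
    dens = [[normal_pdf(obs[i], member_forecasts[j][i], sigma) for j in range(k)]
            for i in range(n)]
    for _ in range(n_iter):
        # E-step: posterior probability that member j generated observation i.
        resp = []
        for row in dens:
            mix = sum(w * d for w, d in zip(weights, row))
            resp.append([w * d / mix for w, d in zip(weights, row)])
        # M-step: each new weight is the average responsibility of its member.
        new_weights = [sum(r[j] for r in resp) / n for j in range(k)]
        change = max(abs(a - b) for a, b in zip(new_weights, weights))
        weights = new_weights
        if change < tol:
            break
    return weights

# Toy example: member 0 matches the observations, member 1 is biased by +2.
obs = [1.0, 2.0, 3.0, 4.0]
members = [[1.0, 2.0, 3.0, 4.0], [3.0, 4.0, 5.0, 6.0]]
print(em_bma_weights(obs, members))
```

In the toy example the unbiased member receives almost all of the weight. A fuller implementation would also update the spread parameters in the M-step.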
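Abstract: Seamounts are one of the important types of seafloor habitat and are hotspots for research on marine biodiversity. Yellowfin tuna (Thunnus albacares) is widely distributed in the western and central Pacific Ocean and has very high ecological and economic value; however, few studies have examined how seamounts and their associated characteristics affect the abundance and distribution of yellowfin tuna. Based on longline and purse-seine fishery data for 2010 to 2021 compiled by the Western and Central Pacific Fisheries Commission (WCPFC), combined with seamount characteristic data, a generalized additive model (GAM) was used to analyze the relationship between yellowfin tuna catch per unit effort (CPUE) and seamount characteristics for the two fishing methods. The results show that yellowfin tuna catches in both fisheries in the western and central Pacific came mainly from seamount areas, and that seamount characteristics had highly significant effects on yellowfin tuna CPUE in both fisheries (P<0.001). In the longline fishery, higher CPUE occurred in areas where summit depth, roughness, basal area, and seamount density were smaller and slopes were gentler; in the purse-seine fishery, higher CPUE occurred in areas where roughness was lower, summit depth was greater, basal area was larger, and seamounts were steeper and denser. The study explores the mechanisms by which seamount characteristics in the western and central Pacific affect different groups of yellowfin tuna, and provides a reference and new ideas for further exploring the relationship between changes in yellowfin tuna population distribution and abundance and the marine environment.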
Funding: This work was supported by the Ministry of Higher Education, Malaysia under the Fundamental Research Grant Scheme FRGS/1/2020/STG06/UTM/02/3 (5F311), the Research University Grant with vote no. QJ130000.3854.19J58, and the Zamalah UTM Scholarship under Universiti Teknologi Malaysia.
Abstract: Climate change is one of the critical determinants affecting the life cycles and transmission of most infectious agents, including malaria, cholera, dengue fever, hand, foot, and mouth disease (HFMD), and the recent coronavirus pandemic. HFMD has been associated with a growing number of outbreaks resulting in fatal complications since the late 1990s. The outbreaks may result from a combination of rapid population growth, climate change, socioeconomic changes, and other lifestyle changes. However, the modeling of climate variability and HFMD remains unclear, particularly in statistical theory development. The statistical relationship between HFMD and climate factors has been widely studied using generalized linear and additive modeling. When dealing with time-series data with clustered variables, such as HFMD with clustered states, the independence assumption of both modeling approaches may be violated. Thus, a Generalized Additive Mixed Model (GAMM) is used to investigate the relationship between HFMD and climate factors in Malaysia. The model is improved by using a first-order autoregressive term and treating all Malaysian states as a random effect. This method is preferred because it allows states to be modeled as random effects and accounts for autocorrelation in the time-series data. The findings indicate that climate variables such as rainfall and wind speed affect HFMD cases in Malaysia. The risk of HFMD increased in the subsequent two weeks with rainfall below 60 mm and decreased with rainfall exceeding 60 mm. In addition, wind speeds between 2 and 5 m/s at a two-week lag reduced the risk of HFMD. The results also show that HFMD cases rose in Malaysia during the inter-monsoon and southwest monsoon seasons but fell during the northeast monsoon. The study's outcomes can be used by public health officials and the general public to raise awareness and thus implement effective preventive measures.
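Two ingredients mentioned in this abstract, a two-week lagged climate covariate and first-order autocorrelation, can be illustrated with a short sketch. This is not a GAMM fit: it only shows how a lagged series is built and how a lag-1 autocorrelation coefficient might be estimated from model residuals. The function names and the weekly rainfall values are hypothetical.

```python
def lagged(series, lag):
    """Shift a series so position t holds the value from time t - lag.

    The first `lag` positions have no predecessor and are filled with None.
    """
    if lag < 0:
        raise ValueError("lag must be non-negative")
    return [None] * lag + list(series[:len(series) - lag])

def ar1_coefficient(residuals):
    """Moment estimate of the lag-1 autocorrelation of a residual series."""
    n = len(residuals)
    mean = sum(residuals) / n
    centered = [r - mean for r in residuals]
    numerator = sum(centered[t] * centered[t - 1] for t in range(1, n))
    denominator = sum(c * c for c in centered)
    return numerator / denominator

# Hypothetical weekly rainfall totals (mm), for illustration only.
rainfall = [55.0, 72.0, 40.0, 65.0, 58.0, 80.0]
rainfall_lag2 = lagged(rainfall, 2)  # covariate aligned with cases two weeks later
print(rainfall_lag2)
```

A coefficient far from zero on the residuals of an independence model would be one signal that an autoregressive term, as used in the paper, is worth including.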
Funding: Supported by the National Natural Science Foundation of China (51774219), the Science and Technology Research Program of Hubei Ministry of Education (D20161103), and the Youth Science and Technology Program of Wuhan (2016070204010099).
Abstract: A model of deformation resistance during hot strip rolling was established based on a generalized additive model. First, a data modeling method based on the generalized additive model was given. It included the selection of the dependent and independent variables of the model, the link function of the dependent variable and the smoothing functional form of each independent variable, the estimation of the link function and smooth functions, and the final model modification. Then, a practical modeling test was carried out on a large amount of hot rolling process data. An integrated variable was proposed to reflect the effects of different chemical compositions such as carbon, silicon, manganese, nickel, chromium, and niobium. The integrated chemical composition, strain, strain rate, and rolling temperature were selected as independent variables, with the cubic spline as the smooth function for each. The modeling of deformation resistance was carried out in SAS software, and the influence curves of the independent variables on deformation resistance were obtained with the local scoring algorithm. Some interesting phenomena were found; for example, there is a critical value of strain rate, below which the deformation resistance increases and above which it decreases. The results confirm that the new model has higher prediction accuracy than traditional ones and is suitable for carbon steel, microalloyed steel, alloyed steel, and other steel grades.
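The additive structure described in this abstract (one smooth function per independent variable) is commonly estimated by backfitting. The sketch below is a simplified illustration under stated assumptions: it uses an identity link and a running-mean smoother rather than the cubic spline smoother and local scoring algorithm in SAS that the paper reports, and the two predictors and response are synthetic, not rolling data.

```python
import math

def running_mean_smoother(x, r, span=5):
    """Smooth r against x by averaging up to `span` neighbours in x order."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    half = span // 2
    fitted = [0.0] * len(x)
    for pos, i in enumerate(order):
        lo = max(0, pos - half)
        hi = min(len(x), pos + half + 1)
        window = [r[order[p]] for p in range(lo, hi)]
        fitted[i] = sum(window) / len(window)
    return fitted

def backfit(columns, y, n_iter=20, span=5):
    """Fit y ~ alpha + sum_j f_j(x_j) by cycling over partial residuals."""
    n = len(y)
    alpha = sum(y) / n
    funcs = [[0.0] * n for _ in columns]
    for _ in range(n_iter):
        for j, x in enumerate(columns):
            # Partial residual: remove the intercept and all other components.
            partial = [y[i] - alpha
                       - sum(funcs[k][i] for k in range(len(columns)) if k != j)
                       for i in range(n)]
            f = running_mean_smoother(x, partial, span)
            mean_f = sum(f) / n
            funcs[j] = [v - mean_f for v in f]  # centre for identifiability
    fitted = [alpha + sum(f[i] for f in funcs) for i in range(n)]
    return alpha, funcs, fitted

# Synthetic data: two predictors with different smooth effects (assumed shapes).
n = 60
x1 = [i / (n - 1) for i in range(n)]
x2 = [((i * 17) % n) / (n - 1) for i in range(n)]
y = [math.sin(2 * math.pi * a) + 4 * (b - 0.5) ** 2 for a, b in zip(x1, x2)]

alpha, funcs, fitted = backfit([x1, x2], y)
rss = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
tss = sum((yi - alpha) ** 2 for yi in y)
print(rss / tss)
```

Each entry of `funcs` plays the role of an influence curve: plotting `funcs[j]` against the corresponding predictor shows that variable's estimated effect with the others held fixed.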