This study presents a predictive vegetation mapping approach based on spectral data, a digital elevation model (DEM) and Generalized Additive Models (GAMs). GAMs were used as a prediction tool to describe the relationship between vegetation and both environmental and spectral variables. Based on the fitted GAMs, a probability map of species occurrence was generated, and the vegetation type of each grid cell was then defined according to the probability of species occurrence. Deviance analysis was employed to test the goodness of curve fitting, and drop contribution calculation was used to evaluate the contribution of each predictor in the fitted GAMs. The area under the curve (AUC) of the Receiver Operating Characteristic (ROC) curve was employed to assess the resulting probability maps. The results showed that: 1) The AUC values of the fitted GAMs are very high, indicating that integrating spectral data and environmental variables through GAMs is a feasible way to map vegetation. 2) Prediction accuracy varies with plant community, and communities with dense cover are better predicted than sparse plant communities. 3) Both spectral variables and environmental variables play an important role in mapping the vegetation. However, the contribution of the same predictor differs among the GAMs for different plant communities. 4) Insufficient resolution of the spectral and environmental data, together with confounding effects of land use and other variables that are not closely related to the environmental conditions, are the major causes of imprecision.
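The AUC criterion used above to assess the probability maps can be illustrated with a minimal sketch. It uses the rank (Mann-Whitney) formulation of the AUC: the fraction of presence/absence pairs in which the presence site receives the higher predicted probability, with ties counted as one half. The observations and probabilities below are hypothetical and are not taken from the study.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form).

    labels: 1 for presence, 0 for absence.
    scores: predicted probabilities of occurrence.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one presence and one absence")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # presence ranked above absence
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Hypothetical validation data: observed presence/absence and
# the probabilities predicted by a fitted model.
observed = [1, 1, 1, 0, 0, 0]
predicted = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
print(round(auc(observed, predicted), 3))
```

An AUC of 0.5 corresponds to a model that ranks sites no better than chance, and 1.0 to a perfect ranking.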
Background: Measurements of tree heights and diameters are essential in forest assessment and modelling. Tree heights are used for estimating timber volume, site index and other important variables related to forest growth and yield, succession and carbon budget models. However, the diameter at breast height (dbh) can be obtained more accurately and at lower cost than total tree height. Hence, generalized height-diameter (h-d) models that predict tree height from dbh, age and other covariates are needed. For a more flexible but biologically plausible estimation of covariate effects, we use shape constrained generalized additive models as an extension of existing h-d model approaches. We use causal site parameters such as the index of aridity to enhance the generality and causality of the models and to enable predictions under projected changing climatic conditions. Methods: We develop unconstrained generalized additive models (GAM) and shape constrained generalized additive models (SCAM) to investigate the possible effects of tree-specific parameters, such as tree age and relative diameter at breast height, and site-specific parameters, such as the index of aridity and the sum of daily mean temperature during the vegetation period, on the h-d relationship of forests in Lower Saxony, Germany. Results: Some of the derived effects, e.g. the effects of age, index of aridity and sum of daily mean temperature, show significantly non-linear patterns. The need for SCAM results from the fact that some of the model effects show partially implausible patterns, especially at the boundaries of the data ranges. The derived model predicts monotonically increasing tree height with increasing age and temperature sum and with decreasing aridity and social rank of a tree within a stand. The definition of constraints leads to only a marginal or minor decline in model statistics such as AIC. An observed structured spatial trend in tree height is modelled via 2-dimensional surface fitting. Conclusions: We demonstrate that the SCAM approach allows optimal regression modelling flexibility similar to the standard GAM, but with the additional possibility of defining specific constraints for the model effects. The longitudinal character of the model allows for tree height imputation for the current status of forests as well as for future tree height prediction.
This research develops a new mathematical modeling method that combines industrial big data and process mechanism analysis under the framework of generalized additive models (GAM) to generate a practical model with both generalization and precision. Specifically, the proposed modeling method includes the following steps. Firstly, the influence factors are screened using mechanism knowledge and data-mining methods. Secondly, the unary GAM without interactions is constructed, which includes cleaning the data, building the sub-models, and verifying the sub-models. Subsequently, the interactions between the various factors are explored, and the binary GAM with interactions is constructed. The relationships among the sub-models are analyzed, and the integrated model is built. Finally, based on the proposed modeling method, two prediction models of mechanical property and deformation resistance for hot-rolled strips are established. Verification with actual industrial data demonstrates that the new models have good prediction precision: the mean absolute percentage errors of tensile strength, yield strength and deformation resistance are 2.54%, 3.34% and 6.53%, respectively. The experimental results suggest that the proposed method offers a new approach to industrial process modeling.
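The mean absolute percentage error (MAPE) reported above is a standard accuracy measure: the average of |measured - predicted| / |measured|, expressed in percent. A minimal sketch follows; the strength values are hypothetical and only show how the figure is computed.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when a measured value is zero")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Hypothetical measured vs. modelled tensile strengths (MPa).
measured = [400.0, 500.0, 250.0]
modelled = [390.0, 510.0, 255.0]
print(round(mape(measured, modelled), 2))
```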
Fault monitoring of a bioprocess is important to ensure the safety of the reactor and to maintain high product quality. It is difficult to build an accurate mechanistic model for a bioprocess, so fault monitoring based on a rich historical or online database is an effective alternative. With the bootstrap method, a data set can be resampled stochastically, which improves the generalization capability of the model. In this paper, online fault monitoring that combines generalized additive models (GAMs) with the bootstrap is proposed for the glutamate fermentation process. GAMs and the bootstrap are first used to determine confidence intervals based on online and off-line data sampled under normal conditions in glutamate fermentation experiments. GAMs are then used for online fault monitoring of time, dissolved oxygen, oxygen uptake rate, and carbon dioxide evolution rate. The method can provide accurate online fault alarms and yields useful information for removing faults and abnormal phenomena in the fermentation.
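The bootstrap step described above can be sketched as follows. This is a simplified illustration, not the procedure of the paper: it omits the GAM smoothing entirely and only shows how resampling normal-condition data with replacement yields a percentile confidence interval for the mean of a monitored variable at a given time point, against which a new reading is checked. The function names, the readings and the alarm rule are assumptions made for the example.

```python
import random


def bootstrap_interval(values, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of `values`."""
    rng = random.Random(seed)          # fixed seed for reproducibility
    n = len(values)
    means = sorted(
        sum(rng.choice(values) for _ in range(n)) / n   # one resample mean
        for _ in range(n_boot)
    )
    lower = means[int((alpha / 2) * n_boot)]
    upper = means[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


def is_fault(reading, lower, upper):
    """Hypothetical alarm rule: flag readings outside the interval."""
    return reading < lower or reading > upper


# Hypothetical dissolved-oxygen readings (%) from normal batches
# at the same fermentation time.
normal_runs = [31.2, 29.8, 30.5, 30.9, 29.5, 30.1, 31.0, 30.4]
low, high = bootstrap_interval(normal_runs)
print(is_fault(100.0, low, high))
```

In practice the interval would be built around the GAM-fitted trajectory rather than a single time point.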
In this study, the horizontal and vertical distribution of primary production (PP) and its monthly variations were described based on field data collected from Daya Bay in January–December of 2016. The relationships between PP and environmental factors were analyzed using a generalized additive model (GAM). Significant seasonal differences were observed in the horizontal distribution of PP, while the vertical distribution showed a relatively consistent unimodal pattern. The monthly average PP (calculated as carbon) ranged from 48.03 to 390.56 mg/(m²·h), with an annual average of 182.77 mg/(m²·h). The highest PP was observed in May and the lowest in November. Additionally, the overall trend in PP was spring > summer > winter > autumn, and spring PP was approximately three times autumn PP. GAM analysis revealed that temperature, bottom salinity, phytoplankton, and photosynthetically active radiation (PAR) had no significant relationships with PP, while longitude, depth, surface salinity, chlorophyll a (Chl a) and transparency were significantly correlated with PP. Overall, the results indicate that monsoonal changes and terrestrial and offshore water systems have crucial effects on the environmental factors associated with PP changes.
The Yellow River Delta contains typical littoral wetland ecosystems. To study the relationships between Tamarix chinensis and environmental variables and to predict the potential distribution of T. chinensis in the Yellow River Delta, 641 vegetation samples and 964 soil samples were collected in the area in October of 2004, 2005, 2006 and 2007. The contents of soil organic matter, total phosphorus, salt, and soluble potassium were determined. The analyzed data were then interpolated into spatial raster data by the Kriging interpolation method. The digital elevation model, soil type map and landform unit map of the Yellow River Delta were also collected. Generalized Additive Models (GAMs) were employed to build a species-environment model and then to simulate the potential distribution of T. chinensis. The results indicated that the distribution of T. chinensis was mainly limited by soil salt content, total soil phosphorus content, soluble potassium content, soil type, landform unit, and elevation. The distribution probability of T. chinensis was produced with a lookup table generated by the Grasp module (based on GAMs) in ArcView GIS 3.2. The AUC (Area Under Curve) values of the ROC (Receiver Operating Characteristic) curve for validation and cross-validation were both higher than 0.8, which suggested that the established model had a high precision for predicting species distribution.
The component additive modelling approach is based on summing the results from models already calibrated with pure mineral phases. The summation can occur as the sum of results for thermodynamic surface speciation models or as the sum of pseudo-thermodynamic models for adsorption on individual mineral phases. Static batch sorption experiments with 63Ni were carried out with different granitic rocks and component minerals. XRD analyses were used to calculate the percentage mineralogical composition of the granitic rocks. Sorption data were modelled using non-electrostatic correction models to obtain Rd for the granitic rocks and minerals. Rd values for the granitic rocks predicted from the component additive model were compared with experimental values. The results showed that predicted Rd values for granite adamellite, biotite granite and rapakivi granite were identical to the experimentally determined values, whereas for graphic granite and grey granite the predicted and experimentally determined Rd values differed considerably. The results also showed that feldspar made the greatest contribution to the bulk Rd, while quartz made the least.
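The summation at the heart of the component additive approach can be written as a short sketch. It assumes the simplest form, in which the bulk Rd of a rock is the sum of the Rd values of its component minerals weighted by their mass fractions; the abstract does not state the exact weighting, so this form, the mineral list and all numbers below are hypothetical.

```python
def component_additive_rd(mass_fractions, mineral_rd):
    """Bulk Rd as the mass-fraction-weighted sum of mineral Rd values.

    mass_fractions: mineral name -> fraction of the rock (must sum to 1).
    mineral_rd:     mineral name -> Rd measured on the pure mineral.
    """
    total = sum(mass_fractions.values())
    if abs(total - 1.0) > 1e-6:
        raise ValueError("mass fractions must sum to 1")
    return sum(frac * mineral_rd[name] for name, frac in mass_fractions.items())


# Hypothetical composition (e.g. from XRD) and pure-mineral Rd values (mL/g).
composition = {"quartz": 0.30, "feldspar": 0.60, "biotite": 0.10}
rd_pure = {"quartz": 2.0, "feldspar": 20.0, "biotite": 50.0}
print(round(component_additive_rd(composition, rd_pure), 2))
```

Comparing such a predicted bulk value with the Rd measured on the whole rock is the kind of check the abstract reports.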
Interpretability has drawn increasing attention in machine learning. Most works focus on post-hoc explanations rather than on building a self-explaining model. We therefore propose a Neural Partially Linear Additive Model (NPLAM), which automatically distinguishes insignificant, linear, and nonlinear features in neural networks. On the one hand, the neural network construction fits data better than spline functions with the same number of parameters; on the other hand, the learnable gate design and the sparsity regularization term maintain the ability of feature selection and structure discovery. We theoretically establish the generalization error bounds of the proposed method with Rademacher complexity. Experiments on both simulated and real-world datasets verify its good performance and interpretability.
A probabilistic precipitation forecasting model using generalized additive models (GAMs) and Bayesian model averaging (BMA) is proposed in this paper. GAMs were used to fit spatial-temporal precipitation models to individual ensemble member forecasts. The distributions of precipitation occurrence and of the cumulative precipitation amount were represented simultaneously by a single Tweedie distribution. BMA was then used as a post-processing method to combine the individual models into a more skillful probabilistic forecasting model. The mixing weights were estimated using the expectation-maximization algorithm. Residual diagnostics were used to examine whether the fitted BMA forecasting model had fully captured the spatial and temporal variations of precipitation. The proposed method was applied to daily observations in the Yishusi River basin for July 2007 using the National Centers for Environmental Prediction ensemble forecasts. Verified with scoring rules, the BMA forecasts performed better than the empirical probabilistic ensemble forecasts, particularly for extreme precipitation. Finally, possible improvements and the application of this method to the downscaling of climate change scenarios are discussed.
In this paper, we study how to estimate the error density in the ultrahigh dimensional sparse additive model, where the number of variables is larger than the sample size. First, a smoothing method based on B-splines is applied to the estimation of the regression functions. Second, an improved two-stage refitted cross-validation (RCV) procedure with random splitting is used to obtain the residuals of the model, and the residual-based kernel method is then applied to estimate the error density function. Under suitable sparsity conditions, the large sample properties of the estimator, including weak and strong consistency, asymptotic normality and the law of the iterated logarithm, are obtained. In particular, the relationship between the sparsity and the convergence rate of the kernel density estimator is given. The methodology is illustrated by simulations and a real data example, which suggest that the proposed method performs well.
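The residual-based kernel method mentioned above can be illustrated with a minimal sketch of a kernel density estimator applied to model residuals. A Gaussian kernel and a fixed bandwidth are assumed here for simplicity; the paper's kernel, bandwidth choice and RCV procedure are not reproduced, and the residuals are hypothetical.

```python
import math


def kernel_density(residuals, x, bandwidth):
    """Gaussian kernel density estimate of the error density at point x."""
    if bandwidth <= 0:
        raise ValueError("bandwidth must be positive")
    n = len(residuals)
    norm = n * bandwidth * math.sqrt(2.0 * math.pi)
    return sum(
        math.exp(-0.5 * ((x - r) / bandwidth) ** 2) for r in residuals
    ) / norm


# Hypothetical residuals from a fitted additive model.
residuals = [-0.8, -0.3, -0.1, 0.0, 0.2, 0.4, 0.9]
for point in (-1.0, 0.0, 1.0):
    print(point, round(kernel_density(residuals, point, bandwidth=0.5), 4))
```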
Background: Generalized height-diameter curves based on a re-parameterized version of the Korf function for Norway spruce (Picea abies (L.) Karst.), Scots pine (Pinus sylvestris L.) and silver birch (Betula pendula Roth) in Norway are presented. The Norwegian National Forest Inventory (NFI) is used as the database for estimating the model parameters. The derived models are developed to enable spatially explicit and site-sensitive tree height imputation in forest inventories as well as future tree height predictions in growth and yield scenario simulations. Methods: Generalized additive mixed models (gamm) are employed to detect and quantify potentially non-linear effects of predictor variables. In doing so, the quadratic mean diameter serves as longitudinal covariate, since stand age, as measured in the NFI, shows only a weak correlation with a stand's developmental status in Norwegian forests. Additionally, the models can be locally calibrated by predicting random effects if measured height-diameter pairs are available. Based on the model selection of unconstrained models, shape constrained additive models (scam) were fit to incorporate expert knowledge and intrinsic relationships by enforcing certain effect patterns such as monotonicity. Results: Model comparisons demonstrate that the shape constraints lead to only marginal differences in statistical characteristics but ensure reasonable model predictions. Under constant constraints, the developed models predict increasing tree heights with decreasing altitude, increasing soil depth and increasing competition pressure of a tree. A two-dimensional spatially structured effect of UTM coordinates accounts for the potential effects of large-scale spatially correlated covariates, which were not at our disposal. The main result of modelling the spatially structured effect is a lower tree height prediction for coastal sites and with increasing latitude. The quadratic mean diameter affects both the level and the slope of the height-diameter curve, and both effects are positive. Conclusions: In this investigation it is assumed that model effects in additive modelling of height-diameter curves that are unfeasible and too wiggly from an expert point of view are a result of quantitatively or qualitatively limited databases. However, this problem is not specific to our investigation but more general, since growth and yield data that are balanced over the whole data range with respect to all combinations of predictor variables are exceptional cases. Hence, scam may provide methodological improvements in several applications by combining the flexibility of additive models with expert knowledge.
A model of deformation resistance during hot strip rolling was established based on the generalized additive model. Firstly, a data modeling method based on the generalized additive model is given. It includes the selection of the dependent and independent variables of the model, the link function of the dependent variable and the smoothing functional form of each independent variable, the estimation of the link function and smooth functions, and the final model modification. Then, a practical modeling test was carried out based on a large amount of hot rolling process data. An integrated variable was proposed to reflect the effects of different chemical components such as carbon, silicon, manganese, nickel, chromium and niobium. The integrated chemical composition, strain, strain rate and rolling temperature were selected as independent variables, with the cubic spline as the smooth function for each of them. The modeling of deformation resistance was carried out in SAS software, and the influence curves of the independent variables on deformation resistance were obtained by the local scoring algorithm. Some interesting phenomena were found; for example, there is a critical value of strain rate, below which the deformation resistance increases and above which it decreases. The results confirm that the new model has higher prediction accuracy than traditional ones and is suitable for carbon steel, microalloyed steel, alloyed steel and other steel grades.
This paper considers partially linear additive models with a diverging number of parameters when some linear constraints on the parametric part are available. It proposes a constrained profile least-squares estimation for the parametric components, with the nonparametric functions estimated by basis function approximations. The consistency and asymptotic normality of the restricted estimator are established under certain conditions. The authors construct a profile likelihood ratio test statistic to test the validity of the linear constraints on the parametric components, and demonstrate that it asymptotically follows a chi-squared distribution under the null and alternative hypotheses. The finite sample performance of the proposed method is illustrated by simulation studies and a data analysis.
We propose a method which uses functional singular components to establish functional additive models. The proposed methodology reduces the curve regression problem to ordinary (i.e., scalar) additive regression problems of the singular components of the predictor process and response process. Consistency of the estimators for the nonparametric functions and for prediction is proved. A simulation study is conducted to investigate the finite sample performance of the proposed estimators.
In dealing with nonparametric regression, the GAM procedure is the most versatile of several new procedures. The methodology behind this procedure is more flexible than traditional parametric modeling tools. It relaxes the usual assumptions of the parametric model and enables us to uncover structure in the relationship between the independent variables and an exponential-family dependent variable that may not be obvious otherwise. In this paper, we discuss two methods of fitting the generalized additive logistic regression model, one based on the Newton-Raphson method and another based on the iteratively weighted least squares method, for first- and second-order Taylor series expansions. The GAM procedure with the specified set of weights, using the local scoring algorithm, was applied to real-life data sets. The cubic spline smoother is applied to the independent variables. Based on nonparametric regression and smoothing techniques, this procedure provides powerful tools for data analysis.
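The Newton-Raphson fitting step mentioned above can be sketched for the simplest case, a parametric logistic regression with one predictor. For the logistic model with its canonical link, the Newton-Raphson update coincides with iteratively weighted least squares, and the local scoring algorithm replaces the weighted linear fit inside this loop with weighted smoothers. The sketch below shows only that parametric building block, not the additive model of the paper; the data are hypothetical.

```python
import math


def fit_logistic(x, y, n_iter=25, tol=1e-10):
    """Newton-Raphson fit of P(y=1) = 1 / (1 + exp(-(b0 + b1 * x)))."""
    b0, b1 = 0.0, 0.0
    for _ in range(n_iter):
        p = [1.0 / (1.0 + math.exp(-(b0 + b1 * xi))) for xi in x]
        w = [pi * (1.0 - pi) for pi in p]           # IRLS weights
        # Gradient of the log-likelihood.
        g0 = sum(yi - pi for yi, pi in zip(y, p))
        g1 = sum(xi * (yi - pi) for xi, yi, pi in zip(x, y, p))
        # Negative Hessian (Fisher information), a 2x2 matrix.
        h00 = sum(w)
        h01 = sum(wi * xi for wi, xi in zip(w, x))
        h11 = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system by the explicit inverse.
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0 += d0
        b1 += d1
        if abs(d0) + abs(d1) < tol:
            break
    return b0, b1


# Hypothetical binary data: success rate 1/4 at x = 0 and 3/4 at x = 1.
x_obs = [0, 0, 0, 0, 1, 1, 1, 1]
y_obs = [0, 0, 0, 1, 0, 1, 1, 1]
b0_hat, b1_hat = fit_logistic(x_obs, y_obs)
print(round(b0_hat, 4), round(b1_hat, 4))
```

Because the predictor is binary, the fitted probabilities must reproduce the observed rates, so the intercept converges to logit(1/4) and the slope to logit(3/4) - logit(1/4).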
Over the past decade, the presence of mistletoe (Viscum album ssp. austriacum) in Scots pine stands has increased in many European countries. Understanding the factors that influence the occurrence of mistletoe in stands is key to making appropriate forest management decisions to limit damage and prevent the spread of mistletoe in the future. Therefore, the main objective of this study was to determine the probability of mistletoe occurrence in Scots pine stands in relation to stand-related endogenous factors such as age, top height, and stand density, as well as topographic and edaphic factors. We used unmanned aerial vehicle (UAV) imagery from 2,247 stands to detect mistletoe in Scots pine stands, while the majority of stand and site characteristics were calculated from airborne laser scanning (ALS) data. Information on stand age and site type from the State Forest database was also used. We found that mistletoe infestation in Scots pine stands is influenced by stand and site characteristics. The densest, tallest, and oldest stands were more susceptible to mistletoe infestation. Site type and specific microsite conditions associated with topography were also important factors driving mistletoe occurrence. In addition, climatic water balance was a significant factor in increasing the probability of mistletoe occurrence, which is important in the context of predicted temperature increases associated with climate change. Our results are important for better understanding patterns of mistletoe infestation and ecosystem functioning under climate change. In an era of climate change and technological development, the use of remote sensing methods to determine the risk of mistletoe infestation can be a very useful tool for managing forest ecosystems to maintain forest sustainability and prevent forest disturbance.
This article discusses regression analysis of failure time under the additive hazards model when the regression coefficients are time-varying. The regression coefficients are estimated locally based on the pseudo-score function [12] in a window around each time point. The proposed method can be easily implemented, and the resulting estimators are shown to be consistent and asymptotically normal with easily estimated variances. The simulation studies show that our estimation procedure is reliable and useful.
Habitat suitability index (HSI) models have been widely used to analyze the relationship between species abundance and environmental factors, and ultimately to inform management of marine species. The response of species abundance differs among environmental variables, and habitat requirements may change over life history stages and seasons. Therefore, it is necessary to determine the optimal combination of environmental variables in HSI modelling. In this study, generalized additive models (GAMs) were used to determine which environmental variables should be included in the HSI models. Significant variables were retained and weighted in the HSI model according to their relative contribution (%) to the total deviance explained by the boosted regression tree (BRT). The HSI models were applied to evaluate the habitat suitability of mantis shrimp Oratosquilla oratoria in Haizhou Bay and adjacent areas in 2011 and 2013–2017. Ontogenetic and seasonal variations in the HSI models of mantis shrimp were also examined. Among the four models (non-optimized model, BRT informed HSI model, GAM informed HSI model, and both BRT and GAM informed HSI model), the model informed by both BRT and GAM showed the best performance. Four environmental variables (bottom temperature, depth, distance offshore and sediment type) were selected in the HSI models for four groups (spring-juvenile, spring-adult, fall-juvenile and fall-adult) of mantis shrimp. The distribution of habitat suitability showed similar patterns between juveniles and adults, but obvious seasonal variations were observed. This study suggests that optimizing the environmental variables in HSI models improves their performance, and that this optimization strategy could be extended to other marine organisms to enhance the understanding of the habitat suitability of target species.
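The weighting step described above can be sketched as a weighted arithmetic mean of per-variable suitability indices, with weights proportional to each variable's relative contribution in the BRT. The arithmetic-mean form is an assumption (HSI models are also built with geometric means or other aggregations), and the suitability indices and contribution percentages below are hypothetical.

```python
def weighted_hsi(suitability, contribution):
    """Weighted arithmetic-mean HSI.

    suitability:  variable -> suitability index in [0, 1] at one site.
    contribution: variable -> relative contribution (%), e.g. from a BRT;
                  renormalized over the variables actually used.
    """
    total = sum(contribution[var] for var in suitability)
    if total <= 0:
        raise ValueError("contributions must sum to a positive value")
    return sum(
        suitability[var] * contribution[var] / total for var in suitability
    )


# Hypothetical suitability indices at one station and BRT contributions (%).
si = {
    "bottom_temperature": 0.8,
    "depth": 0.6,
    "distance_offshore": 0.5,
    "sediment_type": 1.0,
}
brt_contribution = {
    "bottom_temperature": 40.0,
    "depth": 30.0,
    "distance_offshore": 20.0,
    "sediment_type": 10.0,
}
print(round(weighted_hsi(si, brt_contribution), 2))
```

Setting all contributions equal recovers an unweighted mean of the suitability indices, which gives a simple baseline for comparison.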
The spatiotemporal distribution of the jumbo flying squid (Dosidicus gigas) and the relationship between nominal catch-per-unit-effort (CPUE) and environment were examined in offshore Peruvian waters during 2009–2013. Three typical oceanographic factors affecting the squid habitat were investigated in this research: sea surface temperature (SST), sea surface salinity (SSS) and sea surface height (SSH). We studied the CPUE-environment relationships for D. gigas using a spatially lagged version of the spatial autoregressive (SAR) model and a generalized additive model (GAM), with the latter for auxiliary and comparative purposes. The annual fishery centroids were distributed broadly in an area bounded by 79.5°–82.7°W and 11.9°–17.1°S, while the monthly fishery centroids were spatially close and lay in a smaller area bounded by 81.0°–81.2°W and 14.3°–15.4°S. Our results show that the preferred environmental ranges for D. gigas offshore Peru were 20.9°–21.9°C for SST, 35.16–35.32 for SSS and 27.2–31.5 cm for SSH in the areas bounded by 78°–80°W/82°–84°W and 15°–18°S. Monthly spatial distributions during October to December were predicted using the calibrated GAM and SAR models, and general similarities were found between the observed and predicted patterns for the nominal CPUE of D. gigas. The overall accuracies for the hotspots generated by the SAR model were much higher than those produced by the GAM for all three months. Our results contribute to a better understanding of the spatiotemporal distributions of D. gigas offshore Peru, and offer a new SAR modeling method for advancing fishery science.
This research investigates the appropriateness of the linear specification of the market model for modeling and forecasting cryptocurrency prices during the pre-COVID-19 and COVID-19 periods. Two extensions are offered for comparison with the linear specification of the market model (LMM), which allows for the measurement of the cryptocurrency price beta risk. The first is the generalized additive model, which relaxes the rigid linearity of the LMM. The second is the time-varying linearity specification of the LMM (Tv-LMM), which is based on the state space model form via the Kalman filter, allowing for the measurement of the time-varying beta risk of the cryptocurrency price. The analysis is performed using daily data from both time periods on the top 10 cryptocurrencies by adjusted market capitalization, using the Crypto Currency Index 30 (CCI30) as a market proxy and 1-day and 7-day forward predictions. Such a comparison of cryptocurrency prices has yet to be undertaken in the literature. The empirical findings favor the Tv-LMM, which outperforms the others in terms of modeling and forecasting performance. This result suggests that the relationship between each cryptocurrency price and the CCI30 index should be locally instead of globally linear, especially during the COVID-19 period.
Funding: Under the auspices of the National Natural Science Foundation of China (No. 41001363)
文摘This study aims to provide a predictive vegetation mapping approach based on the spectral data, DEM and Generalized Additive Models (GAMs). GAMs were used as a prediction tool to describe the relationship between vegetation and environmental variables, as well as spectral variables. Based on the fitted GAMs model, probability map of species occurrence was generated and then vegetation type of each grid was defined according to the probability of species occurrence. Deviance analysis was employed to test the goodness of curve fitting and drop contribution calculation was used to evaluate the contribution of each predictor in the fitted GAMs models. Area under curve (AUC) of Receiver Operating Characteristic (ROC) curve was employed to assess the results maps of probability. The results showed that: 1) AUC values of the fitted GAMs models are very high which proves that integrating spectral data and environmental variables based on the GAMs is a feasible way to map the vegetation. 2) Prediction accuracy varies with plant community, and community with dense cover is better predicted than sparse plant community. 3) Both spectral variables and environmental variables play an important role in mapping the vegetation. However, the contribution of the same predictor in the GAMs models for different plant communities is different. 4) Insufficient resolution of spectral data, environmental data and confounding effects of land use and other variables which are not closely related to the environmental conditions are the major causes of imprecision.
Abstract: Background: Measurements of tree heights and diameters are essential in forest assessment and modelling. Tree heights are used for estimating timber volume, site index and other important variables related to forest growth and yield, succession and carbon budget models. However, the diameter at breast height (dbh) can be obtained more accurately and at lower cost than total tree height. Hence, generalized height-diameter (h-d) models that predict tree height from dbh, age and other covariates are needed. For a more flexible but biologically plausible estimation of covariate effects, we use shape constrained generalized additive models as an extension of existing h-d model approaches. We use causal site parameters such as the index of aridity to enhance the generality and causality of the models and to enable predictions under projected changing climatic conditions. Methods: We develop unconstrained generalized additive models (GAM) and shape constrained generalized additive models (SCAM) for investigating the possible effects of tree-specific parameters, such as tree age and relative diameter at breast height, and site-specific parameters, such as the index of aridity and the sum of daily mean temperature during the vegetation period, on the h-d relationship of forests in Lower Saxony, Germany. Results: Some of the derived effects, e.g. the effects of age, index of aridity and sum of daily mean temperature, have significantly non-linear patterns. The need for using SCAM results from the fact that some of the model effects show partially implausible patterns, especially at the boundaries of the data ranges. The derived model predicts monotonically increasing levels of tree height with increasing age and temperature sum and with decreasing aridity and social rank of a tree within a stand. The definition of constraints leads to only a marginal or minor decline in model statistics like AIC. An observed structured spatial trend in tree height is modelled via 2-dimensional surface fitting.
Conclusions: We demonstrate that the SCAM approach allows optimal regression modelling flexibility similar to the standard GAM but with the additional possibility of defining specific constraints for the model effects. The longitudinal character of the model allows for tree height imputation for the current status of forests but also for future tree height prediction.
Funding: Project (51774219) supported by the National Natural Science Foundation of China
Abstract: This research develops a new mathematical modeling method by combining industrial big data and process mechanism analysis under the framework of generalized additive models (GAM) to generate a practical model with generalization and precision. Specifically, the proposed modeling method includes the following steps. Firstly, the influence factors are screened using mechanism knowledge and data-mining methods. Secondly, the unary GAM without interactions is constructed, which includes cleaning the data, building the sub-models, and verifying the sub-models. Subsequently, the interactions between the various factors are explored, and the binary GAM with interactions is constructed. The relationships among the sub-models are analyzed, and the integrated model is built. Finally, based on the proposed modeling method, two prediction models of mechanical property and deformation resistance for hot-rolled strips are established. Verification with actual industrial data demonstrates that the new models have good prediction precision: the mean absolute percentage errors of tensile strength, yield strength and deformation resistance are 2.54%, 3.34% and 6.53%, respectively. The experimental results suggest that the proposed method offers a new approach to industrial process modeling.
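The accuracy figures quoted in this abstract are mean absolute percentage errors (MAPE). For readers unfamiliar with the metric, a minimal sketch follows; the function name and the sample numbers are illustrative and are not taken from the paper.

```python
def mean_absolute_percentage_error(actual, predicted):
    """Return the MAPE in percent: mean of |actual - predicted| / |actual|."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


if __name__ == "__main__":
    # Hypothetical measured vs. predicted tensile strengths (MPa).
    print(mean_absolute_percentage_error([100.0, 200.0], [110.0, 190.0]))
```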
Funding: Supported by the National Natural Science Foundation of China (61273131); the 111 Project (B12018); the Innovation Project of Graduate in Jiangsu Province (CXZZ12_0741); the Fundamental Research Funds for the Central Universities (JUDCF12034)
Abstract: Fault monitoring of a bioprocess is important to ensure the safety of a reactor and maintain high product quality. It is difficult to build an accurate mechanistic model for a bioprocess, so fault monitoring based on a rich historical or online database is an effective alternative. With the bootstrap method, a data set can be resampled randomly, which improves the generalization capability of the model. In this paper, online fault monitoring that combines generalized additive models (GAMs) with the bootstrap is proposed for the glutamate fermentation process. GAMs and the bootstrap are first used to determine confidence intervals based on online and off-line normal sampled data from glutamate fermentation experiments. GAMs are then used for online fault monitoring of time, dissolved oxygen, oxygen uptake rate, and carbon dioxide evolution rate. The method can provide accurate fault alarms online and supplies useful information for removing faults and abnormal phenomena in the fermentation.
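The bootstrap step described in this abstract can be sketched as follows. This is a simplified illustration under stated assumptions: it builds a percentile interval for the mean of one monitored variable from normal-operation samples and flags values outside it, and it leaves out the GAM fitting entirely. The function names, the resample count and the sample readings are hypothetical.

```python
import random


def bootstrap_mean_interval(samples, n_resamples=2000, level=0.95, seed=0):
    """Percentile bootstrap interval for the mean of `samples`."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(samples)
    # Resample with replacement and record the mean of each resample.
    means = sorted(sum(rng.choices(samples, k=n)) / n
                   for _ in range(n_resamples))
    lo_idx = int((1.0 - level) / 2.0 * n_resamples)
    hi_idx = int((1.0 + level) / 2.0 * n_resamples) - 1
    return means[lo_idx], means[hi_idx]


def is_fault(value, interval):
    """Flag a value that falls outside the normal-operation interval."""
    low, high = interval
    return value < low or value > high


if __name__ == "__main__":
    # Hypothetical dissolved-oxygen readings from normal batches.
    normal = [4.9, 5.0, 5.1, 5.0, 4.95, 5.05, 5.0, 4.9, 5.1, 5.0]
    band = bootstrap_mean_interval(normal)
    print(band, is_fault(9.0, band))
```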
Funding: The National Natural Science Foundation of China under contract No. 41506136; the Scientific Research Foundation of Third Institute of Oceanography, SOA under contract No. 2015005
Abstract: In this study, the horizontal and vertical distribution of primary production (PP) and its monthly variations were described based on field data collected from Daya Bay from January to December 2016. The relationships between PP and environmental factors were analyzed using a generalized additive model (GAM). Significant seasonal differences were observed in the horizontal distribution of PP, while the vertical distribution showed a relatively consistent unimodal pattern. The monthly average PP (calculated as carbon) ranged from 48.03 to 390.56 mg/(m²·h), with an annual average of 182.77 mg/(m²·h). The highest PP was observed in May and the lowest in November. Additionally, the overall trend in PP was spring > summer > winter > autumn, and spring PP was approximately three times autumn PP. GAM analysis revealed that temperature, bottom salinity, phytoplankton, and photosynthetically active radiation (PAR) had no significant relationships with PP, while longitude, depth, surface salinity, chlorophyll a (Chl a) and transparency were significantly correlated with PP. Overall, the results presented herein indicate that monsoonal changes and terrestrial and offshore water systems have crucial effects on the environmental factors that are associated with PP changes.
Funding: Under the auspices of the Project of National Natural Science Foundation of China (No. 41001363); Autonomous Project of State Key Laboratory of Resources and Environmental Information System, Geo-information Tupu Theory and Virtual Geoscience
Abstract: The Yellow River Delta contains typical littoral wetland ecosystems. In order to study the relationships between Tamarix chinensis and environmental variables and to predict the potential distribution of T. chinensis in the Yellow River Delta, 641 vegetation samples and 964 soil samples were collected in the area in October of 2004, 2005, 2006 and 2007. The contents of soil organic matter, total phosphorus, salt, and soluble potassium were determined. The analyzed data were then interpolated into spatial raster data by the Kriging interpolation method. Meanwhile, the digital elevation model, soil type map and landform unit map of the Yellow River Delta were also collected. Generalized Additive Models (GAMs) were employed to build a species-environment model and then simulate the potential distribution of T. chinensis. The results indicated that the distribution of T. chinensis was mainly limited by soil salt content, total soil phosphorus content, soluble potassium content, soil type, landform unit, and elevation. The distribution probability of T. chinensis was produced with a lookup table generated by the Grasp Module (based on GAMs) in the software ArcView GIS 3.2. The AUC (Area Under Curve) values of the ROC (Receiver Operating Characteristic) curve for validation and cross-validation were both higher than 0.8, which suggested that the established model had high precision for predicting species distribution.
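The AUC used to validate the model above has a simple rank interpretation: it is the probability that a randomly chosen presence site receives a higher predicted probability than a randomly chosen absence site. A minimal sketch of that calculation, with hypothetical scores, might look like this:

```python
def auc_from_scores(presence_scores, absence_scores):
    """AUC as the fraction of presence/absence pairs ranked correctly.

    Ties count as half a correct ranking (the usual convention).
    """
    if not presence_scores or not absence_scores:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for p in presence_scores:
        for a in absence_scores:
            if p > a:
                wins += 1.0
            elif p == a:
                wins += 0.5
    return wins / (len(presence_scores) * len(absence_scores))


if __name__ == "__main__":
    # Hypothetical predicted occurrence probabilities.
    print(auc_from_scores([0.9, 0.8, 0.4], [0.5, 0.3, 0.2]))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is why values above 0.8 are read as good discrimination.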
Abstract: The component additive modelling approach is based on summing the results from models already calibrated with pure mineral phases. The summation can be the sum of results from thermodynamic surface speciation models or the sum of pseudo-thermodynamic models for adsorption on individual mineral phases. Static batch sorption experiments with 63Ni were carried out on different granitic rocks and component minerals. XRD analyses were used to calculate the percentage mineralogical composition of the granitic rocks. Sorption data were modelled using non-electrostatic correction models to obtain Rd for the granitic rocks and minerals. Rd values for the granitic rocks predicted from the component additive model were compared with experimental values. The predicted Rd values for granite adamellite, biotite granite and rapakivi granite were identical to the experimentally determined values, whereas for graphic granite and grey granite the predicted and experimentally determined Rd values differed considerably. The results also showed that feldspar made the greatest contribution to the bulk Rd, while quartz made the smallest.
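The summation at the heart of the component additive approach can be written in a few lines. The sketch below assumes the bulk Rd is the composition-weighted sum of the single-mineral Rd values, with weights taken as mass fractions; this weighting and all numbers are assumptions for illustration, not values from the study.

```python
def component_additive_rd(mass_fractions, mineral_rd):
    """Predict the bulk Rd as sum(fraction_i * Rd_i) over component minerals."""
    total = sum(mass_fractions.values())
    if abs(total - 1.0) > 1e-6:
        raise ValueError("mass fractions must sum to 1")
    return sum(fraction * mineral_rd[mineral]
               for mineral, fraction in mass_fractions.items())


if __name__ == "__main__":
    # Hypothetical composition and single-mineral Rd values.
    fractions = {"quartz": 0.3, "feldspar": 0.6, "biotite": 0.1}
    rd = {"quartz": 1.0, "feldspar": 10.0, "biotite": 50.0}
    print(component_additive_rd(fractions, rd))
```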
Funding: The National Natural Science Foundation of China (Grant No. 12071166); the Fundamental Research Funds for the Central Universities of China (Nos. 2662023LXPY005, 2662022XXYJ005); HZAU-AGIS Cooperation Fund (No. SZYJY2023010).
Abstract: Interpretability has drawn increasing attention in machine learning. Most works focus on post-hoc explanations rather than building a self-explaining model. We therefore propose a Neural Partially Linear Additive Model (NPLAM), which automatically distinguishes insignificant, linear, and nonlinear features in neural networks. On the one hand, the neural network construction fits data better than spline functions with the same number of parameters; on the other hand, the learnable gate design and the sparsity regularization term maintain the ability of feature selection and structure discovery. We theoretically establish the generalization error bounds of the proposed method with Rademacher complexity. Experiments on both simulated and real-world datasets verify its good performance and interpretability.
Funding: Supported by the National Basic Research and Development (973) Program of China (2010CB428402); China Meteorological Administration Special Public Welfare Research Fund (GYHY200706001)
Abstract: A probabilistic precipitation forecasting model using generalized additive models (GAMs) and Bayesian model averaging (BMA) is proposed in this paper. GAMs were used to fit spatial-temporal precipitation models to individual ensemble member forecasts. The distributions of precipitation occurrence and cumulative precipitation amount were represented simultaneously by a single Tweedie distribution. BMA was then used as a post-processing method to combine the individual models into a more skillful probabilistic forecasting model. The mixing weights were estimated using the expectation-maximization algorithm. Residual diagnostics were used to examine whether the fitted BMA forecasting model had fully captured the spatial and temporal variations of precipitation. The proposed method was applied to daily observations in the Yishusi River basin for July 2007 using the National Centers for Environmental Prediction ensemble forecasts. Verified with scoring rules, the BMA forecasts performed better than the empirical probabilistic ensemble forecasts, particularly for extreme precipitation. Finally, possible improvements and the application of this method to the downscaling of climate change scenarios are discussed.
Funding: Supported by National Natural Science Foundation of China (Grant Nos. 11971324 and 11471223); Interdisciplinary Construction of Bioinformatics and Statistics; Academy for Multidisciplinary Studies, Capital Normal University
Abstract: In this paper, we study how to estimate the error density in the ultrahigh dimensional sparse additive model, where the number of variables is larger than the sample size. First, a smoothing method based on B-splines is applied to the estimation of the regression functions. Second, an improved two-stage refitted cross-validation (RCV) procedure with a random splitting technique is used to obtain the residuals of the model, and the residual-based kernel method is then applied to estimate the error density function. Under suitable sparsity conditions, the large sample properties of the estimator, including weak and strong consistency, asymptotic normality and the law of the iterated logarithm, are obtained. In particular, the relationship between the sparsity and the convergence rate of the kernel density estimator is given. The methodology is illustrated by simulations and a real data example, which suggest that the proposed method performs well.
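The final step of the procedure above, estimating the error density from residuals with a kernel method, can be sketched as follows. The example assumes a Gaussian kernel and a hand-picked bandwidth, neither of which is specified in the abstract, and it takes the residuals as given rather than reproducing the B-spline fit or the RCV procedure.

```python
import math


def residual_kernel_density(residuals, bandwidth):
    """Return a Gaussian kernel density estimate built from model residuals."""
    if not residuals or bandwidth <= 0:
        raise ValueError("need at least one residual and a positive bandwidth")
    norm = 1.0 / (len(residuals) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        # Average of Gaussian bumps centred on each residual.
        return norm * sum(math.exp(-0.5 * ((x - r) / bandwidth) ** 2)
                          for r in residuals)

    return density


if __name__ == "__main__":
    # Hypothetical residuals from a fitted additive model.
    f_hat = residual_kernel_density([-1.0, -0.5, 0.0, 0.5, 1.0], bandwidth=0.5)
    print(f_hat(0.0), f_hat(3.0))
```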
Funding: Supported by the Norwegian Institute of Bioeconomy Research (NIBIO)
Abstract: Background: Generalized height-diameter curves based on a re-parameterized version of the Korf function for Norway spruce (Picea abies (L.) Karst.), Scots pine (Pinus sylvestris L.) and silver birch (Betula pendula Roth) in Norway are presented. The Norwegian National Forest Inventory (NFI) is used as the database for estimating the model parameters. The derived models are developed to enable spatially explicit and site-sensitive tree height imputation in forest inventories as well as future tree height predictions in growth and yield scenario simulations. Methods: Generalized additive mixed models (gamm) are employed to detect and quantify potentially non-linear effects of predictor variables. In doing so, the quadratic mean diameter serves as a longitudinal covariate, since stand age, as measured in the NFI, shows only a weak correlation with a stand's developmental status in Norwegian forests. Additionally, the models can be locally calibrated by predicting random effects if measured height-diameter pairs are available. Based on the model selection of non-constrained models, shape constrained additive models (scam) were fit to incorporate expert knowledge and intrinsic relationships by enforcing certain effect patterns like monotonicity. Results: Model comparisons demonstrate that the shape constraints lead to only marginal differences in statistical characteristics but ensure reasonable model predictions. Under constant constraints the developed models predict increasing tree heights with decreasing altitude, increasing soil depth and increasing competition pressure of a tree. A two-dimensional spatially structured effect of UTM coordinates accounts for the potential effects of large-scale spatially correlated covariates, which were not at our disposal. The main result of modelling the spatially structured effect is lower tree height prediction for coastal sites and with increasing latitude.
The quadratic mean diameter affects both the level and the slope of the height-diameter curve, and both effects are positive. Conclusions: In this investigation it is assumed that model effects in additive modelling of height-diameter curves which are unfeasible and too wiggly from an expert point of view are a result of quantitatively or qualitatively limited databases. However, this problem is not specific to our investigation but more general, since growth and yield data that are balanced over the whole data range with respect to all combinations of predictor variables are exceptional cases. Hence, scam may provide methodological improvements in several applications by combining the flexibility of additive models with expert knowledge.
Funding: Supported by National Natural Science Foundation of China (51774219); Science and Technology Research Program of Hubei Ministry of Education (D20161103); Youth Science and Technology Program of Wuhan (2016070204010099)
Abstract: A model of deformation resistance during hot strip rolling was established based on the generalized additive model. Firstly, a data modeling method based on the generalized additive model was given. It included the selection of the dependent and independent variables of the model, the link function of the dependent variable and the smoothing functional form of each independent variable, the estimation of the link function and smooth functions, and the final model modification. Then, a practical modeling test was carried out based on a large amount of hot rolling process data. An integrated variable was proposed to reflect the effects of different chemical components such as carbon, silicon, manganese, nickel, chromium and niobium. The integrated chemical composition, strain, strain rate and rolling temperature were selected as independent variables, with the cubic spline as their smooth function. The modeling of deformation resistance was carried out in SAS software, and the influence curves of the independent variables on deformation resistance were obtained by the local scoring algorithm. Some interesting phenomena were found; for example, there is a critical value of strain rate, and the deformation resistance increases before this value and then decreases. The results confirm that the new model has higher prediction accuracy than traditional ones and is suitable for carbon steel, microalloyed steel, alloyed steel and other steel grades.
Funding: Supported by the National Natural Science Foundation of China under Grant No. 11771250; the Natural Science Foundation of Shandong Province under Grant No. ZR2019MA002; the Program for Scientific Research Innovation of Graduate Dissertation under Grant No. LWCXB201803
Abstract: This paper considers partially linear additive models with a diverging number of parameters when some linear constraints on the parametric part are available. It proposes a constrained profile least-squares estimation for the parametric components, with the nonparametric functions being estimated by basis function approximations. The consistency and asymptotic normality of the restricted estimator are established under certain conditions. The authors construct a profile likelihood ratio test statistic to test the validity of the linear constraints on the parametric components, and demonstrate that it asymptotically follows a chi-squared distribution under the null and alternative hypotheses. The finite sample performance of the proposed method is illustrated by simulation studies and a data analysis.
Funding: Supported by National Natural Science Foundation of China (Grant Nos. 11171331, 11561006, 11331011); Program for Creative Research Group of National Natural Science Foundation of China (Grant No. 61621003); a Grant from the Key Lab of Random Complex Structure and Data Science, Chinese Academy of Sciences; the Natural Science Foundation of Shenzhen University; Research Projects of Colleges and Universities in Guangxi (Grant No. KY2015YB171); Innovation Project of Guangxi Graduate Education (Grant No. JGY2015122); a Grant from the Key Base of Humanities and Social Sciences in Guangxi College
Abstract: We propose a method which uses functional singular components to establish functional additive models. The proposed methodology reduces the curve regression problem to ordinary (i.e., scalar) additive regression problems on the singular components of the predictor process and the response process. Consistency of the estimators for the nonparametric function and for the prediction is proved. A simulation study is conducted to investigate the finite sample performance of the proposed estimators.
Abstract: In dealing with nonparametric regression, the GAM procedure is the most versatile of several new procedures. The methodology behind this procedure is more flexible than traditional parametric modeling tools. It relaxes the usual assumptions of the parametric model and makes it possible to uncover structure in the relationship between the independent variables and an exponential-family dependent variable that may not be obvious otherwise. In this paper, we discuss two methods of fitting the generalized additive logistic regression model, one based on the Newton-Raphson method and another based on the iterative weighted least squares method for first and second order Taylor series expansions. The GAM procedure with the specified set of weights, using the local scoring algorithm, was applied to real-life data sets. The cubic spline smoother is applied to the independent variables. Based on nonparametric regression and smoothing techniques, this procedure provides powerful tools for data analysis.
Funding: Funded by National Science Centre, Poland under the project "Assessment of the impact of weather conditions on forest health status and forest disturbances at regional and national scale based on the integration of ground and space-based remote sensing datasets" (project no. 2021/41/B/ST10/). Data collection and research was also supported by the project no. EZ.271.3.19.2021 "Modele ryzyka zamierania drzewostanow glownych gatunkow lasotworczych Polski" funded by the General Directorate of State Forests in Poland.
Abstract: Over the past decade, the presence of mistletoe (Viscum album ssp. austriacum) in Scots pine stands has increased in many European countries. Understanding the factors that influence the occurrence of mistletoe in stands is key to making appropriate forest management decisions to limit damage and prevent the spread of mistletoe in the future. Therefore, the main objective of this study was to determine the probability of mistletoe occurrence in Scots pine stands in relation to stand-related endogenous factors such as age, top height, and stand density, as well as topographic and edaphic factors. We used unmanned aerial vehicle (UAV) imagery from 2,247 stands to detect mistletoe in Scots pine stands, while the majority of stand and site characteristics were calculated from airborne laser scanning (ALS) data. Information on stand age and site type from the State Forest database was also used. We found that mistletoe infestation in Scots pine stands is influenced by stand and site characteristics. We documented that the densest, tallest, and oldest stands were more susceptible to mistletoe infestation. Site type and specific microsite conditions associated with topography were also important factors driving mistletoe occurrence. In addition, climatic water balance was a significant factor in increasing the probability of mistletoe occurrence, which is important in the context of predicted temperature increases associated with climate change. Our results are important for better understanding patterns of mistletoe infestation and ecosystem functioning under climate change. In an era of climate change and technological development, the use of remote sensing methods to determine the risk of mistletoe infestation can be a very useful tool for managing forest ecosystems to maintain forest sustainability and prevent forest disturbance.
Funding: Supported by the Fundamental Research Funds for the Central Universities (QN0914)
Abstract: This article discusses regression analysis of failure time under the additive hazards model when the regression coefficients are time-varying. The regression coefficients are estimated locally based on the pseudo-score function [12] in a window around each time point. The proposed method can be easily implemented, and the resulting estimators are shown to be consistent and asymptotically normal with easily estimated variances. The simulation studies show that our estimation procedure is reliable and useful.
Funding: The National Key R&D Program of China under contract No. 2017YFE0104400; the National Natural Science Foundation of China under contract No. 31772852; the Marine S&T Fund of Shandong Province for Pilot National Laboratory for Marine Science and Technology (Qingdao) under contract No. 2018SDKJ0501-2.
Abstract: Habitat suitability index (HSI) models have been widely used to analyze the relationship between species abundance and environmental factors, and ultimately to inform the management of marine species. The response of species abundance to each environmental variable is different, and habitat requirements may change over life history stages and seasons. Therefore, it is necessary to determine the optimal combination of environmental variables in HSI modelling. In this study, generalized additive models (GAMs) were used to determine which environmental variables should be included in the HSI models. Significant variables were retained and weighted in the HSI model according to their relative contribution (%) to the total deviation explained by the boosted regression tree (BRT). The HSI models were applied to evaluate the habitat suitability of mantis shrimp Oratosquilla oratoria in the Haizhou Bay and adjacent areas in 2011 and 2013–2017. Ontogenetic and seasonal variations in the HSI models of mantis shrimp were also examined. Among the four models (non-optimized model, BRT informed HSI model, GAM informed HSI model, and both BRT and GAM informed HSI model), the both BRT and GAM informed HSI model showed the best performance. Four environmental variables (bottom temperature, depth, distance offshore and sediment type) were selected in the HSI models for four groups (spring-juvenile, spring-adult, fall-juvenile and fall-adult) of mantis shrimp. The distribution of habitat suitability showed similar patterns between juveniles and adults, but obvious seasonal variations were observed. This study suggests that optimizing the environmental variables in HSI models improves their performance, and this optimization strategy could be extended to other marine organisms to enhance the understanding of the habitat suitability of target species.
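The weighting step in this abstract, where each retained variable's suitability index is weighted by its relative contribution, can be sketched as a weighted arithmetic mean. The arithmetic-mean form, the function name and all numbers below are assumptions for illustration; the abstract does not state which aggregation formula was used.

```python
def weighted_hsi(suitability, contributions):
    """Combine per-variable suitability indices (0 to 1) into one HSI value.

    `contributions` holds each variable's relative contribution (for example
    in percent); the weights are normalised so they need not sum to 100.
    """
    total = sum(contributions.values())
    if total <= 0:
        raise ValueError("contributions must sum to a positive value")
    return sum(contributions[name] * suitability[name]
               for name in contributions) / total


if __name__ == "__main__":
    # Hypothetical suitability indices and relative contributions (%).
    si = {"bottom_temperature": 0.8, "depth": 0.5}
    weights = {"bottom_temperature": 60.0, "depth": 40.0}
    print(weighted_hsi(si, weights))
```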
Funding: Supported by the National Natural Science Foundation of China (Nos. 41406146, 41476129); the Natural Science Foundation of Shanghai Municipality (No. 13ZR1419300); the Shanghai Universities First-Class Disciplines Project-Fisheries (A)
Abstract: The spatiotemporal distribution of, and the relationship between, nominal catch-per-unit-effort (CPUE) and the environment for the jumbo flying squid (Dosidicus gigas) were examined in offshore Peruvian waters during 2009–2013. Three typical oceanographic factors affecting the squid habitat were investigated in this research: sea surface temperature (SST), sea surface salinity (SSS) and sea surface height (SSH). We studied the CPUE-environment relationships for D. gigas using a spatially-lagged version of the spatial autoregressive (SAR) model and a generalized additive model (GAM), with the latter used for auxiliary and comparative purposes. The annual fishery centroids were distributed broadly in an area bounded by 79.5°–82.7°W and 11.9°–17.1°S, while the monthly fishery centroids were spatially close and lay in a smaller area bounded by 81.0°–81.2°W and 14.3°–15.4°S. Our results show that the preferred environmental ranges for D. gigas offshore Peru were 20.9°–21.9°C for SST, 35.16–35.32 for SSS and 27.2–31.5 cm for SSH in the areas bounded by 78°–80°W/82°–84°W and 15°–18°S. Monthly spatial distributions during October to December were predicted using the calibrated GAM and SAR models, and general similarities were found between the observed and predicted patterns for the nominal CPUE of D. gigas. The overall accuracies for the hotspots generated by the SAR model were much higher than those produced by the GAM for all three months. Our results contribute to a better understanding of the spatiotemporal distributions of D. gigas offshore Peru, and offer a new SAR modeling method for advancing fishery science.