Abstract: The importance of efficient water quality monitoring, and of government agencies ensuring the safety of drinking water in areas where the resource is constantly depleted by anthropogenic or natural factors, cannot be overemphasized. This holds for West Texas, and for Midland and Odessa in particular. Two machine learning regression algorithms (Random Forest and XGBoost) were employed to develop models for predicting total dissolved solids (TDS) and sodium adsorption ratio (SAR), to support efficient water quality monitoring of two vital aquifers: the Edwards-Trinity (Plateau) and Ogallala aquifers. These two aquifers have contributed immensely to providing water for domestic, agricultural, industrial, and other uses. The data were obtained from the Texas Water Development Board (TWDB). The XGBoost and Random Forest models predicted the observed TDS and SAR data accurately for both the Edwards-Trinity (Plateau) and Ogallala aquifers, with R² values consistently greater than 0.83. The Random Forest model predicted TDS and SAR better, with an average R, MAE, RMSE, and MSE of 0.977, 0.015, 0.029, and 0.00, respectively. XGBoost achieved an average R, MAE, RMSE, and MSE of 0.953, 0.016, 0.037, and 0.00, respectively. The study indicates that Random Forest and XGBoost are appropriate for water quality prediction and monitoring in areas of high hydrocarbon activity such as Midland and Odessa, and West Texas at large.
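The error measures named in this abstract are related arithmetically: RMSE is the square root of MSE, so the reported RMSE values of 0.029 and 0.037 correspond to MSE values of about 0.00084 and 0.00137, which both round to 0.00 at two decimals. The following is a minimal sketch of the standard definitions in plain Python. The function name `regression_metrics` and the sample values are hypothetical and are not taken from the study.

```python
import math


def regression_metrics(observed, predicted):
    """Return MAE, MSE, RMSE and R^2 for paired observed/predicted values."""
    n = len(observed)
    if n == 0 or n != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [o - p for o, p in zip(observed, predicted)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    ss_res = sum(e * e for e in errors)
    # R^2 is undefined when every observed value is identical (ss_tot == 0).
    r2 = 1.0 - ss_res / ss_tot
    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "R2": r2}


# Hypothetical values for illustration only, NOT data from the study.
observed = [0.10, 0.25, 0.40, 0.55, 0.70]
predicted = [0.12, 0.22, 0.41, 0.58, 0.69]
metrics = regression_metrics(observed, predicted)
print(metrics)
```

Because MSE squares errors that are already well below 1, it is the least informative of the four figures at two-decimal precision; RMSE and MAE stay on the scale of the predicted quantity.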
Abstract: The Sentinel-2 satellites provide an unparalleled wealth of high-resolution remotely sensed information with a short revisit cycle, which is ideal for mapping burned areas accurately and in a timely manner. This paper proposes an automated methodology for mapping burn scars using pairs of Sentinel-2 images, exploiting the state-of-the-art eXtreme Gradient Boosting (XGB) machine learning framework. A large database of 64 reference wildfire perimeters in Greece from 2016 to 2019 is used to train the classifier. An empirical methodology for sampling the training patterns from this database is formulated, which ensures both the effectiveness of the approach and its computational efficiency. A difference (pre-fire minus post-fire) spectral index is used for this purpose, on which the clear and fuzzy value ranges are identified. To reduce the data volume, a super-pixel segmentation of the images is also employed, implemented via the QuickShift algorithm. The cross-validation results demonstrate the effectiveness of the proposed algorithm, with average commission and omission errors of 9% and 2%, respectively, and an average Matthews correlation coefficient (MCC) of 0.93.
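The three accuracy figures quoted above can all be derived from a binary confusion matrix over burned and unburned samples. The sketch below uses the usual remote-sensing definitions (commission error as the share of predicted-burned samples that are actually unburned, omission error as the share of truly burned samples that were missed). The function name `burned_area_scores` and the counts are hypothetical, chosen only for illustration, and are not the paper's data.

```python
import math


def burned_area_scores(tp, fp, fn, tn):
    """Commission error, omission error and MCC from confusion-matrix counts.

    tp: burned samples classified as burned
    fp: unburned samples classified as burned
    fn: burned samples classified as unburned
    tn: unburned samples classified as unburned
    """
    commission = fp / (tp + fp) if (tp + fp) else 0.0
    omission = fn / (tp + fn) if (tp + fn) else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally set to 0 when any marginal total is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return commission, omission, mcc


# Hypothetical counts for illustration only.
commission, omission, mcc = burned_area_scores(tp=910, fp=90, fn=20, tn=8980)
print(f"commission={commission:.3f} omission={omission:.3f} MCC={mcc:.3f}")
```

MCC uses all four cells of the matrix, which makes it more robust than overall accuracy when unburned samples greatly outnumber burned ones.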
Funding: Financial support from the High-end Foreign Expert Introduction program (No. G20190022002), the Chongqing Construction Science and Technology Plan Project (2019-0045), and the Chongqing Engineering Research Center of Disaster Prevention & Control for Banks and Structures in Three Gorges Reservoir Area (Nos. SXAPGC18ZD01 and SXAPGC18YB03).
Abstract: Accurate assessment of the undrained shear strength (USS) of soft sensitive clays is a major concern in geotechnical engineering practice. This study applies two data-driven ensemble learning methods, extreme gradient boosting (XGBoost) and random forest (RF), to capture the relationships between the USS and various basic soil parameters. Based on soil data sets from the TC304 database, a general approach is developed to predict the USS of soft clays using these two methods, with five feature variables: preconsolidation stress (PS), vertical effective stress (VES), liquid limit (LL), plastic limit (PL), and natural water content (W). To reduce the dependence on rules of thumb and inefficient brute-force search, Bayesian optimization is applied to determine appropriate hyper-parameters for both XGBoost and RF. The developed models are compared with three other machine learning methods and two transformation models with respect to predictive accuracy and robustness under 5-fold cross-validation (CV). The XGBoost-based and RF-based methods outperform these approaches. In addition, the XGBoost-based model provides feature importance ranks, which enhances the interpretability of the model and makes it a promising tool for predicting geotechnical parameters.
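The 5-fold cross-validation mentioned above partitions the samples into five disjoint folds, and each fold serves once as the held-out test set while the other four are used for training. A minimal sketch of that partitioning in plain Python follows. The helper name `k_fold_indices`, the seed, and the sample count of 23 are assumptions for illustration; the abstract does not state how the folds were drawn.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # shuffle once, reproducibly
    folds = [indices[i::k] for i in range(k)]  # k nearly equal-sized folds
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train_idx, test_idx


# Hypothetical sample count for illustration only.
splits = list(k_fold_indices(n_samples=23, k=5))
for train_idx, test_idx in splits:
    print(len(train_idx), len(test_idx))
```

A model would be fitted on `train_idx` and scored on `test_idx` in each round, and the five scores averaged to compare methods.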
Funding: This study was jointly supported by the National Natural Science Foundation of China (Nos. 51879196, 51790533, 51709143) and the Jiangxi Natural Science Foundation of China (No. 20181BAB206045).
Abstract: Regional water resources management benefits from knowing agricultural water consumption several months in advance. Forecasting reference evapotranspiration (ET₀) for the next few months is important for irrigation and reservoir management. Studies on forecasting ET₀ multiple months ahead using machine learning models have not been reported yet. Moreover, machine learning models such as XGBoost have multiple parameters that need to be tuned, and traditional tuning methods can get stuck in a local optimum and fail to reach the global optimum. This study investigated the performance of a hybrid extreme gradient boosting (XGBoost) model coupled with the Grey Wolf Optimizer (GWO) algorithm for forecasting multi-step ahead ET₀ (1-3 months ahead), compared with three conventional machine learning models, i.e., standalone XGBoost (XGB), multi-layer perceptron (MLP), and M5 model tree (M5), in the subtropical zone of China. The results showed that the GWO-XGB model generally performed better than the other three models in forecasting ET₀ 1-3 months ahead, followed by the XGB, M5, and MLP models, with very small differences among those three. The GWO-XGB model performed best in autumn, while the MLP model performed slightly better than the other three models in summer. It is thus suggested to apply the MLP model for ET₀ forecasting in summer and the GWO-XGB model in other seasons.
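The Grey Wolf Optimizer referred to above is a population-based search in which candidate solutions move toward the three best candidates found so far (conventionally called alpha, beta, and delta), with a control parameter that shrinks over time to shift from exploration to exploitation. The following is a minimal sketch of the commonly published form of GWO, not the paper's implementation. In the hybrid model the objective would be a validation error of XGBoost as a function of its hyper-parameters; here a toy quadratic with a known minimum stands in for it, and the function names, bounds, and population settings are all assumptions.

```python
import random


def grey_wolf_optimize(objective, bounds, n_wolves=30, n_iter=200, seed=0):
    """Minimise `objective` over box `bounds` with a basic Grey Wolf Optimizer."""
    rng = random.Random(seed)
    dim = len(bounds)
    wolves = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_wolves)]
    best_pos, best_val = None, float("inf")
    for t in range(n_iter):
        ranked = sorted(wolves, key=objective)
        leaders = [list(w) for w in ranked[:3]]  # copies of alpha, beta, delta
        leader_val = objective(leaders[0])
        if leader_val < best_val:  # remember the best solution seen so far
            best_pos, best_val = list(leaders[0]), leader_val
        a = 2.0 * (1.0 - t / n_iter)  # decreases linearly from 2 towards 0
        for w in wolves:
            for d in range(dim):
                lo, hi = bounds[d]
                pulls = []
                for leader in leaders:
                    A = 2.0 * a * rng.random() - a
                    C = 2.0 * rng.random()
                    dist = abs(C * leader[d] - w[d])
                    pulls.append(leader[d] - A * dist)
                # New coordinate: mean of the three pulls, clipped to the box.
                w[d] = min(max(sum(pulls) / 3.0, lo), hi)
    final = min(wolves, key=objective)
    if objective(final) < best_val:
        best_pos, best_val = list(final), objective(final)
    return best_pos, best_val


def toy_error(params):
    """Hypothetical stand-in for a validation error; minimum 0 at (1, -2)."""
    x, y = params
    return (x - 1.0) ** 2 + (y + 2.0) ** 2


best_pos, best_val = grey_wolf_optimize(toy_error, bounds=[(-5.0, 5.0), (-5.0, 5.0)])
print(best_pos, best_val)
```

Each objective evaluation in a real tuning run means training and validating a model, so population size and iteration count trade search quality against compute cost.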
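Abstract: To improve the credit-risk prediction accuracy of the extreme gradient boosting (XGBoost) ensemble algorithm, an improved sparrow search algorithm based on golden sine search, Cauchy mutation, and opposition-based learning (GCOSSA) is proposed to optimize the XGBoost parameters. A golden sine search strategy is used to update the positions of the discoverers, which strengthens both global and local search ability. Opposition-based learning and Cauchy mutation are introduced as perturbations to widen the search region and reduce entrapment in local optima, while a greedy rule is used to determine the optimal solution. The improved algorithm is tested on six benchmark functions and compared with SSA to evaluate the optimization performance of GCOSSA. GCOSSA is then used to optimize the XGBoost parameters; the approach is tested on a dataset and compared with grid search, SSA, and an improved sparrow search algorithm based on sine and cosine (ISSA). The results show that XGBoost with parameters optimized by the improved GCOSSA achieves higher accuracy in credit-risk prediction.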
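The perturbation step described in this abstract combines opposition-based learning, Cauchy mutation, and a greedy acceptance rule. The abstract does not give the update formulas, so the sketch below uses common textbook forms as assumptions: the opposite point of x in the box [lb, ub] is lb + ub - x, the mutation scales each coordinate by one plus a standard Cauchy sample, and a challenger replaces the current best only if it has a lower objective value. All function names, bounds, and the test objective are hypothetical.

```python
import math
import random


def opposition(x, bounds):
    """Opposite point of x inside the box: lb + ub - x per coordinate."""
    return [lo + hi - xi for xi, (lo, hi) in zip(x, bounds)]


def cauchy_mutation(x, bounds, rng):
    """Scale each coordinate by (1 + Cauchy(0, 1)) and clip to the box."""
    mutated = []
    for xi, (lo, hi) in zip(x, bounds):
        step = math.tan(math.pi * (rng.random() - 0.5))  # standard Cauchy sample
        mutated.append(min(max(xi + xi * step, lo), hi))
    return mutated


def perturb_best(best, objective, bounds, rng):
    """Greedy rule: keep the perturbed point only if it improves the objective."""
    candidates = [opposition(best, bounds), cauchy_mutation(best, bounds, rng)]
    challenger = min(candidates, key=objective)
    return challenger if objective(challenger) < objective(best) else best


def sphere(x):
    """Simple benchmark-style objective with its minimum at the origin."""
    return sum(v * v for v in x)


# Hypothetical search box and starting point for illustration only.
bounds = [(-2.0, 10.0), (-10.0, 2.0)]
start = [9.0, -9.0]
improved = perturb_best(start, sphere, bounds, random.Random(0))
print(improved, sphere(improved))
```

The heavy tails of the Cauchy distribution occasionally produce large jumps, which is what lets the mutation escape a local optimum, and the greedy rule ensures a perturbation never makes the retained solution worse.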