Abstract: Efficient water quality monitoring by government agencies is essential for keeping drinking water safe in areas where the resource is steadily depleted by anthropogenic or natural factors. This is the case in West Texas, and in Midland and Odessa in particular. Two machine learning regression algorithms, Random Forest and XGBoost, were used to develop models that predict total dissolved solids (TDS) and sodium adsorption ratio (SAR) for two vital aquifers: the Edwards-Trinity (Plateau) and the Ogallala. Both aquifers supply water for domestic, agricultural, and industrial uses. The data were obtained from the Texas Water Development Board (TWDB). Both models predicted the observed TDS and SAR values accurately for both aquifers, with R² values consistently greater than 0.83. The Random Forest model performed better, with an average R, MAE, RMSE, and MSE of 0.977, 0.015, 0.029, and 0.00, respectively. XGBoost achieved an average R, MAE, RMSE, and MSE of 0.953, 0.016, 0.037, and 0.00, respectively. These results indicate that Random Forest and XGBoost are appropriate for water quality prediction and monitoring in areas of intense hydrocarbon activity such as Midland and Odessa, and West Texas at large.
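The abstract reports R, R², MAE, RMSE, and MSE without defining them. The sketch below shows how these scores are conventionally computed from paired observed and predicted values. It is a minimal illustration, not the study's evaluation code: the function name `regression_metrics` and the sample numbers are assumptions made up for this example, and R is taken here to be the Pearson correlation coefficient, which the abstract does not state.

```python
import math


def regression_metrics(observed, predicted):
    """Return R (Pearson), R2, MAE, MSE and RMSE for paired values."""
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    n = len(observed)

    # Error-based scores.
    errors = [p - o for o, p in zip(observed, predicted)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)

    # Variance-based scores.
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    ss_obs = sum((o - mean_o) ** 2 for o in observed)
    ss_pred = sum((p - mean_p) ** 2 for p in predicted)
    if ss_obs == 0 or ss_pred == 0:
        raise ValueError("R and R2 are undefined for constant sequences")
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))

    r = cov / math.sqrt(ss_obs * ss_pred)   # Pearson correlation
    r2 = 1.0 - (mse * n) / ss_obs           # coefficient of determination
    return {"R": r, "R2": r2, "MAE": mae, "MSE": mse, "RMSE": rmse}


# Made-up values for illustration only (not data from the study).
observed = [0.10, 0.25, 0.40, 0.55, 0.80]
predicted = [0.12, 0.22, 0.43, 0.53, 0.78]
metrics = regression_metrics(observed, predicted)
```

Because RMSE is the square root of MSE, the reported RMSE values of 0.029 and 0.037 correspond to MSE values of roughly 0.0008 and 0.0014, which is consistent with the MSE of 0.00 reported to two decimal places.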
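Abstract: As China pushes forward supply-side structural reform in the energy sector, installed renewable capacity keeps rising and competition in the electricity market intensifies. At the same time, a complex and volatile global coal market has raised costs for coal-fired power generators. The calorific value of coal is one of the main criteria for assessing coal quality and the most important basis for coal procurement, so predicting it accurately helps control a plant's operating and procurement costs. To predict calorific value efficiently, the Pearson coefficient was used to select features among the relevant variables, and the Density-Based Spatial Clustering of Applications with Noise (DBSCAN) algorithm was used to denoise 1733 laboratory assay records collected over nearly two years at a power plant's own coal yard. Spectral Clustering (SC) was then applied to the denoised data. A separate prediction model was built for each clustered sub-sample set with the Extreme Gradient Boosting (XGBoost) algorithm, and performance was compared with Ordinary Least Squares (OLS) regression and Support Vector Machine (SVM) models. The results show that the XGBoost-based model predicts the calorific value of power-plant coal markedly more accurately than the other algorithms and generalizes better. Building separate models for the coal classes produced by SC further refines the predictions, providing a reliable and efficient method for predicting calorific value at coal-fired power stations.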
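The first step of the pipeline described above, Pearson-based feature selection, can be sketched as follows. This is a minimal illustration under stated assumptions: the abstract does not list the assay variables or the selection threshold, so the column names (`ash_pct`, `moisture_pct`, `random_tag`), the sample values, and the 0.5 cutoff are all hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        return 0.0  # a constant column carries no linear information
    return sxy / math.sqrt(sxx * syy)


def select_features(columns, target, threshold=0.5):
    """Keep columns whose absolute correlation with the target meets the threshold."""
    scores = {name: pearson(values, target) for name, values in columns.items()}
    selected = [name for name, r in scores.items() if abs(r) >= threshold]
    return selected, scores


# Hypothetical assay columns and made-up values, for illustration only.
columns = {
    "ash_pct": [10, 15, 20, 25, 30],
    "moisture_pct": [8, 9, 12, 11, 14],
    "random_tag": [3, 1, 2, 1, 3],
}
calorific_value = [26.0, 24.1, 22.3, 20.2, 18.1]

selected, scores = select_features(columns, calorific_value)
```

In a fuller implementation, the later stages might use `sklearn.cluster.DBSCAN` for denoising, `sklearn.cluster.SpectralClustering` for grouping the samples, and one `xgboost.XGBRegressor` per cluster. Whether the authors used these particular libraries is not stated in the abstract.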
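Abstract: To improve the credit risk prediction accuracy of the extreme gradient boosting (XGBoost) ensemble algorithm, an improved sparrow search algorithm based on golden sine search, Cauchy mutation, and opposition-based learning (GCOSSA) is proposed to optimize XGBoost parameters. A golden sine search strategy updates the positions of the discoverers, strengthening both global and local search. Opposition-based learning and Cauchy mutation are introduced as perturbations to widen the search region and reduce the tendency to become trapped in local optima, and a greedy rule determines the best solution. The improved algorithm is tested on 6 benchmark functions and compared with SSA to evaluate its optimization performance. GCOSSA is then used to optimize XGBoost parameters, tested on a dataset, and compared with grid search, SSA, and an improved sparrow search algorithm based on sine and cosine (ISSA). The results show that XGBoost with GCOSSA-optimized parameters achieves higher accuracy in credit risk prediction.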
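The perturbation step described above (opposition-based learning, Cauchy mutation, and greedy selection) can be sketched as follows. The abstract gives no update formulas, so the ones here are commonly used textbook forms and should be read as assumptions: the opposite point is `lower + upper - x`, the mutation is `x + x * Cauchy(0, 1)`, and a candidate replaces the current best only if its fitness is strictly lower. The function names, bounds, and the sphere test function are likewise illustrative, and the golden sine update is not covered.

```python
import math
import random


def sphere(x):
    """Simple benchmark function with its minimum of 0 at the origin."""
    return sum(v * v for v in x)


def clip(x, lower, upper):
    """Keep every coordinate inside the search bounds."""
    return [min(max(v, lower), upper) for v in x]


def opposite(x, lower, upper):
    """Opposition-based learning: reflect each coordinate within the bounds."""
    return [lower + upper - v for v in x]


def cauchy_mutation(x, rng):
    """Scale each coordinate by a standard Cauchy draw (inverse-CDF sampling)."""
    return [v + v * math.tan(math.pi * (rng.random() - 0.5)) for v in x]


def perturb_best(best, fitness, lower, upper, rng):
    """Try both perturbations and keep a candidate only if it improves fitness."""
    candidates = [
        clip(opposite(best, lower, upper), lower, upper),
        clip(cauchy_mutation(best, rng), lower, upper),
    ]
    for candidate in candidates:
        if fitness(candidate) < fitness(best):  # greedy rule
            best = candidate
    return best


rng = random.Random(0)            # fixed seed so the run is repeatable
best = [3.0, -2.0, 4.0]           # arbitrary starting point
history = [sphere(best)]
for _ in range(50):
    best = perturb_best(best, sphere, -5.0, 10.0, rng)
    history.append(sphere(best))
```

Because of the greedy rule, the values in `history` never increase. The heavy tails of the Cauchy distribution occasionally produce large jumps, which is the usual argument for using it to escape local optima.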