Abstract
Suspended particulate matter (SPM) in lakes strongly affects light propagation and aquatic ecosystem productivity, and it co-varies with nutrients, heavy metals, and micro-pollutants in water. SPM absorbs and backscatters light strongly, which ultimately alters the water-leaving signal detected by satellite sensors. Simple regression models based on a specific band or on band ratios have been widely used to estimate SPM, with moderate accuracy. There is still room to improve model accuracy, and machine learning models may capture the non-linear relationships between spectral variables and SPM. We assembled more than 16,400 in situ SPM measurements from lakes on six continents (all except Antarctica), of which 9640 samples were matched with Landsat overpasses within ±7 days. Seven machine learning algorithms and two simple regression methods (linear and partial least squares models) were used to estimate SPM in lakes, and their performance was compared. To overcome the problem of imbalanced datasets in regression, the Synthetic Minority Over-Sampling Technique for Regression with Gaussian Noise (SMOGN) was adopted. The comparison showed that the gradient boosting decision tree (GBDT), random forest (RF), and extreme gradient boosting (XGBoost) models demonstrated good spatiotemporal transferability with the SMOGN-processed dataset and have the potential to map SPM in different years from good-quality Landsat land surface reflectance images. Among all the tested modeling approaches, the GBDT model calibrated accurately (n = 6428, R² = 0.95, MAPE = 29.8%) on SPM collected from 2235 lakes across the world, and its validation (n = 3214, R² = 0.84, MAPE = 38.8%) was also stable. The RF model also performed well on the calibration (R² = 0.93) and validation (R² = 0.86, MAPE = 24.2%) datasets. We applied the GBDT and RF models to map SPM in typical lakes and obtained satisfactory results. In addition, the GBDT model was evaluated against historical SPM measurements coincident with different Landsat sensors (L5-TM, L7-ETM+, and L8-OLI); the model therefore has the potential to map lake SPM, monitor its temporal variations, and track lake SPM dynamics over approximately the past four decades (1984-2021), since Landsat-5/TM was launched in 1984.
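As a rough illustration of two steps mentioned in the abstract, the sketch below shows how a field sample might be matched to a Landsat overpass within ±7 days and how R² and MAPE could be computed for a set of predictions. It is a minimal sketch, not the authors' code: the function names are hypothetical, R² is assumed to be the coefficient of determination (the study may instead report a squared correlation), and MAPE is assumed to be the mean of absolute relative errors expressed in percent.

```python
from datetime import date


def within_window(sample_date, overpass_date, max_days=7):
    """Return True if the overpass is within +/- max_days of the field sample."""
    return abs((overpass_date - sample_date).days) <= max_days


def r_squared(observed, predicted):
    """Coefficient of determination (assumed definition of R^2)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def mape(observed, predicted):
    """Mean absolute percentage error in percent (observed values must be non-zero)."""
    total = sum(abs((o - p) / o) for o, p in zip(observed, predicted))
    return 100.0 * total / len(observed)


if __name__ == "__main__":
    # Hypothetical values for illustration only; not data from the study.
    print(within_window(date(2020, 6, 1), date(2020, 6, 8)))
    print(r_squared([10.0, 20.0, 30.0], [12.0, 19.0, 29.0]))
    print(mape([10.0, 20.0, 30.0], [12.0, 19.0, 29.0]))
```

Because MAPE divides by the observed value, it weights errors on low-SPM samples heavily, which is one reason a model can show a high R² alongside a MAPE of several tens of percent.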
Funding
The research was jointly supported by the National Key Research and Development Project of China (2021YFB3901101), the National Natural Science Foundation of China (42171374, 42071336, 42001311, 42101366), the Natural Science Foundation of Jilin Province, China (20220203024SF), the Youth Innovation Promotion Association of Chinese Academy of Sciences, China (2020234), the Young Scientist Group Project of Northeast Institute of Geography and Agroecology, China (2023QNXZ01), Chinese Academy of Sciences, and the Postdoctoral Fellowship of Jilin Province of China to Yingxin Shang.