Funding: This research is supported by the MECW research program, the Centre for Advanced Middle Eastern Studies, Lund University.
Abstract: One important step in binary modeling of environmental problems is the generation of absence datasets. These are traditionally produced by random sampling, which can undermine the quality of the outputs. To solve this problem, this study develops the Absence Point Generation (APG) toolbox, a Python-based ArcGIS toolbox for the automated construction of absence datasets for geospatial studies. The APG employs a frequency ratio analysis of four commonly used driving factors (altitude, slope degree, topographic wetness index, and distance from rivers) and considers buffer and density layers of the presence locations to define the low-potential or low-susceptibility zones in which absence points are generated. To test the APG toolbox, we applied two benchmark algorithms, random forest (RF) and boosted regression trees (BRT), in a groundwater potential case study using three absence datasets: those generated by the APG, by random sampling, and by the selection of absence samples (SAS) toolbox. The BRT-APG and RF-APG models achieved area under the receiver operating characteristic curve (AUC) values of 0.947 and 0.942, respectively, while BRT and RF performed worse with the SAS and Random datasets. Using the APG dataset improved the AUC of BRT and RF by 7.2% and 9.7% relative to the Random dataset, and by 6.1% and 5.4% relative to the SAS dataset, respectively. The APG also changed the importance of the input factors and the pattern of the groundwater potential maps, which demonstrates the importance of absence points in binary environmental modeling. The proposed APG toolbox could easily be applied to other environmental hazards such as landslides, floods, gully erosion, and land subsidence.
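The frequency ratio step described in the abstract can be illustrated with a minimal sketch. This is not the APG toolbox's code and does not use ArcGIS: it assumes a single driving factor already reclassified into integer classes on a NumPy grid, uses the common definition of the frequency ratio (share of presence cells in a class divided by the share of all cells in that class), and omits the buffer and density layers as well as the combination of the four factors. The function names, the `threshold` parameter, and the toy raster are hypothetical.

```python
import numpy as np


def frequency_ratio(class_raster, presence_mask):
    """Frequency ratio per class: presence share divided by area share."""
    total_cells = class_raster.size
    total_presence = presence_mask.sum()
    fr = {}
    for c in np.unique(class_raster):
        in_class = class_raster == c
        presence_share = (presence_mask & in_class).sum() / total_presence
        area_share = in_class.sum() / total_cells
        fr[int(c)] = float(presence_share / area_share)
    return fr


def sample_absence(class_raster, presence_mask, n, threshold=1.0, seed=0):
    """Draw up to n absence cells from classes whose FR is below threshold.

    The threshold of 1.0 is an assumption for illustration: FR < 1 means the
    class holds fewer presence cells than its area alone would suggest.
    """
    fr = frequency_ratio(class_raster, presence_mask)
    low_classes = [c for c, value in fr.items() if value < threshold]
    # Candidate cells: low-FR classes, excluding the presence cells themselves.
    candidates = np.isin(class_raster, low_classes) & ~presence_mask
    rows, cols = np.nonzero(candidates)
    if rows.size == 0:
        return []
    rng = np.random.default_rng(seed)
    pick = rng.choice(rows.size, size=min(n, rows.size), replace=False)
    return list(zip(rows[pick].tolist(), cols[pick].tolist()))


# Toy 4x4 raster: the top two rows are class 1, the bottom two are class 2.
class_raster = np.array([[1, 1, 1, 1],
                         [1, 1, 1, 1],
                         [2, 2, 2, 2],
                         [2, 2, 2, 2]])
presence_mask = np.zeros(class_raster.shape, dtype=bool)
for r, c in [(0, 0), (0, 1), (1, 2), (3, 3)]:
    presence_mask[r, c] = True

fr = frequency_ratio(class_raster, presence_mask)
absence_points = sample_absence(class_raster, presence_mask, n=5)
print(fr)
print(absence_points)
```

In this toy grid, class 1 holds three of the four presence cells on half of the area and class 2 holds one, so absence cells are drawn only from class 2, never from a presence cell. A real workflow would repeat this per factor and intersect the resulting low-ratio zones with the buffer and density constraints the abstract mentions.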