Abstract
Landslide susceptibility mapping is a typical two-class classification problem in which the generation of pseudo-absence (non-slide) data plays an important role. This paper presents a new method, target space exteriorization sampling (TSES), which generates pseudo-absence data from presence data directly in feature space. TSES exteriorizes a presence sample into a pseudo-absence sample by replacing the value of one of its features with a new value that lies outside the range of that feature over all presence data. The method is compared with two existing methods, buffer controlled sampling (BCS) and iteratively refined sampling (IRS), in a study area in Shenzhen. The pseudo-absence data generated by each of the three methods are organized into 20 subsets of increasing size to study the effect of the proportion of pseudo-absence data to presence data. Landslide susceptibility maps of the study area are computed from all of these datasets with a generalized additive model (GAM). A 10-fold cross-validation shows that the TSES-based and IRS-based models have similar AUC values, both greater than that of the BCS-based model, and that TSES outperforms BCS and IRS in prediction efficiency. The TSES results also have more reasonable spatial and histogram distributions than those of BCS and IRS, which supports dividing an area into more susceptibility ranks, whereas IRS tends to split the whole study area into two susceptibility extremes. Finally, with BCS the proportion of pseudo-absence data to presence data should be about 50% to obtain an acceptable result, while for IRS or TSES the minimum proportion is 40%.
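The exteriorization step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which feature is replaced or how the new value is chosen, so the random choice of feature, the random choice of side (below the minimum or above the maximum), the `min_frac`/`max_frac` offsets, and the function name `tses_pseudo_absence` are all assumptions made here for illustration. The sketch also ignores whether the new value is physically meaningful for a given feature.

```python
import random

def tses_pseudo_absence(presence, n_samples, min_frac=0.01, max_frac=0.10, seed=None):
    """Hypothetical TSES-style sampler: copy a presence sample and push
    exactly one of its features outside the range seen in the presence data."""
    rng = random.Random(seed)
    n_features = len(presence[0])
    # Per-feature value range over all presence samples.
    lows = [min(row[j] for row in presence) for j in range(n_features)]
    highs = [max(row[j] for row in presence) for j in range(n_features)]

    pseudo = []
    for _ in range(n_samples):
        sample = list(rng.choice(presence))  # copy one presence sample
        j = rng.randrange(n_features)        # assumption: feature picked at random
        span = (highs[j] - lows[j]) or 1.0   # fall back to 1.0 for a constant feature
        # Assumption: step past the range by 1% to 10% of its width.
        step = span * rng.uniform(min_frac, max_frac)
        if rng.random() < 0.5:
            sample[j] = lows[j] - step       # below the presence minimum
        else:
            sample[j] = highs[j] + step      # above the presence maximum
        pseudo.append(sample)
    return pseudo

# Toy presence data with two made-up features per sample.
presence = [[10.0, 0.2], [20.0, 0.5], [30.0, 0.9]]
pseudo_absence = tses_pseudo_absence(presence, n_samples=50, seed=0)
```

Each generated sample differs from a presence sample in a single feature, so the pseudo-absence set sits just outside the region of feature space occupied by the presence data. Varying `n_samples` relative to the number of presence samples would correspond to the proportions studied in the paper.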
Funding
Supported by the Research Fund from Hong Kong Polytechnic University (Grant Nos. G-U632, G-YF24)
National Key Technologies Research and Development Program of China (Grant Nos. 2008BAJ11B04, 2006BAJ14B04)
National Natural Science Foundation of China (Grant Nos. 40928001, 40701134, 40771171)
National High Technology Research and Development Program of China ("863" Program) (Grant No. 2007AA120502)