Funding: the Natural Sciences and Engineering Research Council (NSERC) of Canada, which supported and funded this project through an NSERC Postgraduate Scholarship - Doctoral (PGS-D).
Abstract: In digital soil mapping (DSM), a fundamental assumption is that the spatial variability of the target variable can be explained by the predictors, or environmental covariates. Strategies to adequately sample the predictors have been well documented, with the conditioned Latin hypercube sampling (cLHS) algorithm receiving the most attention in the DSM community. Despite advances in sampling design, a critical gap remains in determining the number of samples required for DSM projects. We propose a simple workflow, and a function coded in the R language, to determine the minimum sample size for the cLHS algorithm from histograms of the predictor variables, using the Freedman-Diaconis rule to set the optimal bin width. Data preprocessing was included to correct for multimodal and non-normally distributed data, as these can affect sample size determination from the histogram. Based on a user-selected quantile range (QR) for the sample plan, the densities of the histogram bins at the upper and lower bounds of the QR were used as a scaling factor to determine the minimum sample size. This technique was applied to a field-scale set of environmental covariates for a well-sampled agricultural study site near Guelph, Ontario, Canada, and tested across a range of QRs. The results showed that the minimum sample size increased with the QR selected: from 44 to 83 as the QR increased from 50% to 95%, and then sharply to 194 for the 99% QR. This technique provides an estimate of the minimum sample size that can be used as an input to the cLHS algorithm.
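To make the described workflow concrete, the sketch below (in R, the language the published function also uses) estimates a minimum sample size for a single covariate. The abstract does not give the exact scaling rule, so the step that converts the bin densities at the QR bounds into a sample size, and the names min_sample_size_fd, qr, and covariates, are illustrative assumptions rather than the authors' implementation.

min_sample_size_fd <- function(x, qr = 0.95) {
  x <- x[is.finite(x)]
  # Freedman-Diaconis bin width: 2 * IQR / n^(1/3)
  bw <- 2 * IQR(x) / length(x)^(1/3)
  h <- hist(x, breaks = seq(min(x), max(x) + bw, by = bw), plot = FALSE)
  # Lower and upper bounds of the quantile range, e.g. 2.5% and 97.5% for a 95% QR
  bounds <- quantile(x, probs = c((1 - qr) / 2, 1 - (1 - qr) / 2))
  # Densities of the histogram bins containing the QR bounds
  dens <- h$density[findInterval(bounds, h$breaks, all.inside = TRUE)]
  # Assumed scaling: enough samples that the sparser of the two boundary bins
  # is still expected to receive at least one sample
  ceiling(1 / (min(dens) * bw))
}

# Example with a simulated covariate; the result could then be passed to the
# cLHS algorithm, e.g. clhs::clhs(covariates, size = n_min), where 'covariates'
# is a hypothetical data frame of the predictor values
set.seed(1)
n_min <- min_sample_size_fd(rnorm(5000), qr = 0.95)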
Funding: Canadian Space Agency [Grant Number: New Directions Grant]; Ontario Ministry of Agriculture, Food and Rural Affairs (OMAFRA); GEOmatics for Informed DEcisions (GEOIDE); Ontario Research Fund. The authors would like to acknowledge the Canadian Space Agency, GEOIDE, the Ontario Research Fund, and OMAFRA for providing research funding.
Abstract: Digital elevation models (DEMs) are a necessary dataset for modelling the Earth's surface; however, all DEMs contain error. Because numerous DEMs can be available for a region, researchers can reduce this error using DEM fusion techniques. However, the use of a clustering algorithm in DEM fusion has not been previously reported. In this study, a new DEM fusion algorithm is presented that operates on multiple DEMs, exploiting consistency among the elevation estimates as an indicator of accuracy and precision. The fusion approach includes slope and elevation thresholding, k-means clustering of the elevation estimates at each cell location, and filtering and smoothing of the fusion product. Corroboration of the input DEMs, and of the products of each step of the fusion algorithm, with a higher-accuracy reference DEM enabled a detailed analysis of the effectiveness of the DEM fusion algorithm. The main findings of the research were that the k-means clustering of the elevations reduced the precision, which also impacted the overall accuracy of the estimates, and that the number of final cluster members and the standard deviation of the elevations before clustering were both strongly related to the error in the k-means estimates.
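As one possible reading of the per-cell clustering step, the sketch below fuses the elevation estimates from several DEMs at a single cell with k-means and keeps the centre of the largest cluster. The choice of k = 2, the largest-cluster rule, and the name fuse_cell_kmeans are assumptions rather than the published algorithm, and the thresholding, filtering, and smoothing steps are omitted.

fuse_cell_kmeans <- function(z, k = 2) {
  z <- z[is.finite(z)]
  # Too few distinct estimates to cluster: fall back to a simple mean
  if (length(unique(z)) <= k) return(mean(z))
  km <- kmeans(z, centers = k, nstart = 10)
  # Consistency as an indicator of accuracy: keep the cluster with the most
  # member DEMs and use its centre as the fused elevation
  best <- which.max(tabulate(km$cluster))
  fused <- km$centers[best, 1]
  # Diagnostics echoing the reported error predictors: final cluster size and
  # the standard deviation of the estimates before clustering
  attr(fused, "n_members") <- sum(km$cluster == best)
  attr(fused, "sd_before") <- sd(z)
  fused
}

# Example: five DEM estimates at one cell, one of them an outlier
fuse_cell_kmeans(c(231.2, 231.5, 231.3, 238.9, 231.4))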