Abstract: Quantitative description of geochemical patterns and the production of geochemical anomaly maps are important in applied geochemistry. Several statistical methods have been proposed to identify and separate geochemical anomalies. The U-statistic method is one of the most important structural methods; it is a kind of weighted mean in which the points surrounding each sample are taken into account when determining the U value. However, it can separate anomalies based on only one variable. The main aim of this study is to extend the method to a multivariate mode. For this purpose, the U-statistic method is combined with a multivariate method that assigns a new value to each sample based on several variables. In the first step, the optimum p for the p-norm distance is calculated; the U-statistic method is then applied to the p-norm distance values of the samples, since the p-norm distance is computed from several variables. This combination of the U-statistic method and the p-norm distance is used for the first time in this research. Results show that, for Au and As, the p-norm distance with p=2 (Euclidean distance) can be considered the optimized p-norm distance with the lowest error. The samples identified as anomalous by the combined method are more regular, less dispersed and more accurate than those obtained using the U-statistic alone or nonstructural methods such as Mahalanobis distance. The combined results are also closely associated with the defined Au ore indication within the study area. Finally, univariate and bivariate geochemical anomaly maps are provided for Au and As, prepared using the U-statistic method and its combination with the Euclidean distance method, respectively.
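As a rough illustration of the multivariate step described above, the p-norm (Minkowski) distance collapses several variables into one value per sample. The sketch below is not the study's implementation: the sample values, the choice of the origin as reference point, and the function name `p_norm_distance` are all assumptions made for illustration, and the U-statistic step itself is not shown.

```python
def p_norm_distance(sample, reference, p):
    """Minkowski (p-norm) distance between two equal-length vectors."""
    if p < 1:
        raise ValueError("p must be >= 1 for a valid norm")
    return sum(abs(a - b) ** p for a, b in zip(sample, reference)) ** (1.0 / p)


# Hypothetical standardized (Au, As) values for three samples (made-up data).
samples = [(0.0, 0.0), (3.0, 4.0), (1.0, 1.0)]
# Assumed reference point: the origin of the standardized variable space.
origin = (0.0, 0.0)

# p = 2 gives the Euclidean distance, p = 1 the city block distance.
euclid = [p_norm_distance(s, origin, 2) for s in samples]
city = [p_norm_distance(s, origin, 1) for s in samples]
```

Each sample now carries a single distance value, which is the kind of univariate quantity a method such as the U-statistic could then be applied to.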
Funding: Project supported by the National Natural Science Foundation of China (No. 30270759), the Cooperation Project in Science and Technology between China and Poland Governments (No. 32-38), and the Scientific Research Foundation for Doctors in Shandong Academy of Agricultural Sciences (No. [2007]20), China
Abstract: One hundred and sixty-eight genotypes of cotton from the same growing region were used as a germplasm group to study the validity of different genetic distances in constructing a cotton core subset. A mixed linear model approach was employed to obtain unbiased predictions of the genotypic values of 20 traits, eliminating the environmental effect. Six commonly used genetic distances (Euclidean, standardized Euclidean, Mahalanobis, city block, cosine and correlation distances), combined with four commonly used hierarchical cluster methods (single distance, complete distance, unweighted pair-group average and Ward's methods), were used in the least distance stepwise sampling (LDSS) method to construct different core subsets. Analyses of variance (ANOVA) of the evaluating parameters showed that cosine and correlation distances were less valid than Euclidean, standardized Euclidean, Mahalanobis and city block distances. Standardized Euclidean distance was slightly more effective than Euclidean, Mahalanobis and city block distances. The principal analysis validated standardized Euclidean distance in the course of constructing practical core subsets. When Mahalanobis distance was used to calculate genetic distance at low sampling percentages, the covariance matrix of accessions might be ill-conditioned, which led to bias in the construction of small core subsets. The standardized Euclidean distance is recommended for core subset construction with the LDSS method.
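To make the difference between two of the distances named above concrete, here is a minimal sketch comparing Euclidean and standardized Euclidean distance. The trait values are invented for illustration and the function names are hypothetical; the sketch only shows how dividing by each trait's variance keeps a large-scale trait from dominating the distance.

```python
import math
import statistics


def euclidean(x, y):
    """Plain Euclidean distance between two trait vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def standardized_euclidean(x, y, variances):
    """Euclidean distance with each squared difference divided by the trait variance."""
    return math.sqrt(sum((a - b) ** 2 / v for a, b, v in zip(x, y, variances)))


# Hypothetical genotypic values: rows are accessions, columns are traits.
# The second trait is on a much larger scale than the first.
data = [
    [1.0, 100.0],
    [2.0, 300.0],
    [3.0, 200.0],
]

# Sample variance of each trait (column) across all accessions.
variances = [statistics.variance(col) for col in zip(*data)]

d_raw = euclidean(data[0], data[1])
d_std = standardized_euclidean(data[0], data[1], variances)
```

With these made-up numbers the raw distance is driven almost entirely by the second trait, while the standardized distance weights both traits on a comparable scale.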
Abstract: Facility location problems are concerned with locating one or more facilities in a way that optimizes a certain objective, such as minimizing transportation cost, providing equitable service to customers, or capturing the largest market share. Many facility location decisions involving distance objective functions on a spherical surface have been approached using algorithmic methods, metaheuristic algorithms, branch-and-bound algorithms, approximation algorithms, simulation, heuristic techniques, and decomposition methods. These approaches are mostly based on Euclidean distance or great circle distance functions. However, if the location points are widely separated, the difference between driving distance, Euclidean distance and great circle distance may be significant, and this may lead to significant variations in the locations of the corresponding optimal source points. This paper presents a framework and algorithm for using driving distances on a spherical surface, explores its use as a facility location decision tool, and helps companies assess the optimal locations of facilities.
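The gap between Euclidean and great circle distance for widely separated points can be illustrated with a short sketch. It assumes a perfectly spherical Earth with a mean radius of 6371 km and uses the haversine formula; the function names are hypothetical. Driving distance is not computed here, because it depends on a road network or routing service that the abstract does not specify.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius (spherical model)


def _haversine_term(lat1, lon1, lat2, lon2):
    """Return sin^2(theta / 2), where theta is the central angle between the points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return min(1.0, a)  # guard against rounding slightly above 1


def great_circle_km(lat1, lon1, lat2, lon2, radius=EARTH_RADIUS_KM):
    """Distance along the sphere's surface (haversine formula)."""
    a = _haversine_term(lat1, lon1, lat2, lon2)
    return 2 * radius * math.asin(math.sqrt(a))


def chord_km(lat1, lon1, lat2, lon2, radius=EARTH_RADIUS_KM):
    """Straight-line (Euclidean) distance through the sphere: 2 R sin(theta / 2)."""
    a = _haversine_term(lat1, lon1, lat2, lon2)
    return 2 * radius * math.sqrt(a)


# Two points a quarter of the way around the equator.
surface = great_circle_km(0.0, 0.0, 0.0, 90.0)
straight = chord_km(0.0, 0.0, 0.0, 90.0)
gap = surface - straight
```

For nearby points the two values are almost identical; the gap grows with separation, which is the effect the abstract points to.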
Abstract: When the coordinates of a set of points are known, the pairwise Euclidean distances among the points can be easily computed. Conversely, if the Euclidean distance matrix is given, a set of coordinates for those points can be computed through the well-known classical Multi-Dimensional Scaling (MDS). In this paper, we consider the case where some of the distances are far from accurate (containing large noise or even missing). In such a situation, the order of the known distances (i.e., some distances are larger than others) is valuable information that often yields a far more accurate reconstruction of the points than using only the magnitudes of the known distances. The methods that make use of the order information are collectively known as nonmetric MDS. A challenging computational issue for all existing nonmetric MDS methods is that there are often a large number of ordinal constraints. In this paper, we cast this problem as a matrix optimization problem with ordinal constraints. We then adapt an existing smoothing Newton method to our matrix problem. Extensive numerical results demonstrate the efficiency of the algorithm, which can potentially handle a very large number of ordinal constraints.
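The first step of classical MDS mentioned above can be sketched in a few lines: double-centering the squared distance matrix recovers the Gram (inner product) matrix of the centered coordinates. The sketch below only checks that identity on three invented points; the eigendecomposition that would then yield coordinates, and the nonmetric method of the paper, are not shown. All function names are hypothetical.

```python
def squared_distance_matrix(points):
    """Matrix of squared pairwise Euclidean distances."""
    return [[sum((a - b) ** 2 for a, b in zip(p, q)) for q in points] for p in points]


def double_center(d2):
    """Compute B = -1/2 * J * D2 * J with J = I - (1/n) * ones * ones^T."""
    n = len(d2)
    row_mean = [sum(row) / n for row in d2]
    col_mean = [sum(d2[i][j] for i in range(n)) / n for j in range(n)]
    total_mean = sum(row_mean) / n
    return [
        [-0.5 * (d2[i][j] - row_mean[i] - col_mean[j] + total_mean) for j in range(n)]
        for i in range(n)
    ]


def centered_gram(points):
    """Inner products of the points after subtracting their centroid."""
    n = len(points)
    dim = len(points[0])
    centroid = [sum(p[k] for p in points) / n for k in range(dim)]
    centered = [[p[k] - centroid[k] for k in range(dim)] for p in points]
    return [[sum(a * b for a, b in zip(u, v)) for v in centered] for u in centered]


# Made-up example points in the plane.
points = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0)]
b = double_center(squared_distance_matrix(points))
g = centered_gram(points)
max_diff = max(abs(b[i][j] - g[i][j]) for i in range(3) for j in range(3))
```

Because the two matrices agree, coordinates can be recovered from distances alone (up to rotation and translation) by factoring the double-centered matrix.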
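Abstract: A strategy for constructing a core collection was studied using 180 collected accessions of vegetable pea. Pods per plant, seeds per pod, pod length, pod width, pod thickness, fresh weight per 100 pods, fresh weight per 100 seeds, and yield were surveyed for all accessions; the results showed that the collected materials have rich genetic diversity. Using these data, the minimum distance stepwise sampling (LDSS) method was applied with 4 genetic distances and 8 sampling proportions, and the construction strategies were evaluated with two parameters, the coincidence rate of range (CR) and the variable rate of coefficient of variation (VR). Principal component analysis and cluster analysis were also used to assess how representative the constructed core collections were. The results showed that, for constructing a vegetable pea core collection with the LDSS method, the best genetic distance was Euclidean distance and the best sampling proportion was 25%. This construction strategy lays a foundation for building and efficiently using vegetable pea core collections.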
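The two evaluation parameters named above can be sketched as follows. The abstract does not give their formulas, so this assumes the definitions commonly used in core collection studies: CR is the mean over traits of the core range divided by the initial range, and VR is the mean over traits of the core coefficient of variation divided by the initial one, both expressed as percentages. The data and function names are invented for illustration.

```python
import statistics


def coincidence_rate_of_range(initial, core):
    """CR (%): mean over traits of (range in core) / (range in initial collection)."""
    ratios = []
    for init_col, core_col in zip(zip(*initial), zip(*core)):
        ratios.append((max(core_col) - min(core_col)) / (max(init_col) - min(init_col)))
    return 100.0 * sum(ratios) / len(ratios)


def variable_rate_of_cv(initial, core):
    """VR (%): mean over traits of (CV in core) / (CV in initial collection)."""

    def cv(col):
        # Coefficient of variation: sample standard deviation over mean.
        return statistics.stdev(col) / statistics.mean(col)

    ratios = [cv(c) / cv(i) for i, c in zip(zip(*initial), zip(*core))]
    return 100.0 * sum(ratios) / len(ratios)


# Hypothetical data: rows are accessions, columns are traits.
initial = [
    [1.0, 10.0],
    [2.0, 20.0],
    [3.0, 30.0],
    [4.0, 40.0],
    [5.0, 50.0],
]
# A made-up core subset that keeps the extremes and the middle accession.
core = [initial[0], initial[2], initial[4]]

cr = coincidence_rate_of_range(initial, core)
vr = variable_rate_of_cv(initial, core)
```

Under these assumed definitions, a core subset that retains the extreme accessions keeps CR at 100% and pushes VR above 100%, since variation is preserved while redundancy is removed.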