Clustering is a group of unsupervised statistical techniques commonly used in many disciplines. Considering their applications to fish abundance data, many technical details need to be considered to ensure reasonable ...Clustering is a group of unsupervised statistical techniques commonly used in many disciplines. Considering their applications to fish abundance data, many technical details need to be considered to ensure reasonable interpretation. However, the reliability and stability of the clustering methods have rarely been studied in the contexts of fisheries. This study presents an intensive evaluation of three common clustering methods, including hierarchical clustering(HC), K-means(KM), and expectation-maximization(EM) methods, based on fish community surveys in the coastal waters of Shandong, China. We evaluated the performances of these three methods considering different numbers of clusters, data size, and data transformation approaches, focusing on the consistency validation using the index of average proportion of non-overlap(APN). The results indicate that the three methods tend to be inconsistent in the optimal number of clusters. EM showed relatively better performances to avoid unbalanced classification, whereas HC and KM provided more stable clustering results. Data transformation including scaling, square-root, and log-transformation had substantial influences on the clustering results, especially for KM. Moreover, transformation also influenced clustering stability, wherein scaling tended to provide a stable solution at the same number of clusters. The APN values indicated improved stability with increasing data size, and the effect leveled off over 70 samples in general and most quickly in EM. We conclude that the best clustering method can be chosen depending on the aim of the study and the number of clusters. In general, KM is relatively robust in our tests. We also provide recommendations for future application of clustering analyses. This study is helpful to ensure the credibility of the application and interpretation of clustering methods.展开更多
The upper bound of the optimal number of clusters in clustering algorithm is studied in this paper. A new method is proposed to solve this issue. This method shows that the rule cmax≤n, which is popular in current pa...The upper bound of the optimal number of clusters in clustering algorithm is studied in this paper. A new method is proposed to solve this issue. This method shows that the rule cmax≤n, which is popular in current papers, is reasonable in some sense. The above conclusion is tested and analyzed by some typical examples in the literature, which demonstrates the validity of the new method.展开更多
Refined 3D modeling of mine slopes is pivotal for precise prediction of geological hazards.Aiming at the inadequacy of existing single modeling methods in comprehensively representing the overall and localized charact...Refined 3D modeling of mine slopes is pivotal for precise prediction of geological hazards.Aiming at the inadequacy of existing single modeling methods in comprehensively representing the overall and localized characteristics of mining slopes,this study introduces a new method that fuses model data from Unmanned aerial vehicles(UAV)tilt photogrammetry and 3D laser scanning through a data alignment algorithm based on control points.First,the mini batch K-Medoids algorithm is utilized to cluster the point cloud data from ground 3D laser scanning.Then,the elbow rule is applied to determine the optimal cluster number(K0),and the feature points are extracted.Next,the nearest neighbor point algorithm is employed to match the feature points obtained from UAV tilt photogrammetry,and the internal point coordinates are adjusted through the distanceweighted average to construct a 3D model.Finally,by integrating an engineering case study,the K0 value is determined to be 8,with a matching accuracy between the two model datasets ranging from 0.0669 to 1.0373 mm.Therefore,compared with the modeling method utilizing K-medoids clustering algorithm,the new modeling method significantly enhances the computational efficiency,the accuracy of selecting the optimal number of feature points in 3D laser scanning,and the precision of the 3D model derived from UAV tilt photogrammetry.This method provides a research foundation for constructing mine slope model.展开更多
基金provided by the Marine S&T Fund of Shandong Province for Pilot National Laboratory for Marine Science and Technology (Qingdao) (No.2018SDKJ0501-2)。
文摘Clustering is a group of unsupervised statistical techniques commonly used in many disciplines. Considering their applications to fish abundance data, many technical details need to be considered to ensure reasonable interpretation. However, the reliability and stability of the clustering methods have rarely been studied in the contexts of fisheries. This study presents an intensive evaluation of three common clustering methods, including hierarchical clustering(HC), K-means(KM), and expectation-maximization(EM) methods, based on fish community surveys in the coastal waters of Shandong, China. We evaluated the performances of these three methods considering different numbers of clusters, data size, and data transformation approaches, focusing on the consistency validation using the index of average proportion of non-overlap(APN). The results indicate that the three methods tend to be inconsistent in the optimal number of clusters. EM showed relatively better performances to avoid unbalanced classification, whereas HC and KM provided more stable clustering results. Data transformation including scaling, square-root, and log-transformation had substantial influences on the clustering results, especially for KM. Moreover, transformation also influenced clustering stability, wherein scaling tended to provide a stable solution at the same number of clusters. The APN values indicated improved stability with increasing data size, and the effect leveled off over 70 samples in general and most quickly in EM. We conclude that the best clustering method can be chosen depending on the aim of the study and the number of clusters. In general, KM is relatively robust in our tests. We also provide recommendations for future application of clustering analyses. This study is helpful to ensure the credibility of the application and interpretation of clustering methods.
基金This work was supported by the National Natural Science Foundation of China (Grant Nos. 69872003 and 40035010)
文摘The upper bound of the optimal number of clusters in clustering algorithm is studied in this paper. A new method is proposed to solve this issue. This method shows that the rule cmax≤n, which is popular in current papers, is reasonable in some sense. The above conclusion is tested and analyzed by some typical examples in the literature, which demonstrates the validity of the new method.
基金funded by National Natural Science Foundation of China(Grant Nos.42272333,42277147).
文摘Refined 3D modeling of mine slopes is pivotal for precise prediction of geological hazards.Aiming at the inadequacy of existing single modeling methods in comprehensively representing the overall and localized characteristics of mining slopes,this study introduces a new method that fuses model data from Unmanned aerial vehicles(UAV)tilt photogrammetry and 3D laser scanning through a data alignment algorithm based on control points.First,the mini batch K-Medoids algorithm is utilized to cluster the point cloud data from ground 3D laser scanning.Then,the elbow rule is applied to determine the optimal cluster number(K0),and the feature points are extracted.Next,the nearest neighbor point algorithm is employed to match the feature points obtained from UAV tilt photogrammetry,and the internal point coordinates are adjusted through the distanceweighted average to construct a 3D model.Finally,by integrating an engineering case study,the K0 value is determined to be 8,with a matching accuracy between the two model datasets ranging from 0.0669 to 1.0373 mm.Therefore,compared with the modeling method utilizing K-medoids clustering algorithm,the new modeling method significantly enhances the computational efficiency,the accuracy of selecting the optimal number of feature points in 3D laser scanning,and the precision of the 3D model derived from UAV tilt photogrammetry.This method provides a research foundation for constructing mine slope model.