Funding: Supported by the Open Research Fund Program of LIESMARS (WKL(00)0302)
Abstract: Clustering is a useful data mining technique for discovering interesting distributions and patterns in the underlying data, with applications in fields such as statistical data analysis, pattern recognition, and image processing. We combine sampling with the DBSCAN algorithm to cluster large spatial databases and develop two sampling-based DBSCAN (SDBSCAN) algorithms. One introduces sampling inside DBSCAN; the other applies a sampling procedure outside DBSCAN. Experimental results demonstrate that both algorithms are effective and efficient in clustering large-scale spatial databases.
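The idea of sampling outside DBSCAN can be illustrated with a minimal sketch. The abstract does not describe the SDBSCAN algorithms in detail, so everything below is an assumption made for illustration: a plain, unindexed DBSCAN is run on a random sample, and each point of the full dataset is then given the label of its nearest clustered sample point within `eps`. The names `dbscan`, `sampled_dbscan`, and `sample_size` are hypothetical.

```python
import math
import random

def region_query(points, i, eps):
    """Indices of all points within eps of points[i] (including i itself)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

def dbscan(points, eps, min_pts):
    """Plain DBSCAN; returns one label per point, with -1 meaning noise."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = -1  # provisionally noise; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbors)
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nb = region_query(points, j, eps)
            if len(nb) >= min_pts:
                queue.extend(nb)  # j is a core point, so keep expanding
    return labels

def sampled_dbscan(points, eps, min_pts, sample_size, seed=0):
    """Cluster a random sample, then label every point from the sample.

    Assumption for illustration only: a point inherits the label of its
    nearest clustered sample point within eps, otherwise it is noise (-1).
    """
    rng = random.Random(seed)
    sample = [points[i] for i in rng.sample(range(len(points)), sample_size)]
    sample_labels = dbscan(sample, eps, min_pts)
    labels = []
    for p in points:
        best, best_d = -1, eps
        for s, lab in zip(sample, sample_labels):
            d = math.dist(p, s)
            if lab != -1 and d <= best_d:
                best, best_d = lab, d
        labels.append(best)
    return labels
```

Because DBSCAN only sees the sample, `eps` and `min_pts` generally need retuning for the reduced density; the sketch leaves that choice to the caller.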
Funding: Projects (41161020, 41261026) supported by the National Natural Science Foundation of China; Project (BQD2012013) supported by the Research Starting Funds for Imported Talents, Ningxia University, China; Project (ZR1209) supported by the Natural Science Funds, Ningxia University, China; Project (NGY2013005) supported by the Key Science Project of Colleges and Universities in Ningxia, China
Abstract: To develop a better approach for the spatial evaluation of drinking water quality, an intelligent evaluation method integrating a geographical information system (GIS) and an ant colony clustering algorithm (ACCA) was used. Drinking water samples from 29 wells in Zhenping County, China, were collected and analyzed. Thirty-five water quality parameters were selected, including chloride concentration, sulphate concentration, total hardness, nitrate concentration, fluoride concentration, turbidity, pH, chromium concentration, COD, bacterial count, total coliforms, and color. For each of the 35 parameters, the best spatial interpolation method was selected from the interpolation methods available in the GIS environment according to the minimum cross-validation error. The ACCA was improved through three strategies: a mixed distance function, an average similitude degree, and probability conversion functions. The ACCA was then run to obtain the water quality grades in the GIS environment. Finally, the results from the ACCA were compared with those from a competitive Hopfield neural network (CHNN) to validate the feasibility and effectiveness of the ACCA according to three evaluation indexes: stochastic sampling method, pixel amount, and convergence speed. The spatial water quality grades obtained from the ACCA were shown to be more effective, accurate, and intelligent than those obtained from the CHNN.
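Selecting an interpolation method by minimum cross-validation error can be sketched as follows. The abstract does not name the candidate methods, so this example assumes, purely for illustration, that the candidates are inverse distance weighting (IDW) with different power values and that the error measure is the leave-one-out root-mean-square error. The function names and the `powers` tuple are hypothetical.

```python
import math

def idw(known, x, y, power):
    """Inverse-distance-weighted estimate at (x, y) from (xi, yi, value) samples."""
    num = den = 0.0
    for xi, yi, v in known:
        d = math.hypot(x - xi, y - yi)
        if d == 0:
            return v  # exactly on a sample location: return its value
        w = 1.0 / d ** power
        num += w * v
        den += w
    return num / den

def loocv_rmse(samples, power):
    """Leave-one-out RMSE: predict each sample from all the others."""
    sq = 0.0
    for i, (x, y, v) in enumerate(samples):
        rest = samples[:i] + samples[i + 1:]
        sq += (idw(rest, x, y, power) - v) ** 2
    return math.sqrt(sq / len(samples))

def best_power(samples, powers=(1, 2, 3)):
    """Candidate power with the smallest leave-one-out RMSE."""
    return min(powers, key=lambda p: loocv_rmse(samples, p))
```

In the setting the abstract describes, the selection would be repeated once per water quality parameter, so different parameters may end up with different interpolation methods.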
Abstract: Quick and accurate extraction of the dominant colors of background images is the basis of adaptive camouflage design. This paper proposes a Color Image Quick Fuzzy C-Means (CIQFCM) clustering algorithm based on clustering spatial mapping. First, the clustering sample space is mapped from the image pixels to the quantized color space, and several methods are adopted to reduce the number of clustering samples. Then, an improved pedigree clustering algorithm is applied to obtain the initial class centers. Finally, the CIQFCM clustering algorithm is used to quickly extract the dominant colors of the background image. After a theoretical analysis of the effectiveness and efficiency of the CIQFCM algorithm, several experiments were carried out to select a proper quantization interval and to verify the algorithm's effectiveness and efficiency. The results indicate that the quantization interval should be set to 4 and that the proposed algorithm improves clustering efficiency while maintaining clustering effectiveness. In addition, as the image size increased from 128×128 to 1024×1024, the efficiency improvement of the CIQFCM algorithm rose from 6.44 times to 36.42 times, which demonstrates its significant advantage in extracting dominant colors from large images.
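The first step, mapping the clustering samples from image pixels to a quantized color space, can be sketched as below. The abstract gives the quantization interval (4) but not the color space or the binning rule, so uniform per-channel binning of 8-bit RGB values is an assumption, and `quantize_pixels` and `bin_center` are hypothetical names.

```python
from collections import Counter

def quantize_pixels(pixels, interval=4):
    """Map (r, g, b) pixels to quantized color bins and count pixels per bin."""
    counts = Counter()
    for r, g, b in pixels:
        counts[(r // interval, g // interval, b // interval)] += 1
    return counts

def bin_center(bin_index, interval=4):
    """Representative color at the middle of a quantized bin."""
    return tuple(c * interval + (interval - 1) / 2 for c in bin_index)
```

The clustering step would then operate on the distinct bins, weighted by their pixel counts, instead of on every pixel. The number of distinct bins is bounded by the color space rather than the image size, which is consistent with the larger gains the abstract reports for larger images.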
Abstract: This paper introduces several definitions and a set of computed indexes, and then presents an algorithm for comparing spatial clustering results across different clustering themes. The research shows that, with traditional spatial clustering as the first step, valuable spatial correlation patterns can be found by comparing clustering results across multiple themes. These patterns reveal how the themes relate to one another and thus give a deeper understanding of the spatial entities studied. An example demonstrates the principle and process of the method.
Abstract: In geographic information system (GIS) spatial data operations, queries are much more frequent than insertions and deletions. This paper proposes a new hybrid spatial clustering method (HCR) for building R-trees for GIS spatial data. The method draws on the aggregation property of clustering, the construction rules of the R-tree, and the characteristics of spatial data. The R-tree built with the HCR algorithm is called the HCR-tree. To test its efficiency, the HCR algorithm was applied both to the data organization of a static R-tree and to node splitting in a dynamic R-tree. The results show that an R-tree built with HCR offers advantages such as higher search efficiency and fewer disk accesses.
Funding: Supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Korea government (MSIT) (No. 2021R1F1A1049387).
Abstract: As the location of numerous Internet of Things (IoT) devices can be recognized through IoT sensor technology, the need for technology to analyze spatial data efficiently is increasing. One well-known algorithm for grouping dense data into clusters is Density-Based Spatial Clustering of Applications with Noise (DBSCAN). Existing DBSCAN research focuses on efficiently finding clusters in numeric or categorical data. In this paper, we propose the novel problem of discovering a set of adjacent clusters among the cluster results derived for each keyword in the keyword-based DBSCAN algorithm. With the existing DBSCAN algorithm, finding adjacent clusters among the resulting clusters requires evaluating every possible combination. To solve this problem, we developed the Genetic algorithm-based Keyword Matching DBSCAN (GKM-DBSCAN) algorithm, which applies a genetic algorithm to discover the set of adjacent clusters among the cluster results derived for each keyword. To improve the performance of GKM-DBSCAN, we improved the general genetic algorithm by performing genetic operations in groups. We conducted extensive experiments on both real and synthetic datasets to compare GKM-DBSCAN with the brute-force method. The experimental results show that GKM-DBSCAN outperforms the brute-force method by up to 21 times. GKM-DBSCAN with index number binarization (INB) is 1.8 times faster than GKM-DBSCAN with cluster number binarization (CNB).
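Abstract: To address the urban traffic congestion, and even traffic accidents, caused by taxis stopping at random, this study uses taxi GPS (Global Positioning System) data from a real area of Chengdu together with crawled POI (Point of Interest) data. The DBSCAN (Density-Based Spatial Clustering of Applications with Noise) clustering algorithm is applied to pick-up and drop-off points to obtain taxi passenger hotspots. The type of each hotspot area is determined from the POI types, taxi travel demand at different times is analyzed, and fixed taxi stopping areas are then delineated. The results show that the placement of fixed taxi stopping areas is related to travelers' demand: placing them in areas of high travel demand can meet travelers' varied needs. The method of delineating fixed stopping areas by combining taxi passenger hotspots with crawled POI data is highly practical and can offer theoretical and practical value for urban traffic safety.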
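Abstract: A series of nanowires (NWs) with spherical hollow structures were constructed, and molecular dynamics (MD) was used to simulate tensile deformation for 300 samples with different initial states per model. The density-based spatial clustering of applications with noise (DBSCAN) machine learning algorithm was used to locate the initial slip planes. Based on large-sample statistics, the correlation between the distribution of initial slip positions and the distribution of fracture positions was analyzed. The results show that when the internal hollow radius is small, the fracture position distribution forms during the plastic deformation stage, and there is no significant correlation between the initial slip distribution and the fracture position distribution. For NWs with a large hollow radius and pronounced brittle character, however, slip planes induced by the high-energy inner surface accumulate rapidly, producing necking and leading to final fracture. Therefore, once the internal hollow structure reaches a certain size, there is a clear causal relationship between the distribution of initial slip positions and the distribution of final fracture positions.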
Funding: Funded by the National Natural Science Foundation of China (42071014).
Abstract: The Gobi spans a large area of China, surpassing the combined expanse of mobile dunes and semi-fixed dunes. Its presence significantly influences the movement of sand and dust. However, the complex origins and diverse materials constituting the Gobi result in notable differences in saltation processes across various Gobi surfaces, and it is challenging to describe these processes according to a uniform morphology. It is therefore necessary to describe surface characteristics through parameters such as the three-dimensional (3D) size and shape of gravel. Collecting morphology information for Gobi gravels is essential for studying the Gobi's genesis and sand saltation. To enhance the efficiency and information yield of gravel parameter measurements, this study conducted field experiments in the Gobi region across Dunhuang City, Guazhou County, and Yumen City (administered by Jiuquan City), Gansu Province, China, in March 2023. A research framework and methodology for measuring 3D parameters of gravel using point clouds were developed, alongside improved calculation formulas for 3D parameters including gravel grain size, volume, flatness, roundness, sphericity, and equivalent grain size. Using multi-view geometry for 3D reconstruction made it possible to establish an optimal data acquisition scheme characterized by high point cloud reconstruction efficiency and clear quality. The proposed methodology also incorporated point cloud clustering, segmentation, and filtering techniques to isolate individual gravel point clouds. Point cloud algorithms, including the Oriented Bounding Box (OBB), the point cloud slicing method, and point cloud triangulation, were then used to calculate the 3D parameters of individual gravels. These systematic processes allow precise and detailed characterization of individual gravels. For gravel grain size and volume, the correlation coefficients between point cloud and manual measurements all exceeded 0.9000, confirming the feasibility of the proposed methodology for measuring 3D parameters of individual gravels. The proposed workflow yields accurate calculations of relevant parameters for Gobi gravels, providing essential data support for subsequent studies on Gobi environments.
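Two of the parameters listed, equivalent grain size and sphericity, have widely used textbook definitions that can be computed directly from a gravel's volume and surface area. The abstract refers to improved formulas without stating them, so the sketch below uses the standard definitions instead (equal-volume sphere diameter and Wadell sphericity) and should not be read as the paper's method. The function names are hypothetical, and the volume and surface area are assumed to come from an upstream step such as point cloud triangulation.

```python
import math

def equivalent_diameter(volume):
    """Diameter of the sphere with the same volume: d = (6V / pi)^(1/3)."""
    return (6.0 * volume / math.pi) ** (1.0 / 3.0)

def wadell_sphericity(volume, surface_area):
    """Surface area of the equal-volume sphere divided by the actual surface area.

    Equals 1 for a perfect sphere and is smaller for less spherical shapes.
    """
    return math.pi ** (1.0 / 3.0) * (6.0 * volume) ** (2.0 / 3.0) / surface_area
```

As a sanity check on the definitions, a unit cube (volume 1, surface area 6) has a sphericity of (π/6)^(1/3), roughly 0.806.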