Funding: Supported by the National Key Research and Development Program of China (No. 2022YFB3304400), the National Natural Science Foundation of China (Nos. 6230311, 62303111, 62076060, 61932007, and 62176083), and the Key Research and Development Program of Jiangsu Province of China (No. BE2022157).
Abstract: Traditional Fuzzy C-Means (FCM) and Possibilistic C-Means (PCM) clustering algorithms are data-driven, and their objective function minimization process is based on the available numeric data. Recently, knowledge hints have been introduced to form knowledge-driven clustering algorithms, which reveal a data structure that considers not only the relationships between data but also the compatibility with knowledge hints. However, these algorithms cannot determine the optimal number of clusters by themselves; they require the assistance of evaluation indices. Moreover, knowledge hints are usually used as part of the data structure (directly replacing some clustering centers), which severely limits the flexibility of the algorithm and can lead to knowledge misguidance. To solve this problem, this study designs a new knowledge-driven clustering algorithm called PCM clustering with High-density Points (HP-PCM), in which domain knowledge is represented in the form of so-called high-density points. First, a new data density calculation function is proposed. The Density Knowledge Points Extraction (DKPE) method is established to filter out high-density points from the dataset to form knowledge hints. Then, these hints are incorporated into the PCM objective function so that the clustering algorithm is guided by high-density points to discover the natural data structure. Finally, the initial number of clusters is set to be greater than the true one based on the number of knowledge hints, and the HP-PCM algorithm automatically determines the final number of clusters during the clustering process through a cluster elimination mechanism. Experimental studies, including comparative analyses, highlight the effectiveness of the proposed algorithm, such as the increased clustering success rate, the ability to determine the optimal cluster number, and the faster convergence speed.
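As background for the PCM objective mentioned above, the classical PCM typicality and center updates can be sketched as follows. This is the standard PCM formulation, not the HP-PCM objective, which the abstract does not spell out; the function names, the scale parameter `eta`, and the fuzzifier `m` are illustrative assumptions.

```python
def pcm_typicality(sq_dist, eta, m=2.0):
    """Classical PCM typicality: u = 1 / (1 + (d^2 / eta)^(1 / (m - 1))).

    sq_dist: squared distance from a point to a cluster center.
    eta:     positive scale parameter of that cluster (assumed given).
    m:       fuzzifier, must be greater than 1.
    """
    if eta <= 0 or m <= 1:
        raise ValueError("eta must be positive and m must exceed 1")
    return 1.0 / (1.0 + (sq_dist / eta) ** (1.0 / (m - 1.0)))


def update_center(points, typicalities, m=2.0):
    """Center update: mean of the points weighted by u^m."""
    weights = [u ** m for u in typicalities]
    total = sum(weights)
    dim = len(points[0])
    return [
        sum(w * p[d] for w, p in zip(weights, points)) / total
        for d in range(dim)
    ]


# A point sitting on the center is fully typical; typicality decays with distance.
print(pcm_typicality(0.0, eta=1.0))
print(pcm_typicality(4.0, eta=1.0))
```

Unlike FCM memberships, these typicalities are not forced to sum to one across clusters, which is why a cluster that attracts few typical points can be dropped without disturbing the others.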
Funding: Funded by the National Natural Science Foundation of China (Grant Nos. 42272333, 42277147).
Abstract: Refined 3D modeling of mine slopes is pivotal for precise prediction of geological hazards. Because existing single modeling methods cannot comprehensively represent both the overall and the localized characteristics of mining slopes, this study introduces a new method that fuses model data from unmanned aerial vehicle (UAV) tilt photogrammetry and 3D laser scanning through a data alignment algorithm based on control points. First, the mini-batch K-Medoids algorithm is used to cluster the point cloud data from ground 3D laser scanning. Then, the elbow rule is applied to determine the optimal cluster number (K0), and the feature points are extracted. Next, the nearest neighbor point algorithm is employed to match the feature points obtained from UAV tilt photogrammetry, and the internal point coordinates are adjusted through a distance-weighted average to construct a 3D model. Finally, in an engineering case study, the K0 value is determined to be 8, with a matching accuracy between the two model datasets ranging from 0.0669 to 1.0373 mm. Compared with the modeling method based on the K-Medoids clustering algorithm, the new method significantly enhances the computational efficiency, the accuracy of selecting the optimal number of feature points in 3D laser scanning, and the precision of the 3D model derived from UAV tilt photogrammetry. This method provides a research foundation for constructing mine slope models.
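The elbow rule mentioned above can be illustrated with a minimal sketch. It assumes the clustering cost has already been computed for consecutive values of K and picks the point farthest from the chord joining the first and last cost, which is one common way to locate the elbow; the abstract does not say which variant the authors use, and the cost values below are made up for illustration.

```python
def elbow_k(costs, k_start=1):
    """Return the K at the elbow of a decreasing cost curve.

    costs[i] is the clustering cost for K = k_start + i. Both axes are
    rescaled to [0, 1]; the elbow is the point with the largest distance
    to the straight line joining the first and last points.
    """
    n = len(costs)
    if n < 3:
        raise ValueError("need at least three cost values")
    span = costs[0] - costs[-1]
    if span <= 0:
        raise ValueError("costs are expected to decrease overall")
    best_i, best_gap = 0, -1.0
    for i, cost in enumerate(costs):
        x = i / (n - 1)                  # rescaled K
        y = (cost - costs[-1]) / span    # rescaled cost
        gap = abs(x + y - 1.0)           # proportional to distance from the chord
        if gap > best_gap:
            best_i, best_gap = i, gap
    return k_start + best_i


# Hypothetical costs for K = 1..6 (not taken from the paper).
print(elbow_k([100.0, 40.0, 15.0, 12.0, 10.0, 9.0]))
```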
Funding: This paper is supported in part by the National Natural Science Foundation of China (61701322), the Young and Middle-aged Science and Technology Innovation Talent Support Plan of Shenyang (RC190026), the Natural Science Foundation of Liaoning Province (2020-MS-237), and the Liaoning Provincial Department of Education Science Foundation (JYT19052).
Abstract: Flying Ad hoc Network (FANET) has drawn significant attention due to its rapid advancement and extensive use in civil applications. However, the characteristics of FANET, including high mobility, limited resources, and distributed nature, make it challenging to develop a secure and efficient routing scheme. To overcome these challenges, this paper proposes a novel cluster-based secure routing scheme that aims to solve the routing and data security problems of FANET. In this scheme, the optimal cluster head is selected based on residual energy, online time, reputation, blockchain transactions, mobility, and connectivity by using Improved Artificial Bee Colony Optimization (IABC). The proposed IABC uses two different search equations for the employed bees and the onlooker bees to enhance the convergence rate and exploitation ability. Further, a lightweight blockchain consensus algorithm, the AI-Proof of Witness Consensus Algorithm (AI-PoWCA), is proposed, which uses the optimal cluster head for mining. AI-PoWCA also introduces the concept of a witness for block verification to make the proposed scheme resource-efficient and highly resilient against 51% attacks. Simulation results demonstrate that the proposed scheme outperforms its counterparts, achieving a packet delivery ratio of up to 90%, the lowest end-to-end delay, the highest throughput, resilience against security attacks, and superior block processing time.
Funding: Supported by the National Natural Science Foundation of China (70871015, 71031002, 71171030).
Abstract: This paper introduces niching particle swarm optimization (nichePSO) into clustering analysis and puts forward a clustering algorithm that uses nichePSO to optimize density functions. Firstly, this paper improves the main swarm training models and increases their space-searching ability. Secondly, the radius of the sub-swarms is defined adaptively according to the actual clustering problem, which helps the niches form and search. Finally, a novel method that assigns samples to the corresponding cluster is proposed. Numerical results illustrate that this algorithm, based on the density function and nichePSO, can cluster datasets of unbalanced density into the correct clusters automatically and accurately.
Funding: Supported by the National Natural Science Foundation of China (Grant No. 661403234) and the Shandong Provincial Science and Technology Development Plan of China (Grant No. 2014GGX106009).
Abstract: The current mathematical models for the storage assignment problem are generally based on the traveling salesman problem (TSP), which has been widely applied in the conventional automated storage and retrieval system (AS/RS). However, the mathematical models for conventional AS/RS do not match multi-tier shuttle warehousing systems (MSWS), because parallel retrieval in multiple tiers and progressive vertical movement destroy the foundation of TSP. In this study, a two-stage open queuing network model, in which shuttles and a lift are regarded as servers at different stages, is proposed to analyze system performance in terms of the shuttle waiting period (SWP) and the lift idle period (LIP) during the transaction cycle time. A mean arrival time difference matrix for pairwise stock keeping units (SKUs) is presented to determine the mean waiting time and queue length, so that the storage assignment problem can be optimized on the basis of SKU correlation. The decomposition method is applied to analyze the interactions among outbound task time, SWP, and LIP. An ant colony clustering algorithm is designed to determine storage partitions by clustering items. In addition, goods are assigned for storage according to the rearranged permutation and the combination of storage partitions in a 2D plane. This combination is derived from the analysis results of the queuing network model and from three basic principles. The storage assignment method and its overall optimization algorithm as applied in an MSWS are verified through a practical engineering project conducted in the tobacco industry. The application results show that the total SWP and LIP can be reduced effectively, improving the utilization rates of all devices and increasing the throughput of the distribution center.
Abstract: The paper studies an improved K-means algorithm and establishes indicators to classify customers according to the RFM model. Experimental results show that the new algorithm has good convergence and stability and performs better than using the FKP algorithms alone for clustering. Finally, the paper studies the application of clustering to customer segmentation in a mobile communication enterprise. It discusses the basic theory, the customer segmentation methods and steps, and the customer segmentation model based on consumption behavior psychology; the segmentation model is successfully applied to marketing decision support.
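To make the RFM model concrete, the sketch below derives the three usual RFM indicators (recency, frequency, monetary value) per customer from a transaction list; such features would then be fed to a clustering algorithm such as K-means. The record layout `(customer_id, day, amount)` and the function name are assumptions for illustration, since the abstract does not describe the paper's actual indicators.

```python
def rfm_features(transactions, today):
    """Compute (recency, frequency, monetary) per customer.

    transactions: iterable of (customer_id, day, amount), where day is an
                  integer day index (hypothetical layout).
    today:        reference day index used to measure recency.
    """
    stats = {}
    for customer, day, amount in transactions:
        last_day, count, spent = stats.get(customer, (day, 0, 0.0))
        stats[customer] = (max(last_day, day), count + 1, spent + amount)
    return {
        customer: (today - last_day, count, spent)
        for customer, (last_day, count, spent) in stats.items()
    }


# Made-up transactions for two customers.
records = [("a", 1, 10.0), ("a", 8, 5.0), ("b", 3, 20.0)]
print(rfm_features(records, today=10))
```

Because the three indicators live on very different scales, they are usually standardized before distance-based clustering.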
Funding: Provided by the Marine S&T Fund of Shandong Province for the Pilot National Laboratory for Marine Science and Technology (Qingdao) (No. 2018SDKJ0501-2).
Abstract: Clustering is a group of unsupervised statistical techniques commonly used in many disciplines. When these techniques are applied to fish abundance data, many technical details need to be considered to ensure reasonable interpretation. However, the reliability and stability of clustering methods have rarely been studied in the context of fisheries. This study presents an intensive evaluation of three common clustering methods, namely hierarchical clustering (HC), K-means (KM), and expectation-maximization (EM), based on fish community surveys in the coastal waters of Shandong, China. We evaluated the performance of these three methods under different numbers of clusters, data sizes, and data transformation approaches, focusing on consistency validation using the average proportion of non-overlap (APN) index. The results indicate that the three methods tend to disagree on the optimal number of clusters. EM was relatively better at avoiding unbalanced classification, whereas HC and KM provided more stable clustering results. Data transformations, including scaling, square-root, and log transformation, had substantial influences on the clustering results, especially for KM. Moreover, transformation also influenced clustering stability, with scaling tending to provide a stable solution at the same number of clusters. The APN values indicated improved stability with increasing data size; the effect generally leveled off above 70 samples and did so most quickly for EM. We conclude that the best clustering method can be chosen depending on the aim of the study and the number of clusters. In general, KM was relatively robust in our tests. We also provide recommendations for future applications of clustering analyses. This study helps ensure the credibility of the application and interpretation of clustering methods.
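The APN index used above can be sketched in a few lines. The version below follows the commonly used definition (as in the clValid R package): for each observation, compare the cluster it falls in when all variables are used with the cluster it falls in when one variable is removed, and average the proportion of the original cluster that is lost. The function takes precomputed label vectors, so any clustering method can supply them; the function name and input layout are assumptions.

```python
def apn(full_labels, reduced_labelings):
    """Average proportion of non-overlap (APN), in [0, 1]; lower is more stable.

    full_labels:       cluster label of each observation using all variables.
    reduced_labelings: one label vector per removed variable, giving the
                       clustering obtained without that variable.
    """
    n = len(full_labels)
    total = 0.0
    for reduced in reduced_labelings:
        for i in range(n):
            full_cluster = {j for j in range(n) if full_labels[j] == full_labels[i]}
            reduced_cluster = {j for j in range(n) if reduced[j] == reduced[i]}
            overlap = len(full_cluster & reduced_cluster)
            total += 1.0 - overlap / len(full_cluster)
    return total / (n * len(reduced_labelings))


# Same partition with renamed labels: perfectly stable.
print(apn([0, 0, 1, 1], [[1, 1, 0, 0]]))
# A partition that splits every original cluster in half.
print(apn([0, 0, 1, 1], [[0, 1, 0, 1]]))
```

The index compares set membership rather than label values, so relabeled but identical partitions score 0.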
Funding: Supported by the National Key Research and Development Program of China (No. 2016YFB0201305), the National Science and Technology Major Project (No. 2013ZX0102-8001-001-001), and the National Natural Science Foundation of China (Nos. 91430218, 31327901, 61472395, 61272134, 61432018).
Abstract: Clustering data with varying densities and complicated structures is important, yet many existing clustering algorithms struggle with this problem, because varying densities and complicated structures make a single algorithm perform badly on different parts of the data. On the assumption that denser parts probably carry more information, an algorithm that clusters starting from the high-density parts is proposed. It begins from a tiny distance to find the highest density-connected partition and form the corresponding super cores; the distance is then iteratively increased by a global heuristic method to cluster parts with different densities. The mean silhouette coefficient indicates the clustering performance. A denoising function is implemented to eliminate the influence of noise and outliers. Many challenging experiments indicate that the algorithm performs well on data with widely varying densities and extremely complex structures. It decides the optimal number of clusters automatically. Background knowledge is not needed, and parameter tuning is easy. It is robust against noise and outliers.
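The mean silhouette coefficient used above as the performance indicator can be computed directly from points and labels. The sketch below uses the standard definition with Euclidean distance and gives singleton clusters a silhouette of 0; it is an illustration of the metric only, not of the paper's clustering algorithm, and the sample points are made up.

```python
import math


def mean_silhouette(points, labels):
    """Mean silhouette coefficient over all points (Euclidean distance).

    For point i: a = mean distance to the other members of its cluster,
    b = smallest mean distance to the members of any other cluster,
    s = (b - a) / max(a, b). Points in singleton clusters get s = 0.
    """
    clusters = {}
    for index, label in enumerate(labels):
        clusters.setdefault(label, []).append(index)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    total = 0.0
    for i, point in enumerate(points):
        own = clusters[labels[i]]
        if len(own) == 1:
            continue  # singleton cluster contributes 0
        a = sum(math.dist(point, points[j]) for j in own if j != i) / (len(own) - 1)
        b = min(
            sum(math.dist(point, points[j]) for j in members) / len(members)
            for label, members in clusters.items()
            if label != labels[i]
        )
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


pts = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
print(mean_silhouette(pts, [0, 0, 1, 1]))  # compact, well-separated clusters
print(mean_silhouette(pts, [0, 1, 0, 1]))  # clusters that cut across the gap
```

Values close to 1 indicate compact, well-separated clusters, while negative values suggest that points sit closer to another cluster than to their own.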
Abstract: We propose a novel scheme based on clustering analysis in color space to solve text segmentation in complex color images. Text segmentation includes automatic clustering of the color space and foreground image generation. Two methods are proposed for automatic clustering: the first determines the optimal number of clusters, and the second is a fuzzy competitive clustering method based on competitive learning techniques. Essential foreground images obtained from any of the color clusters are combined into foreground images. Further performance analysis reveals the advantages of the proposed methods.
Funding: Supported in part by the Joint Fund of Science and Technology Department of Liaoning Province and State Key Laboratory of Robotics, China under Grant 2021-KF-22-08, in part by the Basic Research Program of Science and Technology of Shenzhen, China under Grant JCYJ20190809161805508, and in part by the National Natural Science Foundation of China under Grant 62271423 and Grant 41976178.
Abstract: Because each cluster head (CH) sensor node aggregates, fuses, and forwards data from different sensor nodes in an underwater acoustic sensor network (UASN), guaranteeing data security in a CH is critical. In this paper, a cooperative security monitoring mechanism aided by multiple slave cluster heads (SCHs) is proposed to keep track of the data security of a CH. By designing a low-complexity "equilateral triangle algorithm (ETA)", the optimal SCHs (named ETA-based multiple SCHs) are selected from the candidate SCHs so as to improve the dispersion and coverage of the SCHs and achieve large-scale data security monitoring. In addition, by analyzing the entire monitoring process, a closed-form expression is derived for the probability that the SCHs fail to identify an attack, as a function of the probability that ordinary nodes launch an attack. The simulation results show that the proposed optimal ETA-based multiple-SCH cooperation scheme has a lower probability of failed attack identification than the existing schemes. In addition, the numerical simulation results are consistent with the theoretical analysis, thus verifying the effectiveness of the proposed scheme.
Funding: Supported by the National Natural Science Foundation of China (No. 60502028, No. 90204008).
Abstract: Spatial clustering is widely used in many fields, such as WSN (Wireless Sensor Networks), web clustering, and remote sensing, to discover groups and identify interesting distributions in the underlying database. By discussing the relationship between the optimal clustering and the initial seeds, a clustering validity index and a principle for seeking initial seeds are proposed, and on this principle we recommend an initial seed-seeking strategy: SSPG (Single-Shortest-Path Graph). When the SSPG strategy is used in clustering algorithms, we find that the clustering result is more likely to be optimal. At the end of the paper, based on the theory of combinatorial optimization, a method is proposed to obtain an optimal reference value k for the number of clusters, and it is shown to be efficient.
Funding: Supported by the National Natural Science Foundation of China (61079013, 61079014, 61403198), the National Natural Science Funds and Civil Aviation Mutual Funds (U1533128, U1233114), the Programs of Natural Science Foundation of China and China Civil Aviation Joint Fund (60939003), and the Natural Science Foundation of Jiangsu Province in China (BK2011737).
Abstract: Combining multiple tasks into an optimal work package is important for reducing cost in aircraft maintenance decision-making, so a cost rate model of combinatorial maintenance is urgently needed. However, finding the optimal combination under various constraints not only involves numerical calculations but is also an NP-hard combinatorial problem. To solve the problem, an adaptive genetic algorithm based on cluster search, divided into two phases, is put forward. In the first phase, individuals are scattered homogeneously over the whole solution space according to density through crossover and mutation, and the better individuals are collected as candidate cluster centres. In the second phase, the search is confined to the neighbourhood of selected candidate solutions so that they can be refined accurately while the cluster radius decreases slowly; meanwhile, all clusters continuously move to better regions until all the peaks in the problem space have been searched. This algorithm can efficiently solve the combination problem. Taking the optimization of aircraft maintenance decision-making as an example, maintenance that combines multiple parts or tasks can significantly enhance economic benefit when the halt cost is rather high.
Funding: This work was supported by the National Natural Science Foundation of China (Grant Nos. 69872003 and 40035010).
Abstract: This paper studies the upper bound of the optimal number of clusters in clustering algorithms. A new method is proposed to address this issue. The method shows that the rule c_max ≤ √N, which is popular in current papers, is reasonable in some sense. This conclusion is tested and analyzed on typical examples from the literature, which demonstrates the validity of the new method.
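Reading the bound in its usual rule-of-thumb form, c_max ≤ √N with N the number of samples, the range of cluster counts worth evaluating with a validity index can be sketched as below. The function name and the lower limit of 2 clusters are illustrative assumptions, not part of the abstract.

```python
import math


def candidate_cluster_counts(n_samples):
    """Cluster counts c with 2 <= c <= floor(sqrt(n_samples))."""
    c_max = math.isqrt(n_samples)  # integer floor of the square root
    return list(range(2, c_max + 1))


# For a 150-sample dataset the search would cover c = 2..12.
print(candidate_cluster_counts(150))
```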