Although many multi-view clustering (MVC) algorithms with acceptable performance have been presented, to the best of our knowledge, nearly all of them need to be fed the correct number of clusters. In addition, these existing algorithms create only hard and fuzzy partitions for multi-view objects, which are often located in highly overlapping areas of the multi-view feature space. The adoption of hard and fuzzy partitions ignores the ambiguity and uncertainty in the assignment of objects, likely leading to performance degradation. To address these issues, we propose a novel sparse reconstructive multi-view evidential clustering algorithm (SRMVEC). Based on a sparse reconstructive procedure, SRMVEC learns a shared affinity matrix across views and maps multi-view objects to a two-dimensional human-readable chart by calculating two newly defined mathematical metrics for each object. From this chart, users can detect the number of clusters and select several objects existing in the dataset as cluster centers. Then, SRMVEC derives a credal partition under the framework of evidence theory, improving the fault tolerance of clustering. Ablation studies show the benefits of adopting the sparse reconstructive procedure and evidence theory. In addition, SRMVEC outperforms some state-of-the-art methods on benchmark datasets.
Deep multi-view subspace clustering (DMVSC) based on self-expression has attracted increasing attention due to its outstanding performance and nonlinear application. However, most existing methods neglect that view-private meaningless information or noise may interfere with the learning of self-expression, which may degrade clustering performance. In this paper, we propose a novel framework of Contrastive Consistency and Attentive Complementarity (CCAC) for DMVSC. CCAC aligns the self-expressions of all views and fuses them based on their discrimination, so that it can effectively explore consistent and complementary information to achieve precise clustering. Specifically, the view-specific self-expression is learned by a self-expression layer embedded into the auto-encoder network for each view. To guarantee consistency across views and reduce the effect of view-private information or noise, we align all the view-specific self-expressions by contrastive learning. The aligned self-expressions are assigned adaptive weights by a channel attention mechanism according to their discrimination. They are then fused by a convolution kernel to obtain a consensus self-expression with maximum complementarity across the views. Extensive experimental results on four benchmark datasets and one large-scale dataset show that the CCAC method outperforms other state-of-the-art methods, demonstrating its clustering effectiveness.
The observation error model of an underwater acoustic positioning system is an important factor influencing the positioning accuracy of the underwater target. Given the position inconsistency error caused by treating the underwater target as a mass point, as well as the observation system error, the traditional error model best estimation trajectory (EMBET), with little observed data and too many parameters, can lead to an ill-conditioned parameter model. In this paper, a multi-station fusion system error model based on the optimal polynomial constraint is constructed, and the corresponding observation system error identification based on improved spectral clustering is designed. Firstly, reduced-parameter unified modeling of the underwater target position parameters and the system error is achieved through polynomial optimization. Then a multi-station undirected graph network is established, which addresses the problem of inaccurate identification of the system errors. Moreover, the similarity matrix of the spectral clustering is improved, and iterative identification of the system errors based on the improved spectral clustering is proposed. Finally, comprehensive measured data from a long-baseline lake test and a sea test show that the proposed method can accurately identify the system errors and improve the positioning accuracy for the underwater target.
Multi-view Subspace Clustering (MVSC) is an advanced clustering method designed to integrate diverse views to uncover a common subspace, enhancing the accuracy and robustness of clustering results. The low-rank prior is significant in MVSC because it captures the global data structure across views, which improves performance. However, it is sensitive to outliers because it relies on the Frobenius norm for error measurement. To address this, our paper proposes a Low-Rank Multi-view Subspace Clustering Based on Sparse Regularization (LMVSC-Sparse) approach. Sparse regularization helps select the most relevant features or views for clustering while ignoring irrelevant or noisy ones. This leads to a more efficient and effective representation of the data, improving clustering accuracy and robustness, especially in the presence of outliers or noisy data. By incorporating sparse regularization, LMVSC-Sparse can effectively handle outlier sensitivity, which is a common challenge in traditional MVSC methods relying solely on low-rank priors. The Alternating Direction Method of Multipliers (ADMM) algorithm is then employed to solve the proposed optimization problem. Our comprehensive experiments demonstrate the efficiency and effectiveness of LMVSC-Sparse, offering a robust alternative to traditional MVSC methods.
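The abstract above does not say which sparsity penalty LMVSC-Sparse uses. As a minimal sketch, assuming an l1 penalty of weight `lam` (an assumption, not the paper's stated formulation), the sparse sub-problem inside an ADMM iteration reduces to element-wise soft-thresholding. The function name `soft_threshold` is hypothetical.

```python
def soft_threshold(values, lam):
    """Proximal operator of lam * ||x||_1, applied element-wise.

    Assumption: the sparse regulariser is a plain l1 norm. Entries whose
    magnitude is at most lam are set to zero; the rest shrink toward zero.
    """
    out = []
    for v in values:
        if v > lam:
            out.append(v - lam)
        elif v < -lam:
            out.append(v + lam)
        else:
            out.append(0.0)
    return out


# Small entries (possible noise) vanish, large entries are kept but shrunk.
shrunk = soft_threshold([3.0, -0.2, 0.5, -2.0], lam=0.5)
print(shrunk)  # [2.5, 0.0, 0.0, -1.5]
```

Larger values of `lam` zero out more entries, which is the mechanism by which an l1 term suppresses small, noisy coefficients.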
A new algorithm for clustering multiple data streams is proposed. The algorithm can effectively cluster data streams that show similar behavior with some unknown time delays. The algorithm uses the autoregressive (AR) modeling technique to measure correlations between data streams. It exploits estimated frequency spectra to extract the essential features of streams. Each stream is represented as the sum of spectral components, and the correlation is measured component-wise. Each spectral component is described by four parameters, namely, amplitude, phase, damping rate and frequency. The ε-lag-correlation between two spectral components is calculated. The algorithm uses such information as similarity measures in clustering data streams. Based on a sliding window model, the algorithm can continuously report the most recent clustering results and adjust the number of clusters. Experiments on real and synthetic streams show that the proposed clustering method has a higher speed and clustering quality than other similar methods.
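To make the idea of correlation under an unknown delay concrete, here is a minimal sketch that searches over integer lags for the best Pearson correlation between two raw series. This is a simplified stand-in: the paper measures ε-lag-correlation between AR-derived spectral components, not between raw samples. The names `pearson` and `best_lag_correlation` are hypothetical.

```python
import math


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def best_lag_correlation(x, y, max_lag):
    """Return (lag, corr) maximising the correlation of x[t] with y[t + lag]."""
    best = (0, -2.0)
    for lag in range(max_lag + 1):
        corr = pearson(x[: len(x) - lag], y[lag:])
        if corr > best[1]:
            best = (lag, corr)
    return best


# y is x delayed by 7 samples, so the best lag should be 7.
x = [math.sin(0.3 * t) for t in range(200)]
y = [0.0] * 7 + x[:-7]
lag, corr = best_lag_correlation(x, y, max_lag=20)
print(lag, round(corr, 6))
```

A lag-tolerant similarity like this could then feed any clustering routine, so that two streams showing the same behavior at different delays are still grouped together.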
Vehicles can establish collaborative environment cognition by sharing original or processed sensor data from vehicular sensors and the status map. Clustering in the vehicular ad-hoc network (VANET) is crucial for enhancing the stability of the collaborative environment. In this paper, the clustering problem is transformed into a graph-cut problem. A novel clustering algorithm based on the spectral clustering algorithm and an improved force-directed algorithm is designed. It takes the average lifetime of all clusters as the optimization goal so that the stability of the entire system can be enhanced. A series of close-to-practical scenarios are generated by the Simulation of Urban Mobility (SUMO). The numerical results indicate that our approach has superior performance in maintaining whole-cluster stability.
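Recasting clustering as a graph cut can be illustrated with the simplest spectral method: split a weighted graph by the sign of the Fiedler vector, the eigenvector of the second-smallest eigenvalue of the Laplacian L = D - W. The sketch below is generic two-way spectral partitioning on a toy graph, not the paper's VANET algorithm. It assumes a connected graph with symmetric non-negative weights. The function name `fiedler_bipartition` is hypothetical, and a shifted power iteration stands in for a library eigen-solver.

```python
def fiedler_bipartition(weights, iters=500):
    """Split a connected weighted graph in two using the sign of the Fiedler vector.

    Power iteration is run on (shift * I - L), whose dominant eigenvector,
    once the constant vector is projected out, is the Fiedler vector of L.
    """
    n = len(weights)
    degree = [sum(row) for row in weights]
    shift = 2.0 * max(degree)  # upper bound on the largest eigenvalue of L
    v = [float(i) for i in range(n)]  # deterministic, non-constant start
    for _ in range(iters):
        mean = sum(v) / n
        v = [x - mean for x in v]  # project out the constant eigenvector
        w = [
            (shift - degree[i]) * v[i]
            + sum(weights[i][j] * v[j] for j in range(n))
            for i in range(n)
        ]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return [0 if x < 0 else 1 for x in v]


# Toy graph: two triangles (nodes 0-2 and 3-5) joined by one weak edge 2-3.
W = [[0.0] * 6 for _ in range(6)]
for i, j, wt in [(0, 1, 1), (0, 2, 1), (1, 2, 1),
                 (3, 4, 1), (3, 5, 1), (4, 5, 1), (2, 3, 0.1)]:
    W[i][j] = W[j][i] = float(wt)

labels = fiedler_bipartition(W)
print(labels)  # [0, 0, 0, 1, 1, 1]
```

The split falls on the weak edge, which is the behavior a cut-based formulation aims for: strongly connected nodes stay together.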
Traditional unsupervised seismic facies analysis techniques need to assume that seismic data obey a mixed Gaussian distribution. However, field seismic data may not meet this condition, thereby leading to wrong classification in the application of this technology. This paper introduces a spectral clustering technique for unsupervised seismic facies analysis. This algorithm clusters the data based on the idea of a graph. Its core idea is that seismic data are regarded as points in space, and the points are connected by edges to construct graphs. When the graphs are divided, the weights of the edges between different subgraphs should be as low as possible, whereas the weights of the inner edges of each subgraph should be as high as possible. However, the spectral clustering algorithm has high computational complexity and entails large memory consumption. To solve this problem, this paper introduces the idea of sparse representation into spectral clustering. Through the selection of a small number of local sparse representation points, the spectral clustering matrix of all sample points is approximately represented to reduce the cost of the spectral clustering operation. Verification on a physical model and field data shows that the proposed approach can obtain more accurate seismic facies classification results without assuming that the data satisfy any distributional hypothesis. The computing efficiency of this new method is better than that of the conventional spectral clustering method, thereby meeting the application needs of field seismic data.
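The partition criterion described above (low weight between subgraphs, high weight inside them) is what the normalized cut measures. The following sketch evaluates the cut weight and the normalized cut of a given labelling on a toy graph. The function names and the toy weights are illustrative assumptions, not taken from the paper.

```python
def cut_weight(weights, labels):
    """Total weight of edges whose endpoints carry different labels."""
    n = len(weights)
    return sum(
        weights[i][j]
        for i in range(n)
        for j in range(i + 1, n)
        if labels[i] != labels[j]
    )


def normalized_cut(weights, labels):
    """Sum over clusters of cut(cluster, rest) / volume(cluster)."""
    n = len(weights)
    total = 0.0
    for c in set(labels):
        inside = [i for i in range(n) if labels[i] == c]
        outside = [j for j in range(n) if labels[j] != c]
        cut = sum(weights[i][j] for i in inside for j in outside)
        volume = sum(sum(weights[i]) for i in inside)
        total += cut / volume
    return total


# Toy graph: two tight triangles joined by one weak edge 2-3.
W = [[0.0] * 6 for _ in range(6)]
for i, j, wt in [(0, 1, 1), (0, 2, 1), (1, 2, 1),
                 (3, 4, 1), (3, 5, 1), (4, 5, 1), (2, 3, 0.1)]:
    W[i][j] = W[j][i] = float(wt)

good = [0, 0, 0, 1, 1, 1]  # cuts only the weak edge
bad = [0, 0, 1, 1, 1, 1]   # cuts through a triangle
print(normalized_cut(W, good), normalized_cut(W, bad))
```

The labelling that severs only the weak edge scores a much lower normalized cut than the one that splits a triangle, matching the criterion stated in the abstract.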
The similarity measure is crucial to the performance of spectral clustering. The Gaussian kernel function based on the Euclidean distance is usually adopted as the similarity measure. However, the Euclidean distance measure cannot fully reveal complex data distributions, and the result of spectral clustering is very sensitive to the scaling parameter. To solve these problems, a new manifold distance measure and a novel simulated annealing spectral clustering (SASC) algorithm based on the manifold distance measure are proposed. The simulated annealing based on genetic algorithm (SAGA), characterized by its rapid convergence to the global optimum, is used to cluster the sample points in the spectral mapping space. The proposed algorithm can not only reflect local and global consistency better, but also reduce the sensitivity of spectral clustering to the kernel parameter, which improves the algorithm's clustering performance. To efficiently apply the algorithm to image segmentation, the Nyström method is used to reduce the computational complexity. Experimental results show that, compared with traditional clustering algorithms and popular spectral clustering algorithms, the proposed algorithm achieves better clustering performance on several synthetic datasets, texture images and real images.
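The sensitivity to the scaling parameter mentioned above can be seen directly from the conventional Gaussian kernel affinity, A[i][j] = exp(-||x_i - x_j||^2 / (2 * sigma^2)). The sketch below builds that baseline affinity matrix (not the manifold distance proposed in the paper) for two values of sigma. The function name and sample points are illustrative assumptions.

```python
import math


def gaussian_affinity(points, sigma):
    """Affinity matrix using a Gaussian kernel on squared Euclidean distance."""
    n = len(points)
    affinity = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            dist_sq = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
            affinity[i][j] = math.exp(-dist_sq / (2.0 * sigma ** 2))
    return affinity


points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
narrow = gaussian_affinity(points, sigma=0.5)
wide = gaussian_affinity(points, sigma=5.0)

# With a narrow kernel even close neighbours look dissimilar; with a wide
# kernel the far-away point still looks fairly similar.
print(round(narrow[0][1], 3), round(wide[0][1], 3))
print(narrow[0][2], round(wide[0][2], 3))
```

Because the resulting graph changes this much with sigma, the clusters found downstream can change as well, which is the motivation the abstract gives for a less parameter-sensitive distance.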
Clustering is one of the most widely used techniques for exploratory data analysis. The spectral clustering algorithm, a popular modern clustering algorithm, has been shown to be more effective in detecting clusters than many traditional algorithms. It has applications ranging from computer vision and information retrieval to social science and biology. With the size of databases soaring, clustering algorithms face scaling problems in computational time and memory use. In this paper, we propose a parallel spectral clustering implementation based on MapReduce. Both the computation and data storage are distributed, which solves the scalability problems of most existing algorithms. We empirically analyze the proposed implementation on both benchmark networks and a real social network dataset of about two million vertices and two billion edges crawled from Sina Weibo. It is shown that the proposed implementation scales well, speeds up the clustering without sacrificing quality, and processes massive datasets efficiently on commodity machine clusters.
Defense techniques for machine learning are critical yet challenging because the number and types of attacks on widely applied machine learning algorithms are increasing significantly. Among these attacks, the poisoning attack, which disturbs machine learning algorithms by injecting poisoning samples, poses the greatest threat. In this paper, we focus on analyzing the characteristics of poisoning samples and propose a novel sample evaluation method, tailored to those characteristics, to defend against the poisoning attack. To capture the intrinsic data characteristics from heterogeneous aspects, we first evaluate training data by multiple criteria, each of which is reformulated from a spectral clustering. Then, we integrate the multiple evaluation scores generated by the multiple criteria through the proposed multiple spectral clustering aggregation (MSCA) method. Finally, we use the unified score as the indicator of poisoning attack samples. Experimental results on intrusion detection data sets show that MSCA significantly outperforms K-means outlier detection in terms of data legality evaluation and poisoning attack detection.
[Objective] The research aimed to study an assessment index system for rainstorm disasters in Fujian Province based on a spectral clustering model with grey correlation analysis. [Method] According to the meteorological disaster yearbook of Fujian Province, and by comprehensively considering the disaster-inducing factor, disaster-inducing environment, disaster-sustaining body and regional disaster-prevention level, an evaluation index system for regional rainstorm disasters in Fujian was established. Using a spectral clustering model based on grey correlation analysis, risk zoning of the rainstorm disaster was conducted for each area of Fujian. Finally, the effect and application of the clustering model were analyzed by case research. [Result] In order to uncover the intrinsic connections among regional characteristics and improve the disaster-prevention linkage performance of the evaluation units, a spectral clustering model based on grey correlation analysis was used to conduct risk zoning of the rainstorm disaster in Fujian Province. Moreover, a combined weight was introduced to judge each evaluation index, so as to adjust the clustering model. By case study, rainstorm disaster levels in 67 counties were obtained. The internal characteristics of each type were analyzed, and the main correlation factors of each type were extracted. The result was compared with the statistical record of rainstorm disasters, verifying the validity and feasibility of the model. [Conclusion] The method was feasible, and its evaluation result had better differentiation and decision accuracy.
This paper proposes a novel phishing web image segmentation algorithm based on improved spectral clustering. Firstly, we construct a set of points composed of the spatial locations and gray levels of pixels from a given image. Secondly, the data are clustered in the spectral space of the similarity matrix of the point set. The K-means algorithm in the conventional spectral clustering method is sensitive to the initial clustering centroids and converges to a local optimal solution. To avoid these drawbacks, we introduce the clone operator, Cauchy mutation to enlarge the scale of the clustering centers, and a quantum-inspired evolutionary algorithm to find the global optimal clustering centroids. Experimental results show that the segmentation performance of our method is much improved compared with phishing web image segmentation based on K-means. Moreover, our method can converge to the global optimal solution and is more accurate in phishing web image segmentation.
The complexity of large-scale network systems made of a large number of nonlinearly interconnected components is a restrictive facet for their modeling and analysis. In this paper, we propose a framework of hierarchical modeling of a complex network system, based on a recursive unsupervised spectral clustering method. The hierarchical model serves the purpose of facilitating the management of complexity in the analysis of real-world critical infrastructures. We exemplify this by referring to the reliability analysis of the 380 kV Italian Power Transmission Network (IPTN). In this work of analysis, the classical component Importance Measures (IMs) of reliability theory have been extended to render them compatible and applicable to a complex distributed network system. By utilizing these extended IMs, the reliability properties of the IPTN system can be evaluated in the framework of the hierarchical system model, with the aim of providing risk managers with information on the risk/safety significance of system structures and components.
In clothing image research, segmenting the clothing quickly and accurately while retaining as much clothing style detail as possible is the basis of subsequent image analysis. The spectral clustering clothing image segmentation algorithm is a common method for clothing image extraction. However, the traditional model requires high computing power, is easily affected by the initial clustering centers, and often falls into local optima. To address these two problems, an improved spectral clustering clothing image segmentation algorithm is proposed in this paper. The Nyström approximation strategy is introduced into the spectral mapping process to reduce the computational complexity. In the clustering stage, this algorithm uses the global optimization advantage of the particle swarm optimization algorithm and selects the sparrow search algorithm to search for the optimal initial clustering points, effectively avoiding local optima. Finally, the effectiveness of this algorithm is verified on clothing images in various environments.
A new fuzzy support vector machine algorithm with dual membership values based on the spectral clustering method is proposed to overcome a shortcoming of the normal support vector machine algorithm, which divides the training datasets into two absolutely exclusive classes in binary classification, ignoring the possibility of an "overlapping" region between the two training classes. The proposed method handles sample "overlap" efficiently with spectral clustering, overcoming the disadvantage of over-fitting and greatly improving data mining efficiency. Simulations provide clear evidence for the new method.
A simple and fast approach based on an eigenvalue similarity metric for polarimetric SAR image segmentation of land cover is proposed in this paper. The approach uses the eigenvalues of the coherency matrix to construct the similarity metric of the clustering algorithm used to segment the SAR image. The Mahalanobis distance is used to measure pairwise similarity between pixels, avoiding the manual scale parameter tuning of previous spectral clustering methods. Furthermore, spatial coherence constraints and a spectral clustering ensemble are employed to stabilize and improve the segmentation performance. All experiments are carried out on three sets of polarimetric SAR data. The experimental results show that the proposed method is superior to the other methods compared.
In order to realize the intelligent mechanization of the last process of the fruit industry chain, the identification of fruit packing boxes is studied. A multi-view database is established to describe the omnidirectional attitudes of the fruit packing boxes. In order to reduce the data redundancy caused by multi-view acquisition, a new binary multi-view kernel principal component analysis network (BMKPCANet) is built, and a multi-view recognition method for fruit packing boxes is proposed based on BMKPCANet and the support vector machine (SVM). The experimental results show that the recognition accuracy of the proposed BMKPCANet is 12.82% higher than that of PCANet and 3.51% higher than that of KPCANet on average. The time consumption of the proposed BMKPCANet is 7.74% lower than that of PCANet and 29.01% lower than that of KPCANet on average. This work lays a theoretical foundation for multi-view recognition of 3D objects and has good practical application value.
The K-multiple-means (KMM) algorithm retains the simplicity and efficiency of the K-means algorithm by setting multiple subclasses, and improves its effect on non-convex data sets. To address the problem that KMM cannot be applied to multi-view data sets on the Internet, a multi-view K-multiple-means (MKMM) clustering method is proposed in this paper. The new algorithm introduces a view weight parameter, retains the design of setting multiple subclasses, takes the number of clusters as a constraint, and obtains clusters by solving an optimization problem. The new algorithm is compared with some popular multi-view clustering algorithms. Its effectiveness is demonstrated through analysis of the experimental results.
Due to the development of e-commerce, the collaborative filtering (CF) recommendation algorithm has become popular in recent years. It has some limitations, such as cold start, data sparseness and low operation efficiency. In this paper, a CF recommendation algorithm based on the latent factor model and improved spectral clustering (CFRALFMISC) is proposed to improve the forecasting precision. The latent factor model is first adopted to predict the missing scores. Then, the cluster validity index is used to determine the number of clusters. Finally, the spectral clustering is improved by using the FCM algorithm to replace K-means in the spectral clustering. The simulation results show that CFRALFMISC can effectively improve the recommendation precision compared with other algorithms.
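Replacing K-means with fuzzy c-means (FCM) means each embedded point receives a graded membership in every cluster instead of one hard label. As a minimal sketch, the standard FCM membership update for fixed centers is u_ik = 1 / sum_j (d_ik / d_ij)^(2 / (m - 1)), where d_ik is the distance from point i to center k and m > 1 is the fuzzifier. The function name, the value m = 2, and the toy points are assumptions. In a spectral pipeline the points would be rows of the spectral embedding.

```python
import math


def fcm_memberships(points, centers, m=2.0):
    """Standard fuzzy c-means membership update for fixed centers (m > 1)."""
    power = 2.0 / (m - 1.0)
    memberships = []
    for p in points:
        dists = [math.dist(p, c) for c in centers]
        if 0.0 in dists:
            # The point sits exactly on one or more centers: share full
            # membership among those centers and give none to the others.
            zeros = dists.count(0.0)
            memberships.append([1.0 / zeros if d == 0.0 else 0.0 for d in dists])
            continue
        memberships.append(
            [1.0 / sum((dk / dj) ** power for dj in dists) for dk in dists]
        )
    return memberships


points = [(0.0, 0.0), (1.0, 0.0), (4.0, 0.0)]
centers = [(0.0, 0.0), (4.0, 0.0)]
u = fcm_memberships(points, centers)
for row in u:
    print([round(x, 3) for x in row])
```

A full FCM loop alternates this update with recomputing each center as a membership-weighted mean, stopping when the memberships change by less than a tolerance.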
Feature selection is very important for obtaining meaningful and interpretable clustering results from a clustering analysis. In the application of soil data clustering, there is a lack of good understanding of how clustering performance responds to different feature subsets. In the present paper, we analyzed the performance differences between k-means, fuzzy c-means, and spectral clustering algorithms under different feature subsets of soil data sets. The experimental results demonstrated that the performance of the spectral clustering algorithm was generally better than that of k-means and fuzzy c-means with different feature subsets. The feature subsets containing environmental attributes improved clustering performance more than those containing spatial attributes and produced more accurate and meaningful clustering results. Our results demonstrated that combining the spectral clustering algorithm with feature subsets containing environmental attributes rather than spatial attributes may be a better choice in applications of soil data clustering.
Funding (SRMVEC multi-view evidential clustering paper): supported in part by an NUS startup grant and the National Natural Science Foundation of China (52076037).
Funding (underwater acoustic positioning paper): This work was supported by the National Natural Science Foundation of China (61903086, 61903366, 62001115), the Natural Science Foundation of Hunan Province (2019JJ50745, 2020JJ4280, 2021JJ40133), and the Fundamentals and Basic of Applications Research Foundation of Guangdong Province (2019A1515110136).
Funding: The National Natural Science Foundation of China (No. 60673060); the Natural Science Foundation of Jiangsu Province (No. BK2005047).
Abstract: A new algorithm for clustering multiple data streams is proposed. The algorithm can effectively cluster data streams which show similar behavior with some unknown time delays. The algorithm uses the autoregressive (AR) modeling technique to measure correlations between data streams. It exploits estimated frequency spectra to extract the essential features of streams. Each stream is represented as the sum of spectral components and the correlation is measured component-wise. Each spectral component is described by four parameters, namely, amplitude, phase, damping rate and frequency. The ε-lag-correlation between two spectral components is calculated. The algorithm uses such information as similarity measures in clustering data streams. Based on a sliding window model, the algorithm can continuously report the most recent clustering results and adjust the number of clusters. Experiments on real and synthetic streams show that the proposed clustering method has a higher speed and clustering quality than other similar methods.
Funding: Supported in part by the National Key R&D Program of China under Grant 2018YFB1800800; the National NSF of China under Grants 61827801 and 61801218; the open research fund of the Key Laboratory of Dynamic Cognitive System of Electromagnetic Spectrum Space (Nanjing Univ. Aeronaut. Astronaut.) (No. KF20181913); in part by the Natural Science Foundation of Jiangsu Province under Grant BK20180420; and the Open Foundation for Graduate Innovation of NUAA (Grant No. kfjj20190417).
Abstract: Vehicles can establish collaborative environment cognition by sharing the original or processed sensor data from the vehicular sensors and status map. Clustering in the vehicular ad-hoc network (VANET) is crucial for enhancing the stability of the collaborative environment. In this paper, the clustering problem is transformed into a graph-cut problem. A novel clustering algorithm based on the Spectral Clustering algorithm and the improved force-directed algorithm is designed. It takes the average lifetime of all clusters as the optimization goal so that the stability of the entire system can be enhanced. A series of close-to-practical scenarios are generated by the Simulation of Urban Mobility (SUMO). The numerical results indicate that our approach has superior performance in maintaining whole-cluster stability.
Funding: This work was supported by the National Natural Science Foundation of China (Nos. U1562218, 41604107, and 41804126).
Abstract: Traditional unsupervised seismic facies analysis techniques need to assume that seismic data obey a mixed Gaussian distribution. However, field seismic data may not meet this condition, thereby leading to wrong classification in the application of this technology. This paper introduces a spectral clustering technique for unsupervised seismic facies analysis. This algorithm clusters the data based on the idea of a graph. Its core is that seismic data are regarded as points in space, and the points are connected by edges to construct graphs. When the graphs are divided, the weights of the edges between different subgraphs should be as low as possible, whereas the weights of the inner edges of each subgraph should be as high as possible. However, the spectral clustering algorithm has high computational complexity and entails large memory consumption. To solve this problem, this paper introduces the idea of sparse representation into spectral clustering. Through the selection of a small number of local sparse representation points, the spectral clustering matrix of all sample points is approximately represented to reduce the cost of the spectral clustering operation. Verification on a physical model and field data shows that the proposed approach can obtain more accurate seismic facies classification results without requiring the data to meet any hypothesis. The computing efficiency of this new method is better than that of the conventional spectral clustering method, thereby meeting the application needs of field seismic data.
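The graph-partition criterion described in the abstract above (low edge weight between subgraphs, high edge weight inside them) can be illustrated with a minimal sketch. This is not the paper's algorithm: the function name `cut_and_within_weights` and the 4-node weight matrix are hypothetical, chosen only to show how the two quantities are computed for a candidate partition.

```python
def cut_and_within_weights(weights, labels):
    """Return (cut, within) for an undirected weighted graph.

    cut    = total weight of edges whose endpoints lie in different clusters
    within = total weight of edges whose endpoints share a cluster
    """
    n = len(weights)
    cut = 0.0
    within = 0.0
    for i in range(n):
        for j in range(i + 1, n):  # visit each undirected edge once
            if labels[i] == labels[j]:
                within += weights[i][j]
            else:
                cut += weights[i][j]
    return cut, within


# Hypothetical graph: nodes 0-1 and 2-3 are strongly linked, cross links are weak.
W = [
    [0.0, 0.9, 0.1, 0.0],
    [0.9, 0.0, 0.0, 0.1],
    [0.1, 0.0, 0.0, 0.8],
    [0.0, 0.1, 0.8, 0.0],
]
good = cut_and_within_weights(W, [0, 0, 1, 1])  # splits along the weak links
bad = cut_and_within_weights(W, [0, 1, 0, 1])   # splits along the strong links
```

A partition that follows the criterion, such as `good` here, has the smaller cut weight and the larger within-cluster weight. Spectral clustering searches for such partitions through the eigenvectors of a graph Laplacian rather than by enumeration.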
Funding: Supported by the National Natural Science Foundation of China (61272119).
Abstract: The similarity measure is crucial to the performance of spectral clustering. The Gaussian kernel function based on the Euclidean distance is usually adopted as the similarity measure. However, the Euclidean distance measure cannot fully reveal complex data distributions, and the result of spectral clustering is very sensitive to the scaling parameter. To solve these problems, a new manifold distance measure and a novel simulated annealing spectral clustering (SASC) algorithm based on the manifold distance measure are proposed. The simulated annealing based on genetic algorithm (SAGA), characterized by its rapid convergence to the global optimum, is used to cluster the sample points in the spectral mapping space. The proposed algorithm can not only reflect local and global consistency better, but also reduce the sensitivity of spectral clustering to the kernel parameter, which improves the algorithm's clustering performance. To efficiently apply the algorithm to image segmentation, the Nyström method is used to reduce the computational complexity. Experimental results show that, compared with traditional clustering algorithms and popular spectral clustering algorithms, the proposed algorithm can achieve better clustering performance on several synthetic datasets, texture images and real images.
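The sensitivity to the scaling parameter mentioned above can be seen in a small sketch of the standard Gaussian kernel on Euclidean distance, exp(-d² / (2σ²)). This shows only the conventional baseline, not the manifold distance proposed in the paper. The function name `gaussian_affinity`, the sample points and the two σ values are assumptions made for illustration.

```python
import math


def gaussian_affinity(points, sigma):
    """Pairwise Gaussian-kernel similarity: exp(-||x_i - x_j||^2 / (2 * sigma^2))."""
    n = len(points)
    affinity = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            # Squared Euclidean distance between points i and j.
            d2 = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
            affinity[i][j] = math.exp(-d2 / (2.0 * sigma ** 2))
    return affinity


# Hypothetical sample: two nearby points and one distant point.
points = [(0.0, 0.0), (1.0, 0.0), (3.0, 4.0)]
wide = gaussian_affinity(points, sigma=2.0)    # large scale: neighbours look similar
narrow = gaussian_affinity(points, sigma=0.5)  # small scale: similarity decays quickly
```

With these inputs the similarity between the two nearby points drops from exp(-1/8) at σ = 2.0 to exp(-2) at σ = 0.5. The resulting graph, and hence the clustering, therefore depends strongly on how σ is chosen.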
Abstract: Clustering is one of the most widely used techniques for exploratory data analysis. The spectral clustering algorithm, a popular modern clustering algorithm, has been shown to be more effective in detecting clusters than many traditional algorithms. It has applications ranging from computer vision and information retrieval to social science and biology. With the size of databases soaring, clustering algorithms face scaling problems in computational time and memory use. In this paper, we propose a parallel spectral clustering implementation based on MapReduce. Both the computation and data storage are distributed, which solves the scalability problems of most existing algorithms. We empirically analyze the proposed implementation on both benchmark networks and a real social network dataset of about two million vertices and two billion edges crawled from Sina Weibo. It is shown that the proposed implementation scales well, speeds up the clustering without sacrificing quality, and processes massive datasets efficiently on commodity machine clusters.
Abstract: Defense techniques for machine learning are critical yet challenging because the number and types of attacks on widely applied machine learning algorithms are increasing significantly. Among these attacks, the poisoning attack, which disturbs machine learning algorithms by injecting poisoning samples, poses the greatest threat. In this paper, we focus on analyzing the characteristics of poisoning samples and propose a novel sample evaluation method, tailored to those characteristics, to defend against the poisoning attack. To capture the intrinsic data characteristics from heterogeneous aspects, we first evaluate training data by multiple criteria, each of which is reformulated from a spectral clustering. Then, we integrate the multiple evaluation scores generated by the multiple criteria through the proposed multiple spectral clustering aggregation (MSCA) method. Finally, we use the unified score as the indicator of poisoning attack samples. Experimental results on intrusion detection data sets show that MSCA significantly outperforms K-means outlier detection in terms of data legality evaluation and poisoning attack detection.
Funding: Supported by the Special Item of the Public Sector (Meteorological) Science Research (GYHY201106040).
Abstract: [Objective] The research aimed to study an assessment index system for rainstorm disasters in Fujian Province based on a spectral clustering model with grey correlation analysis. [Method] According to the meteorological disaster yearbook of Fujian Province, by comprehensively considering the disaster-inducing factor, disaster-inducing environment, disaster-sustaining body and regional disaster-prevention level, an evaluation index system for regional rainstorm disasters in Fujian was established. Using a spectral clustering model based on grey correlation analysis, risk zoning of the rainstorm disaster was conducted for each area of Fujian. Finally, the effect and application of the clustering model were analyzed by case research. [Result] In order to uncover the immanent connections among regional characteristics and improve the disaster-prevention linkage performance of the evaluation units, a spectral clustering model based on grey correlation analysis was used to conduct risk zoning of the rainstorm disaster in Fujian Province. Moreover, a combined weight was introduced to judge each evaluation index, so as to adjust the clustering model. By case study, rainstorm disaster levels in 67 counties were obtained. The internal characteristics of each type were analyzed, and the main correlation factors of each type were extracted. The result was compared with the statistical record of rainstorm disasters, verifying the validity and feasibility of the model. [Conclusion] The method was feasible, and its evaluation result had better differentiation and decision accuracy.
Funding: Supported by the Fundamental Research Funds for the Central Universities in North China Electric Power University (11MG13); the Natural Science Foundation of Hebei Province (F2011502038).
Abstract: This paper proposes a novel phishing web image segmentation algorithm based on improved spectral clustering. Firstly, we construct a set of points composed of the spatial locations and gray levels of the pixels in a given image. Secondly, the data is clustered in the spectral space of the similarity matrix of the point set. The K-means algorithm in the conventional spectral clustering method is sensitive to the initial clustering centroids and tends to converge to a local optimum. To avoid these drawbacks, we introduce the clone operator and Cauchy mutation to enlarge the scale of clustering centers, and a quantum-inspired evolutionary algorithm to find the globally optimal clustering centroids. Compared with phishing web image segmentation based on K-means, experimental results show that the segmentation performance of our method is much improved. Moreover, our method can converge to the global optimal solution and achieves better accuracy in phishing web segmentation.
Abstract: The complexity of large-scale network systems made of a large number of nonlinearly interconnected components is a restrictive facet for their modeling and analysis. In this paper, we propose a framework for hierarchical modeling of a complex network system, based on a recursive unsupervised spectral clustering method. The hierarchical model serves the purpose of facilitating the management of complexity in the analysis of real-world critical infrastructures. We exemplify this by referring to the reliability analysis of the 380 kV Italian Power Transmission Network (IPTN). In this analysis, the classical component Importance Measures (IMs) of reliability theory have been extended to render them compatible with and applicable to a complex distributed network system. By utilizing these extended IMs, the reliability properties of the IPTN system can be evaluated in the framework of the hierarchical system model, with the aim of providing risk managers with information on the risk/safety significance of system structures and components.
Abstract: In clothing image research, segmenting the clothing quickly and accurately while retaining as much of the clothing style detail as possible is the basis of subsequent image analysis. The spectral clustering clothing image segmentation algorithm is a common method for clothing image extraction. However, the traditional model requires high computing power and is easily affected by the initial clustering centers, so it often falls into a local optimum. To address these two points, an improved spectral clustering clothing image segmentation algorithm is proposed in this paper. The Nyström approximation strategy is introduced into the spectral mapping process to reduce the computational complexity. In the clustering stage, this algorithm uses the global optimization advantage of the particle swarm optimization algorithm and selects the sparrow search algorithm to search for the optimal initial clustering points, effectively avoiding local optima. Finally, the effectiveness of this algorithm is verified on clothing images in each environment.
Funding: Supported by the National Natural Science Foundation of China (70831001, 70821061).
Abstract: A new fuzzy support vector machine algorithm with dual membership values based on the spectral clustering method is proposed to overcome the shortcoming of the normal support vector machine algorithm, which divides the training datasets into two absolutely exclusive classes in binary classification, ignoring the possibility of an "overlapping" region between the two training classes. The proposed method handles sample "overlap" efficiently with spectral clustering, overcoming the disadvantage of over-fitting and greatly improving data mining efficiency. Simulation provides clear evidence for the new method.
Abstract: A simple and fast approach based on an eigenvalue similarity metric for Polarimetric SAR image segmentation of land cover is proposed in this paper. The approach uses the eigenvalues of the coherency matrix to construct the similarity metric of the clustering algorithm used to segment the SAR image. The Mahalanobis distance is used to measure pairwise similarity between pixels, avoiding the manual scale parameter tuning of previous spectral clustering methods. Furthermore, spatial coherence constraints and a spectral clustering ensemble are employed to stabilize and improve the segmentation performance. All experiments are carried out on three sets of Polarimetric SAR data. The experimental results show that the proposed method is superior to the other methods compared.
Funding: Supported by the National Natural Science Foundation of China (No. 52075306).
Abstract: In order to realize the intelligent mechanization of the last stage of the fruit industry chain, the identification of fruit packing boxes is studied. A multi-view database is established to describe the omnidirectional attitudes of the fruit packing boxes. In order to reduce the data redundancy caused by multi-view acquisition, a new binary multi-view kernel principal component analysis network (BMKPCANet) is built, and a multi-view recognition method for fruit packing boxes is proposed based on the BMKPCANet and the support vector machine (SVM). The experimental results show that the recognition accuracy of the proposed BMKPCANet is 12.82% higher than PCANet and 3.51% higher than KPCANet on average. The time consumption of the proposed BMKPCANet is 7.74% lower than PCANet and 29.01% lower than KPCANet on average. This work lays a theoretical foundation for multi-view recognition of 3D objects and has good practical application value.
Funding: National Youth Natural Science Foundation of China (No. 61806006); Innovation Program for Graduate of Jiangsu Province (No. KYLX160-781); Jiangsu University Superior Discipline Construction Project.
Abstract: The K-multiple-means (KMM) algorithm retains the simplicity and efficiency of the K-means algorithm by setting multiple subclasses, and improves its effect on non-convex data sets. Aiming at the problem that it cannot be applied to multi-view data sets on the Internet, a multi-view K-multiple-means (MKMM) clustering method is proposed in this paper. The new algorithm introduces a view weight parameter, retains the design of setting multiple subclasses, takes the number of clusters as a constraint, and obtains clusters by solving an optimization problem. The new algorithm is compared with some popular multi-view clustering algorithms. The effectiveness of the new algorithm is demonstrated through analysis of the experimental results.
Funding: The National Natural Science Foundation of China (Grant No. 61762031); Guangxi Key Research and Development Plan (Gui Science AB17195029, Gui Science AB18126006); Guangxi Key Laboratory Fund of Embedded Technology and Intelligent System; 2017 Innovation Project of Guangxi Graduate Education (No. YCSW2017156); 2018 Innovation Project of Guangxi Graduate Education (No. YCSW2018157); Subsidies for the Project of Promoting the Ability of Young and Middle-aged Scientific Research in Universities and Colleges of Guangxi (KY2016YB184); 2016 Guilin Science and Technology Project (Gui Science 2016010202).
Abstract: Due to the development of e-commerce, the collaborative filtering (CF) recommendation algorithm has become popular in recent years. It has some limitations such as cold start, data sparseness and low operation efficiency. In this paper, a CF recommendation algorithm based on the latent factor model and improved spectral clustering (CFRALFMISC) is proposed to improve the forecasting precision. The latent factor model is first adopted to predict the missing scores. Then, the cluster validity index is used to determine the number of clusters. Finally, the spectral clustering is improved by using the FCM algorithm to replace K-means in the spectral clustering. The simulation results show that CFRALFMISC can effectively improve the recommendation precision compared with other algorithms.
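To illustrate what replacing K-means with FCM changes, the following minimal sketch computes the standard fuzzy c-means membership of one point with respect to fixed centers, u_k = 1 / Σ_j (d_k / d_j)^(2/(m-1)). It is not the CFRALFMISC implementation: the function name `fcm_memberships`, the fuzzifier m = 2 and the sample centers are assumptions made for illustration.

```python
import math


def fcm_memberships(point, centers, m=2.0):
    """Standard fuzzy c-means membership of one point to each center.

    m > 1 is the fuzzifier (assumed 2.0 here); memberships sum to 1.
    """
    dists = [
        math.sqrt(sum((a - b) ** 2 for a, b in zip(point, center)))
        for center in centers
    ]
    # A point lying exactly on one or more centers belongs fully to them.
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((d_k / d_j) ** power for d_j in dists) for d_k in dists]


# Hypothetical example: a point at distance 1 from one center and 3 from the other.
centers = [(0.0, 0.0), (4.0, 0.0)]
u = fcm_memberships((1.0, 0.0), centers)
```

K-means would assign this point entirely to the first center, whereas FCM gives it graded memberships (0.9 and 0.1 for this input), which retains information about points lying between clusters.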
Abstract: Feature selection is very important for obtaining meaningful and interpretable results from a clustering analysis. In the application of soil data clustering, there is a lack of good understanding of how clustering performance responds to different feature subsets. In the present paper, we analyzed the performance differences between the k-means, fuzzy c-means, and spectral clustering algorithms under different feature subsets of soil data sets. The experimental results demonstrated that the performance of the spectral clustering algorithm was generally better than that of k-means and fuzzy c-means with different feature subsets. The feature subsets containing environmental attributes improved clustering performance more than those containing spatial attributes and produced more accurate and meaningful clustering results. Our results demonstrated that combining the spectral clustering algorithm with feature subsets containing environmental attributes rather than spatial attributes may be a better choice in applications of soil data clustering.