The observation error model of an underwater acoustic positioning system is an important factor influencing the positioning accuracy of an underwater target. When the position inconsistency error caused by treating the underwater target as a mass point is considered together with the observation system error, the traditional error model best estimation trajectory (EMBET) method, with little observed data and too many parameters, can lead to an ill-conditioned parameter model. In this paper, a multi-station fusion system error model based on the optimal polynomial constraint is constructed, and a corresponding observation system error identification method based on improved spectral clustering is designed. First, reduced-parameter unified modeling of the underwater target position parameters and the system error is achieved through polynomial optimization. Then a multi-station undirected graph network is established, which addresses the problem of inaccurate identification of the system errors. Moreover, the similarity matrix of the spectral clustering is improved, and iterative identification of the system errors based on the improved spectral clustering is proposed. Finally, comprehensive measured data from long-baseline lake and sea tests show that the proposed method can accurately identify the system errors and improve the positioning accuracy for the underwater target.
The similarity measure is crucial to the performance of spectral clustering. The Gaussian kernel function based on the Euclidean distance is usually adopted as the similarity measure. However, the Euclidean distance cannot fully reveal the structure of data with a complex distribution, and the result of spectral clustering is very sensitive to the scaling parameter. To solve these problems, a new manifold distance measure and a novel simulated annealing spectral clustering (SASC) algorithm based on that measure are proposed. The simulated annealing based on genetic algorithm (SAGA), characterized by its rapid convergence to the global optimum, is used to cluster the sample points in the spectral mapping space. The proposed algorithm not only reflects local and global consistency better, but also reduces the sensitivity of spectral clustering to the kernel parameter, which improves clustering performance. To apply the algorithm efficiently to image segmentation, the Nyström method is used to reduce the computational complexity. Experimental results show that, compared with traditional clustering algorithms and popular spectral clustering algorithms, the proposed algorithm achieves better clustering performance on several synthetic datasets, texture images and real images.
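The sensitivity to the scaling parameter mentioned in this abstract can be illustrated with a minimal sketch of the standard Gaussian kernel affinity, w_ij = exp(-||x_i - x_j||^2 / (2σ^2)). This is the conventional Euclidean baseline, not the manifold distance proposed by the authors; the function name `gaussian_affinity` and the toy points are assumptions made for illustration.

```python
import math

def gaussian_affinity(points, sigma):
    """Dense affinity matrix with W[i][j] = exp(-||xi - xj||^2 / (2 * sigma^2))."""
    n = len(points)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
            W[i][j] = math.exp(-sq_dist / (2.0 * sigma ** 2))
    return W

# Toy data (hypothetical): two tight pairs of points that lie far from each other.
points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]

# A narrow kernel keeps the two groups separated in the affinity matrix ...
W_narrow = gaussian_affinity(points, sigma=0.5)
# ... while a wide kernel makes distant points look similar as well.
W_wide = gaussian_affinity(points, sigma=10.0)
```

With the narrow kernel the cross-group affinities are practically zero, while the wide kernel assigns them a large value, so the same data yield very different graphs depending on σ.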
A new algorithm for clustering multiple data streams is proposed. The algorithm can effectively cluster data streams that show similar behavior with some unknown time delays. It uses the autoregressive (AR) modeling technique to measure correlations between data streams, and it exploits estimated frequency spectra to extract the essential features of the streams. Each stream is represented as the sum of spectral components, and the correlation is measured component-wise. Each spectral component is described by four parameters, namely amplitude, phase, damping rate and frequency. The ε-lag-correlation between two spectral components is calculated, and the algorithm uses this information as the similarity measure when clustering data streams. Based on a sliding window model, the algorithm can continuously report the most recent clustering results and adjust the number of clusters. Experiments on real and synthetic streams show that the proposed clustering method is faster and achieves higher clustering quality than other similar methods.
Vehicles can establish a collaborative environment cognition by sharing the original or processed sensor data from vehicular sensors and the status map. Clustering in the vehicular ad-hoc network (VANET) is crucial for enhancing the stability of the collaborative environment. In this paper, the clustering problem is transformed into a graph cut problem. A novel clustering algorithm based on the spectral clustering algorithm and an improved force-directed algorithm is designed. It takes the average lifetime of all clusters as the optimization goal so that the stability of the entire system can be enhanced. A series of close-to-practical scenarios is generated by the Simulation of Urban Mobility (SUMO). The numerical results indicate that the approach has superior performance in maintaining overall cluster stability.
Traditional unsupervised seismic facies analysis techniques need to assume that seismic data obey a mixed Gaussian distribution. However, field seismic data may not meet this condition, which leads to wrong classification when these techniques are applied. This paper introduces a spectral clustering technique for unsupervised seismic facies analysis. The algorithm clusters the data using a graph. Its core idea is that seismic data are regarded as points in space, and the points are connected by edges to form a graph. When the graph is divided, the weights of the edges between different subgraphs should be as low as possible, whereas the weights of the edges inside each subgraph should be as high as possible. However, the spectral clustering algorithm has high computational complexity and large memory consumption. To solve this problem, this paper introduces the idea of sparse representation into spectral clustering. By selecting a small number of local sparse representation points, the spectral clustering matrix of all sample points is approximately represented, which reduces the cost of the spectral clustering operation. Verification on a physical model and on field data shows that the proposed approach obtains more accurate seismic facies classification results without requiring the data to satisfy any distributional hypothesis. The computing efficiency of the new method is better than that of the conventional spectral clustering method, thereby meeting the application needs of field seismic data.
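The partitioning criterion described in this abstract (low edge weights between subgraphs, high weights inside them) is commonly formalized as the normalized cut. The following is a minimal sketch of that objective on a toy weight matrix; the abstract does not state which cut criterion the authors use, so the normalized cut, the function names and the matrix `W` are assumptions for illustration only.

```python
def cut_weight(W, part_a, part_b):
    """Total weight of edges going from nodes in part_a to nodes in part_b."""
    return sum(W[i][j] for i in part_a for j in part_b)

def normalized_cut(W, part_a, part_b):
    """Normalized cut of a two-way partition: cut/vol(A) + cut/vol(B)."""
    all_nodes = list(part_a) + list(part_b)
    cut = cut_weight(W, part_a, part_b)
    vol_a = cut_weight(W, part_a, all_nodes)  # sum of degrees of nodes in A
    vol_b = cut_weight(W, part_b, all_nodes)  # sum of degrees of nodes in B
    return cut / vol_a + cut / vol_b

# Toy symmetric weight matrix (hypothetical): nodes 0-1 and 2-3 are strongly
# connected within each pair and weakly connected across pairs.
W = [
    [0.0, 0.9, 0.1, 0.1],
    [0.9, 0.0, 0.1, 0.1],
    [0.1, 0.1, 0.0, 0.9],
    [0.1, 0.1, 0.9, 0.0],
]

good = normalized_cut(W, [0, 1], [2, 3])  # cuts only the weak edges
bad = normalized_cut(W, [0, 2], [1, 3])   # cuts both strong edges
```

The partition that keeps strongly connected nodes together has the smaller objective value, which is the behavior the abstract describes; spectral clustering approximates the minimizer of such an objective through the eigenvectors of a graph Laplacian.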
Clustering is one of the most widely used techniques for exploratory data analysis. The spectral clustering algorithm, a popular modern clustering algorithm, has been shown to be more effective in detecting clusters than many traditional algorithms. It has applications ranging from computer vision and information retrieval to social science and biology. With the size of databases soaring, clustering algorithms face scaling problems in computational time and memory use. In this paper, we propose a parallel spectral clustering implementation based on MapReduce. Both the computation and the data storage are distributed, which solves the scalability problems of most existing algorithms. We empirically analyze the proposed implementation on both benchmark networks and a real social network dataset of about two million vertices and two billion edges crawled from Sina Weibo. It is shown that the proposed implementation scales well, speeds up the clustering without sacrificing quality, and processes massive datasets efficiently on commodity machine clusters.
Defense techniques for machine learning are critical yet challenging because the number and types of attacks on widely applied machine learning algorithms are increasing significantly. Among these attacks, the poisoning attack, which disturbs machine learning algorithms by injecting poisoning samples, poses the greatest threat. In this paper, we focus on analyzing the characteristics of poisoning samples and propose a novel sample evaluation method that defends against the poisoning attack by catering for these characteristics. To capture the intrinsic data characteristics from heterogeneous aspects, we first evaluate training data by multiple criteria, each of which is reformulated from a spectral clustering. Then, we integrate the multiple evaluation scores generated by the multiple criteria through the proposed multiple spectral clustering aggregation (MSCA) method. Finally, we use the unified score as the indicator of poisoning attack samples. Experimental results on intrusion detection data sets show that MSCA significantly outperforms K-means outlier detection in terms of data legality evaluation and poisoning attack detection.
[Objective] The research aimed to study the assessment index system of rainstorm disasters in Fujian Province based on a spectral clustering model with grey correlation analysis. [Method] According to the meteorological disaster yearbook of Fujian Province, and by comprehensively considering the disaster-inducing factor, the disaster-inducing environment, the disaster-sustaining body and the regional disaster-prevention level, an evaluation index system for regional rainstorm disasters in Fujian was established. Using a spectral clustering model based on grey correlation analysis, risk zoning of the rainstorm disaster was conducted for each area of Fujian. Finally, the effect and application of the clustering model were analyzed through a case study. [Result] In order to uncover the inherent connections among regional characteristics and improve the disaster-prevention linkage performance of the evaluation units, a spectral clustering model based on grey correlation analysis was used to conduct risk zoning of the rainstorm disaster in Fujian Province. Moreover, a combined weight was introduced to judge each evaluation index so as to adjust the clustering model. Through the case study, rainstorm disaster levels in 67 counties were obtained. The internal characteristics of each type were analyzed, and the main correlation factors of each type were extracted. The result was compared with the statistical record of rainstorm disasters, verifying the validity and feasibility of the model. [Conclusion] The method was feasible, and its evaluation result had better differentiation and decision accuracy.
This paper proposes a novel phishing web image segmentation algorithm based on improved spectral clustering. First, we construct a set of points composed of the spatial locations and gray levels of the pixels of a given image. Second, the data are clustered in the spectral space of the similarity matrix of the point set. To avoid the drawbacks of the K-means algorithm in the conventional spectral clustering method, which is sensitive to the initial clustering centroids and converges to a locally optimal solution, we introduce the clone operator and Cauchy mutation to enlarge the scale of the clustering centers, and a quantum-inspired evolutionary algorithm to find the globally optimal clustering centroids. Compared with phishing web image segmentation based on K-means, experimental results show that the segmentation performance of our method is much improved. Moreover, our method can converge to the globally optimal solution and achieves better accuracy in phishing web image segmentation.
The complexity of large-scale network systems made of a large number of nonlinearly interconnected components is a restrictive facet for their modeling and analysis. In this paper, we propose a framework of hierarchical modeling of a complex network system, based on a recursive unsupervised spectral clustering method. The hierarchical model serves the purpose of facilitating the management of complexity in the analysis of real-world critical infrastructures. We exemplify this by referring to the reliability analysis of the 380 kV Italian Power Transmission Network (IPTN). In this work of analysis, the classical component Importance Measures (IMs) of reliability theory have been extended to render them compatible and applicable to a complex distributed network system. By utilizing these extended IMs, the reliability properties of the IPTN system can be evaluated in the framework of the hierarchical system model, with the aim of providing risk managers with information on the risk/safety significance of system structures and components.
In clothing image research, segmenting the clothing quickly and accurately while retaining as much of the clothing style detail as possible is the basis of subsequent image analysis. The spectral clustering clothing image segmentation algorithm is a common method for clothing image extraction. However, the traditional model requires high computing power, is easily affected by the initial clustering centers, and often falls into a local optimum. To address these two points, an improved spectral clustering clothing image segmentation algorithm is proposed in this paper. The Nyström approximation strategy is introduced into the spectral mapping process to reduce the computational complexity. In the clustering stage, the algorithm uses the global optimization advantage of the particle swarm optimization algorithm and selects the sparrow search algorithm to search for the optimal initial clustering points, effectively avoiding local optima. Finally, the effectiveness of the algorithm is verified on clothing images in each environment.
A new fuzzy support vector machine algorithm with dual membership values based on the spectral clustering method is proposed to overcome a shortcoming of the normal support vector machine algorithm, which divides the training datasets into two absolutely exclusive classes in binary classification and ignores the possibility of an "overlapping" region between the two training classes. The proposed method handles sample "overlap" efficiently with spectral clustering, mitigating over-fitting and greatly improving data mining efficiency. Simulation results provide clear evidence in support of the new method.
A simple and fast approach based on an eigenvalue similarity metric for land-cover segmentation of polarimetric SAR images is proposed in this paper. The approach uses the eigenvalues of the coherency matrix to construct the similarity metric of the clustering algorithm that segments the SAR image. The Mahalanobis distance is used to measure the pairwise similarity between pixels, which avoids the manual scale-parameter tuning required by previous spectral clustering methods. Furthermore, spatial coherence constraints and a spectral clustering ensemble are employed to stabilize and improve the segmentation performance. All experiments are carried out on three sets of polarimetric SAR data. The experimental results show that the proposed method is superior to the other methods compared.
Due to the development of e-commerce, the collaborative filtering (CF) recommendation algorithm has become popular in recent years. It has some limitations, such as cold start, data sparseness and low operational efficiency. In this paper, a CF recommendation algorithm based on the latent factor model and improved spectral clustering (CFRALFMISC) is proposed to improve the forecasting precision. The latent factor model is first adopted to predict the missing scores. Then, the cluster validity index is used to determine the number of clusters. Finally, the spectral clustering is improved by using the FCM algorithm to replace K-means in the spectral clustering. The simulation results show that CFRALFMISC can effectively improve the recommendation precision compared with other algorithms.
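The difference between K-means and FCM in the final clustering step can be sketched through the standard fuzzy c-means membership rule, which gives each point a degree of membership in every cluster instead of a single hard label. This is the textbook FCM update, not the authors' implementation; the function name `fcm_memberships`, the fuzzifier value m = 2 and the toy centers are assumptions for illustration.

```python
import math

def fcm_memberships(point, centers, m=2.0):
    """Fuzzy c-means memberships of one point: u_i = 1 / sum_j (d_i / d_j)^(2 / (m - 1))."""
    dists = [math.dist(point, c) for c in centers]
    # A point that coincides with a center belongs fully to that center.
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        total = sum(hits)
        return [h / total for h in hits]
    exponent = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_j) ** exponent for d_j in dists) for d_i in dists]

# Toy example (hypothetical): two cluster centers in an embedding space and
# one point that lies closer to the first center.
centers = [(0.0, 0.0), (4.0, 0.0)]
memberships = fcm_memberships((1.0, 0.0), centers)
```

K-means would assign this point entirely to the first center, whereas the fuzzy memberships keep a small weight on the second one and always sum to one.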
In order to improve on the accuracy and reliability of driving fatigue detection based on a single feature, a new detection algorithm based on multiple features is proposed. Two direct facial features of the driver that reflect fatigue and one indirect vehicle behavior feature that indicates fatigue are considered. A T-S fuzzy neural network (TSFNN) is adopted to recognize the driving fatigue of drivers. For the structure identification of the TSFNN, subtractive clustering (SC) is used to determine the fuzzy rules and their associated parameters. Moreover, the particle swarm optimization (PSO) algorithm is improved to train the TSFNN. Simulation results and experiments on vehicles show that the proposed algorithm can effectively improve the convergence speed and the recognition accuracy of the TSFNN, as well as the correct rate of driving fatigue detection.
Feature selection is very important to obtain meaningful and interpretable clustering results from a clustering analysis. In the application of soil data clustering, there is a lack of good understanding of how clustering performance responds to different feature subsets. In the present paper, we analyzed the performance differences between the k-means, fuzzy c-means and spectral clustering algorithms under different feature subsets of soil data sets. The experimental results demonstrated that the performance of the spectral clustering algorithm was generally better than that of k-means and fuzzy c-means with different feature subsets. The feature subsets containing environmental attributes improved clustering performance more than those containing spatial attributes and produced more accurate and meaningful clustering results. Our results demonstrated that combining the spectral clustering algorithm with feature subsets containing environmental attributes rather than spatial attributes may be a better choice in applications of soil data clustering.
This paper reports the synthesis of a novel polyoxovanadiumborate, $\mathrm{Cd_{0.75}Na_2Ni_{0.5}[V_{12}B_{18}O_{50.5}(OH)_{9.5}]\cdot 27.5H_2O}$, by the hydrothermal method. The compound consists of metal M (Cd, Na, Ni) and $\mathrm{V_{12}B_{18}O_{60}}$ cluster units connected through M-O bonds to form a three-dimensional structure. We performed a series of characterizations of compound 1 and tested its fluorescence properties. The luminescence investigations show that compound 1 displays interesting luminescence properties and has good potential as a luminescent multi-responsive sensing material for $\mathrm{Fe^{3+}}$ ions.
Webpage classification differs from traditional text classification in its irregular words and phrases and its massive, unlabeled features, which make it harder to obtain effective features. To cope with this problem, we propose two scenarios for extracting meaningful strings, based on document clustering and term clustering with multiple strategies, to optimize a Vector Space Model (VSM) and thereby improve webpage classification. The results show that document clustering works better than term clustering in coping with document content. However, a better overall performance is obtained by spectral clustering with document clustering. Moreover, since images exist in the same webpage as the document content, the proposed method is also applied to extract meaningful terms for images, and the experimental results also show its effectiveness in improving webpage classification.
Wireless Multimedia Sensor Networks (WMSNs) are composed of small embedded audio/video motes capable of extracting information about the surrounding environment, processing it locally and then transmitting it wirelessly to a sink/base station. Multimedia data such as image, audio and video are larger in volume than scalar data such as temperature, pressure and humidity. Transmitting multimedia information therefore requires more energy, which reduces the lifetime of the network. The limitation of battery energy is a crucial problem in WMSNs that needs to be addressed to prolong the lifetime of the network. In this paper we present a clustering approach based on Spectral Graph Partitioning (SGP) for WMSNs that increases the lifetime of the network. Efficient strategies for cluster head selection and rotation are also proposed as part of the clustering approach. Simulation results show that our strategy is better than existing strategies.
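As China vigorously advances supply-side structural reform in the energy sector, the installed capacity of new energy keeps rising and competition in the electricity market is intensifying. Meanwhile, the complex and volatile global coal market has driven up costs for power generation companies that rely on coal as their energy source. The calorific value of coal is one of the key criteria for assessing coal quality and the most important basis for coal procurement, so predicting it accurately can effectively control a power plant's operating and procurement costs. To predict coal calorific value efficiently, the Pearson coefficient is used to select features among the relevant variables, the Density-Based Spatial Clustering of Applications with Noise (DBSCAN) algorithm is used to denoise 1733 assay records collected over nearly 2 years at a power plant's own coal yard, and Spectral Clustering (SC) analysis is applied to the denoised data. Prediction models are then built separately on the resulting sub-sample sets using the Extreme Gradient Boosting (XGBoost) algorithm and compared with Ordinary Least Squares (OLS) regression and Support Vector Machine (SVM) models. The results show that the XGBoost-based model for predicting the calorific value of power station coal is clearly more accurate than the other algorithms and generalizes better. Building separate prediction models for the coal classes produced by the SC algorithm further refines the model, providing a reliable and efficient method for predicting coal calorific value in coal-fired power stations.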
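The first step of the pipeline in this abstract, feature selection with the Pearson coefficient, can be sketched as follows. This is a minimal illustration under stated assumptions: the feature names (`ash`, `moisture`, `noise`), the toy values and the 0.5 threshold are hypothetical and do not come from the paper, whose variables and cut-off are not given in the abstract.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0.0 or var_y == 0.0:
        return 0.0  # a constant column carries no linear information
    return cov / math.sqrt(var_x * var_y)

def select_features(features, target, threshold):
    """Keep the features whose absolute correlation with the target reaches the threshold."""
    return [name for name, values in features.items()
            if abs(pearson(values, target)) >= threshold]

# Hypothetical assay columns and calorific values (toy numbers for illustration).
calorific = [20.0, 22.0, 24.0, 26.0, 28.0]
features = {
    "ash": [30.0, 27.0, 24.0, 21.0, 18.0],
    "moisture": [10.0, 9.0, 9.5, 8.0, 7.5],
    "noise": [1.0, -1.0, 0.0, -1.0, 1.0],
}

selected = select_features(features, calorific, threshold=0.5)
```

In this toy data the two columns that vary with the calorific value are kept and the uncorrelated column is dropped; the retained columns would then be passed on to the denoising, clustering and regression stages described in the abstract.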
Funding (underwater acoustic positioning paper): This work was supported by the National Natural Science Foundation of China (61903086, 61903366, 62001115), the Natural Science Foundation of Hunan Province (2019JJ50745, 2020JJ4280, 2021JJ40133), and the Fundamentals and Basic of Applications Research Foundation of Guangdong Province (2019A1515110136).
Funding (simulated annealing spectral clustering paper): Supported by the National Natural Science Foundation of China (61272119).
Funding (data stream clustering paper): The National Natural Science Foundation of China (No. 60673060) and the Natural Science Foundation of Jiangsu Province (No. BK2005047).
Funding (VANET clustering paper): Supported in part by the National Key R&D Program of China under Grant 2018YFB1800800, by the National NSF of China under Grants 61827801 and 61801218, by the open research fund of Key Laboratory of Dynamic Cognitive System of Electromagnetic Spectrum Space (Nanjing Univ. Aeronaut. Astronaut.) (No. KF20181913), in part by the Natural Science Foundation of Jiangsu Province under Grant BK20180420, and by the Open Foundation for Graduate Innovation of NUAA (Grant No. kfjj20190417).
Funding (seismic facies analysis paper): This work was supported by the National Natural Science Foundation of China (Nos. U1562218, 41604107, and 41804126).
文摘Traditional unsupervised seismic facies analysis techniques need to assume that seismic data obey mixed Gaussian distribution.However,fi eld seismic data may not meet this condition,thereby leading to wrong classifi cation in the application of this technology.This paper introduces a spectral clustering technique for unsupervised seismic facies analysis.This algorithm is based on on the idea of a graph to cluster the data.Its kem is that seismic data are regarded as points in space,points can be connected with the edge and construct to graphs.When the graphs are divided,the weights of the edges between the different subgraphs are as low as possible,whereas the weights of the inner edges of the subgraph should be as high as possible.That has high computational complexity and entails large memory consumption for spectral clustering algorithm.To solve the problem this paper introduces the idea of sparse representation into spectral clustering.Through the selection of a small number of local sparse representation points,the spectral clustering matrix of all sample points is approximately represented to reduce the cost of spectral clustering operation.Verifi cation of physical model and fi eld data shows that the proposed approach can obtain more accurate seismic facies classification results without considering the data meet any hypothesis.The computing efficiency of this new method is better than that of the conventional spectral clustering method,thereby meeting the application needs of fi eld seismic data.
文摘Clustering is one of the most widely used techniques for exploratory data analysis. Spectral clustering algorithm, a popular modern cluslering algorithm, has been shown to be more effective in detecting clusters than many traditional algorithms. It has applications ranging from computer vision and information retrieval to social sienee and biology. With the size of databases soaring, cluostering algorithms bare saling computational time and memory use. In this paper, we propose a parallel spectral elustering implementation based on MapRednee. Both the computation and data storage are dislributed, which solves the sealability problems for most existing algorithms. We empirically analyze the proposed implementation on both benchmark net- works and a real social network dataset of about two million vertices and two billion edges crawled from Sina Weibo. It is shown that the proposed implementation scales well, speeds up the clustering without sacrificing quality, and processes massive datasets efficiently on commodity machine clusters.
Abstract: Defense techniques for machine learning are critical yet challenging because the number and types of attacks on widely applied machine learning algorithms are increasing significantly. Among these attacks, the poisoning attack, which disturbs machine learning algorithms by injecting poisoning samples, poses the greatest threat. In this paper, we focus on analyzing the characteristics of poisoning samples and propose a novel sample evaluation method, tailored to these characteristics, to defend against the poisoning attack. To capture the intrinsic data characteristics from heterogeneous aspects, we first evaluate training data by multiple criteria, each of which is reformulated from a spectral clustering. Then, we integrate the multiple evaluation scores generated by the multiple criteria through the proposed multiple spectral clustering aggregation (MSCA) method. Finally, we use the unified score as the indicator of poisoning attack samples. Experimental results on intrusion detection data sets show that MSCA significantly outperforms K-means outlier detection in terms of data legality evaluation and poisoning attack detection.
Funding: Supported by the Special Item of the Public Sector (Meteorological) Science Research (GYHY201106040)
Abstract: [Objective] The research aimed to study an assessment index system for rainstorm disasters in Fujian Province based on a spectral clustering model with grey correlation analysis. [Method] According to the meteorological disaster yearbook of Fujian Province, an evaluation index system for regional rainstorm disasters in Fujian was established by comprehensively considering the disaster-inducing factor, the disaster-inducing environment, the disaster-sustaining body and the regional disaster-prevention level. Using a spectral clustering model based on grey correlation analysis, risk zoning of rainstorm disasters was conducted for each area of Fujian. Finally, the effect and application of the clustering model were analyzed through a case study. [Result] In order to uncover the immanent connections among regional characteristics and to improve the disaster-prevention linkage performance of the evaluation units, a spectral clustering model based on grey correlation analysis was used to conduct risk zoning of rainstorm disasters in Fujian Province. Moreover, a combined weight was introduced to judge each evaluation index, so as to adjust the clustering model. Through the case study, rainstorm disaster levels in 67 counties were obtained. The internal characteristics of each type were analyzed, and the main correlation factors of each type were extracted. The result was compared with the statistical record of rainstorm disasters, verifying the validity and feasibility of the model. [Conclusion] The method was feasible, and its evaluation results had better differentiation and decision accuracy.
Funding: Supported by the Fundamental Research Funds for the Central Universities in North China Electric Power University (11MG13); the Natural Science Foundation of Hebei Province (F2011502038)
Abstract: This paper proposes a novel phishing web image segmentation algorithm based on improved spectral clustering. Firstly, we construct a set of points composed of the spatial locations and gray levels of the pixels in a given image. Secondly, the data are clustered in the spectral space of the similarity matrix of the point set. The K-means algorithm used in the conventional spectral clustering method is sensitive to the initial clustering centroids and tends to converge to a locally optimal solution. To avoid these drawbacks, we introduce the clone operator and Cauchy mutation to enlarge the scale of the clustering centers, and a quantum-inspired evolutionary algorithm to find the globally optimal clustering centroids. Compared with phishing web image segmentation based on K-means, experimental results show that the segmentation performance of our method is much improved. Moreover, our method can converge to the globally optimal solution and achieves better accuracy in phishing web image segmentation.
Abstract: The complexity of large-scale network systems made of a large number of nonlinearly interconnected components is a restrictive facet for their modeling and analysis. In this paper, we propose a framework of hierarchical modeling of a complex network system, based on a recursive unsupervised spectral clustering method. The hierarchical model serves the purpose of facilitating the management of complexity in the analysis of real-world critical infrastructures. We exemplify this by referring to the reliability analysis of the 380 kV Italian Power Transmission Network (IPTN). In this analysis, the classical component Importance Measures (IMs) of reliability theory have been extended to render them compatible with and applicable to a complex distributed network system. By utilizing these extended IMs, the reliability properties of the IPTN system can be evaluated in the framework of the hierarchical system model, with the aim of providing risk managers with information on the risk/safety significance of system structures and components.
Abstract: In clothing image research, segmenting the clothing quickly and accurately while retaining as many clothing style details as possible is the basis of subsequent image analysis. Spectral clustering is a common method for clothing image segmentation in the process of clothing image extraction. However, the traditional model requires high computing power and is easily affected by the initial clustering centers, so it often falls into a local optimum. To address these two problems, an improved spectral clustering algorithm for clothing image segmentation is proposed in this paper. The Nystrom approximation strategy is introduced into the spectral mapping process to reduce the computational complexity. In the clustering stage, the algorithm uses the global optimization advantage of the particle swarm optimization algorithm and selects the sparrow search algorithm to search for the optimal initial clustering points, which effectively avoids falling into a local optimum. Finally, the effectiveness of the algorithm is verified on clothing images in various environments.
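The Nystrom approximation mentioned in this abstract can be sketched as follows: the full n × n similarity matrix is approximated from its similarities to a small set of m landmark points, K ≈ C W⁺ Cᵀ. This is a generic illustration rather than the paper's implementation; the Gaussian kernel, the `sigma` value, and the function names are assumptions.

```python
import numpy as np

def gaussian_kernel(a, b, sigma=1.0):
    """Gaussian similarity between every row of a and every row of b."""
    sq = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq / (2.0 * sigma ** 2))

def nystrom_approx(points, landmark_idx, sigma=1.0):
    """Approximate the full kernel matrix from m landmark points.

    C (n x m) holds similarities of all points to the landmarks and
    W (m x m) holds similarities among the landmarks themselves.
    """
    pts = np.asarray(points, dtype=float)
    landmarks = pts[np.asarray(landmark_idx)]
    C = gaussian_kernel(pts, landmarks, sigma)
    W = gaussian_kernel(landmarks, landmarks, sigma)
    # The pseudo-inverse guards against an ill-conditioned landmark block.
    return C @ np.linalg.pinv(W) @ C.T

if __name__ == "__main__":
    pts = np.array([[0, 0], [1, 0], [0, 1], [2, 2], [3, 1]], dtype=float)
    approx = nystrom_approx(pts, [0, 3])
    exact = gaussian_kernel(pts, pts)
    print(np.abs(approx - exact).max())  # approximation error
```

Only the n × m and m × m blocks are ever evaluated from the data, so kernel evaluations drop from n² to roughly n·m; in practice the eigenvectors are also derived from these blocks rather than from the reconstructed matrix.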
Funding: Supported by the National Natural Science Foundation of China (70831001, 70821061)
Abstract: A new fuzzy support vector machine algorithm with dual membership values based on the spectral clustering method is proposed to overcome a shortcoming of the normal support vector machine algorithm, which divides the training datasets into two absolutely exclusive classes in binary classification and ignores the possibility of an "overlapping" region between the two training classes. The proposed method handles sample "overlap" efficiently with spectral clustering, mitigating over-fitting and greatly improving data mining efficiency. Simulation provides clear evidence in support of the new method.
Abstract: A simple and fast approach based on an eigenvalue similarity metric for polarimetric SAR image segmentation of land cover is proposed in this paper. The approach uses the eigenvalues of the coherency matrix to construct the similarity metric of the clustering algorithm used to segment the SAR image. The Mahalanobis distance is used to measure pairwise similarity between pixels, which avoids the manual scale parameter tuning required in previous spectral clustering methods. Furthermore, spatial coherence constraints and a spectral clustering ensemble are employed to stabilize and improve the segmentation performance. All experiments are carried out on three sets of polarimetric SAR data. The experimental results show that the proposed method is superior to the other methods compared.
Funding: The National Natural Science Foundation of China (Grant No. 61762031); Guangxi Key Research and Development Plan (Gui Science AB17195029, Gui Science AB18126006); Guangxi Key Laboratory Fund of Embedded Technology and Intelligent System; 2017 Innovation Project of Guangxi Graduate Education (No. YCSW2017156); 2018 Innovation Project of Guangxi Graduate Education (No. YCSW2018157); Subsidies for the Project of Promoting the Ability of Young and Middle-aged Scientific Research in Universities and Colleges of Guangxi (KY2016YB184); 2016 Guilin Science and Technology Project (Gui Science 2016010202).
Abstract: With the development of e-commerce, the collaborative filtering (CF) recommendation algorithm has become popular in recent years. It has limitations such as cold start, data sparseness and low operating efficiency. In this paper, a CF recommendation algorithm based on the latent factor model and improved spectral clustering (CFRALFMISC) is proposed to improve the forecasting precision. The latent factor model is first adopted to predict the missing scores. Then, a cluster validity index is used to determine the number of clusters. Finally, spectral clustering is improved by using the FCM algorithm in place of K-means. The simulation results show that CFRALFMISC can effectively improve the recommendation precision compared with other algorithms.
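The last step of this abstract, swapping K-means for fuzzy c-means (FCM) when clustering the spectral embedding, can be sketched with a plain FCM routine. This is a generic textbook-style sketch, not the CFRALFMISC code; the fuzzifier `m=2.0`, the iteration count, the random initialization, and the function name are assumptions.

```python
import numpy as np

def fuzzy_c_means(X, n_clusters, m=2.0, n_iter=100, seed=0):
    """Return (memberships, centers) for the rows of X.

    Unlike K-means, each row receives a degree of membership in every
    cluster; each row of the membership matrix sums to 1.
    """
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    # Random initial memberships, normalized per sample.
    U = rng.random((len(X), n_clusters))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        Um = U ** m
        # Centers are membership-weighted means of the samples.
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        dist = np.maximum(dist, 1e-12)  # avoid division by zero
        # Standard FCM update: u_ik proportional to d_ik^(-2/(m-1)).
        inv = dist ** (-2.0 / (m - 1.0))
        U = inv / inv.sum(axis=1, keepdims=True)
    return U, centers

if __name__ == "__main__":
    # Stand-in for rows of a spectral embedding (hypothetical data).
    emb = [[0, 0], [0.1, 0], [0, 0.1], [4, 4], [4.1, 4], [4, 4.1]]
    U, centers = fuzzy_c_means(emb, n_clusters=2)
    print(U.argmax(axis=1))
```

In a spectral clustering pipeline, `X` would be the matrix whose rows are the leading eigenvector coordinates of each sample; a hard assignment can be recovered with `U.argmax(axis=1)`.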
Funding: The National Key Technologies R&D Program during the 11th Five-Year Plan Period (No. 2009BAG13A04); the Ph.D. Programs Foundation of Ministry of Education of China (No. 200802861061); the Transportation Science Research Project of Jiangsu Province (No. 08X09)
Abstract: To improve on the accuracy and reliability of driving fatigue detection based on a single feature, a new detection algorithm based on multiple features is proposed. Two direct facial features of the driver that reflect fatigue and one indirect vehicle behavior feature that indicates fatigue are considered. A T-S fuzzy neural network (TSFNN) is adopted to recognize driver fatigue. For the structure identification of the TSFNN, subtractive clustering (SC) is used to determine the fuzzy rules and their associated parameters. Moreover, the particle swarm optimization (PSO) algorithm is improved to train the TSFNN. Simulation results and experiments on vehicles show that the proposed algorithm can effectively improve the convergence speed and the recognition accuracy of the TSFNN, as well as raise the correct detection rate of driving fatigue.
Abstract: Feature selection is very important for obtaining meaningful and interpretable results from a clustering analysis. In the application of soil data clustering, there is a lack of good understanding of how clustering performance responds to different feature subsets. In the present paper, we analyzed the performance differences between the k-means, fuzzy c-means, and spectral clustering algorithms under different feature subsets of soil data sets. The experimental results demonstrated that the spectral clustering algorithm generally performed better than k-means and fuzzy c-means across the different feature subsets. The feature subsets containing environmental attributes improved clustering performance more than those containing spatial attributes and produced more accurate and meaningful clustering results. Our results demonstrated that combining the spectral clustering algorithm with feature subsets containing environmental attributes rather than spatial attributes may be a better choice in applications of soil data clustering.
Funding: Supported by the NNSFC (21473030, 1371033) and the Natural Science Foundation of Fujian Province (2013J01042)
Abstract: In this paper, a novel polyoxovanadium borate, $\mathrm{Cd_{0.75}Na_2Ni_{0.5}[V_{12}B_{18}O_{50.5}(OH)_{9.5}]\cdot 27.5H_2O}$ (compound 1), was synthesized by a hydrothermal method. The compound consists of metal M (Cd, Na, Ni) and $\mathrm{V_{12}B_{18}O_{60}}$ cluster units connected through M-O bonds to form a three-dimensional structure. We performed a series of characterizations of compound 1 and tested its fluorescence properties. The luminescence investigations show that compound 1 displays interesting luminescence properties and exhibits good potential as a luminescent multi-responsive sensing material for $\mathrm{Fe^{3+}}$ ions.
Funding: Supported by the National Natural Science Foundation of China under Grants No. 61100205 and No. 60873001; the Hi-Tech Research and Development Program of China under Grant No. 2011AA010705; the Fundamental Research Funds for the Central Universities under Grant No. 2009RC0212
Abstract: Webpage classification differs from traditional text classification in its irregular words and phrases and its massive, unlabeled features, which makes it harder to obtain effective features. To cope with this problem, we propose two scenarios for extracting meaningful strings, based on document clustering and on term clustering, with multiple strategies to optimize a Vector Space Model (VSM) in order to improve webpage classification. The results show that document clustering works better than term clustering in coping with document content. However, better overall performance is obtained by spectral clustering combined with document clustering. Moreover, because images coexist with document content in the same webpage, the proposed method is also applied to extract meaningful terms for images, and the experimental results also show its effectiveness in improving webpage classification.
Abstract: Wireless Multimedia Sensor Networks (WMSNs) consist of small embedded audio/video motes capable of extracting information from the surrounding environment, processing it locally and then transmitting it wirelessly to a sink/base station. Multimedia data such as images, audio and video are larger in volume than scalar data such as temperature, pressure and humidity. Transmitting multimedia information therefore requires more energy, which reduces the lifetime of the network. The limitation of battery energy is a crucial problem in WMSNs that needs to be addressed to prolong the lifetime of the network. In this paper, we present a clustering approach based on Spectral Graph Partitioning (SGP) for WMSNs that increases the lifetime of the network. Efficient strategies for cluster head selection and rotation are also proposed as part of the clustering approach. Simulation results show that our strategy is better than existing strategies.
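Abstract: As China vigorously advances supply-side structural reform in the energy sector, the installed capacity of new energy keeps rising and competition in the electricity market is intensifying. Meanwhile, the complexity and volatility of the global coal market have driven up costs for power generation companies that rely on coal as their energy source. The calorific value of coal is one of the key criteria for evaluating coal quality and the most important basis for coal procurement, so predicting it accurately can effectively control a power plant's operating and procurement costs. To predict the coal calorific value efficiently, the Pearson coefficient is used to select features among the relevant variables, and the Density-Based Spatial Clustering of Applications with Noise (DBSCAN) algorithm is used to denoise 1733 laboratory assay records collected over nearly two years from a power plant's own coal yard. Spectral clustering (SC) analysis is then performed on the denoised data. For the sub-sample sets obtained from the clustering, prediction models are built separately with the Extreme Gradient Boosting (XGBoost) algorithm and compared in performance with Ordinary Least Squares (OLS) regression and Support Vector Machine (SVM) models. The results show that the XGBoost-based model for predicting the calorific value of coal in power stations is markedly more accurate than the other algorithms and generalizes better. Building separate prediction models for the coal classes produced by the SC algorithm further refines the models, providing a reliable and efficient method for predicting calorific value in coal-fired power stations.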
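The first step of the pipeline described in this abstract, feature selection with the Pearson coefficient, can be sketched as below. The abstract does not state the selection rule, so the absolute-correlation `threshold`, the function name, and the toy data are assumptions made for illustration; the later DBSCAN, spectral clustering and XGBoost stages are not reproduced here.

```python
import numpy as np

def select_by_pearson(X, y, threshold=0.5):
    """Return (kept column indices, Pearson r of each column with y).

    A column is kept when |r| >= threshold (assumed rule). Columns with
    zero variance would divide by zero and should be removed beforehand.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    Xc = X - X.mean(axis=0)  # center each feature
    yc = y - y.mean()        # center the target
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    r = (Xc * yc[:, None]).sum(axis=0) / denom
    keep = np.flatnonzero(np.abs(r) >= threshold)
    return keep, r

if __name__ == "__main__":
    # Hypothetical data: column 0 tracks the target, column 1 does not.
    X = [[1, 2], [2, 1], [3, 2], [4, 1], [5, 2]]
    y = [3, 5, 7, 9, 11]
    keep, r = select_by_pearson(X, y)
    print(keep, r)
```

The Pearson coefficient only captures linear association, so a feature with a strong nonlinear relation to the target could be discarded by this rule.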