Clustering is a crucial method for deciphering data structure and producing new information.Due to its significance in revealing fundamental connections between the human brain and events,it is essential to utilize cl...Clustering is a crucial method for deciphering data structure and producing new information.Due to its significance in revealing fundamental connections between the human brain and events,it is essential to utilize clustering for cognitive research.Dealing with noisy data caused by inaccurate synthesis from several sources or misleading data production processes is one of the most intriguing clustering difficulties.Noisy data can lead to incorrect object recognition and inference.This research aims to innovate a novel clustering approach,named Picture-Neutrosophic Trusted Safe Semi-Supervised Fuzzy Clustering(PNTS3FCM),to solve the clustering problem with noisy data using neutral and refusal degrees in the definition of Picture Fuzzy Set(PFS)and Neutrosophic Set(NS).Our contribution is to propose a new optimization model with four essential components:clustering,outlier removal,safe semi-supervised fuzzy clustering and partitioning with labeled and unlabeled data.The effectiveness and flexibility of the proposed technique are estimated and compared with the state-of-art methods,standard Picture fuzzy clustering(FC-PFS)and Confidence-weighted safe semi-supervised clustering(CS3FCM)on benchmark UCI datasets.The experimental results show that our method is better at least 10/15 datasets than the compared methods in terms of clustering quality and computational time.展开更多
In this paper,we introduce a novel Multi-scale and Auto-tuned Semi-supervised Deep Subspace Clustering(MAS-DSC)algorithm,aimed at addressing the challenges of deep subspace clustering in high-dimensional real-world da...In this paper,we introduce a novel Multi-scale and Auto-tuned Semi-supervised Deep Subspace Clustering(MAS-DSC)algorithm,aimed at addressing the challenges of deep subspace clustering in high-dimensional real-world data,particularly in the field of medical imaging.Traditional deep subspace clustering algorithms,which are mostly unsupervised,are limited in their ability to effectively utilize the inherent prior knowledge in medical images.Our MAS-DSC algorithm incorporates a semi-supervised learning framework that uses a small amount of labeled data to guide the clustering process,thereby enhancing the discriminative power of the feature representations.Additionally,the multi-scale feature extraction mechanism is designed to adapt to the complexity of medical imaging data,resulting in more accurate clustering performance.To address the difficulty of hyperparameter selection in deep subspace clustering,this paper employs a Bayesian optimization algorithm for adaptive tuning of hyperparameters related to subspace clustering,prior knowledge constraints,and model loss weights.Extensive experiments on standard clustering datasets,including ORL,Coil20,and Coil100,validate the effectiveness of the MAS-DSC algorithm.The results show that with its multi-scale network structure and Bayesian hyperparameter optimization,MAS-DSC achieves excellent clustering results on these datasets.Furthermore,tests on a brain tumor dataset demonstrate the robustness of the algorithm and its ability to leverage prior knowledge for efficient feature extraction and enhanced clustering performance within a semi-supervised learning framework.展开更多
To solve the problems of a few optical fibre line fault samples and the inefficiency of manual communication optical fibre fault diagnosis,this paper proposes a communication optical fibre fault diagnosis model based ...To solve the problems of a few optical fibre line fault samples and the inefficiency of manual communication optical fibre fault diagnosis,this paper proposes a communication optical fibre fault diagnosis model based on variational modal decomposition(VMD),fuzzy entropy(FE)and fuzzy clustering(FC).Firstly,based on the OTDR curve data collected in the field,VMD is used to extract the different modal components(IMF)of the original signal and calculate the fuzzy entropy(FE)values of different components to characterize the subtle differences between them.The fuzzy entropy of each curve is used as the feature vector,which in turn constructs the communication optical fibre feature vector matrix,and the fuzzy clustering algorithm is used to achieve fault diagnosis of faulty optical fibre.The VMD-FE combination can extract subtle differences in features,and the fuzzy clustering algorithm does not require sample training.The experimental results show that the model in this paper has high accuracy and is relevant to the maintenance of communication optical fibre when compared with existing feature extraction models and traditional machine learning models.展开更多
Wireless sensor networks(WSN)gather information and sense information samples in a certain region and communicate these readings to a base station(BS).Energy efficiency is considered a major design issue in the WSNs,a...Wireless sensor networks(WSN)gather information and sense information samples in a certain region and communicate these readings to a base station(BS).Energy efficiency is considered a major design issue in the WSNs,and can be addressed using clustering and routing techniques.Information is sent from the source to the BS via routing procedures.However,these routing protocols must ensure that packets are delivered securely,guaranteeing that neither adversaries nor unauthentic individuals have access to the sent information.Secure data transfer is intended to protect the data from illegal access,damage,or disruption.Thus,in the proposed model,secure data transmission is developed in an energy-effective manner.A low-energy adaptive clustering hierarchy(LEACH)is developed to efficiently transfer the data.For the intrusion detection systems(IDS),Fuzzy logic and artificial neural networks(ANNs)are proposed.Initially,the nodes were randomly placed in the network and initialized to gather information.To ensure fair energy dissipation between the nodes,LEACH randomly chooses cluster heads(CHs)and allocates this role to the various nodes based on a round-robin management mechanism.The intrusion-detection procedure was then utilized to determine whether intruders were present in the network.Within the WSN,a Fuzzy interference rule was utilized to distinguish the malicious nodes from legal nodes.Subsequently,an ANN was employed to distinguish the harmful nodes from suspicious nodes.The effectiveness of the proposed approach was validated using metrics that attained 97%accuracy,97%specificity,and 97%sensitivity of 95%.Thus,it was proved that the LEACH and Fuzzy-based IDS approaches are the best choices for securing data transmission in an energy-efficient manner.展开更多
Semi-supervised clustering improves learning performance as long as it uses a small number of labeled samples to assist un-tagged samples for learning.This paper implements and compares unsupervised and semi-supervise...Semi-supervised clustering improves learning performance as long as it uses a small number of labeled samples to assist un-tagged samples for learning.This paper implements and compares unsupervised and semi-supervised clustering analysis of BOA-Argo ocean text data.Unsupervised K-Means and Affinity Propagation(AP)are two classical clustering algorithms.The Election-AP algorithm is proposed to handle the final cluster number in AP clustering as it has proved to be difficult to control in a suitable range.Semi-supervised samples thermocline data in the BOA-Argo dataset according to the thermocline standard definition,and use this data for semi-supervised cluster analysis.Several semi-supervised clustering algorithms were chosen for comparison of learning performance:Constrained-K-Means,Seeded-K-Means,SAP(Semi-supervised Affinity Propagation),LSAP(Loose Seed AP)and CSAP(Compact Seed AP).In order to adapt the single label,this paper improves the above algorithms to SCKM(improved Constrained-K-Means),SSKM(improved Seeded-K-Means),and SSAP(improved Semi-supervised Affinity Propagationg)to perform semi-supervised clustering analysis on the data.A DSAP(Double Seed AP)semi-supervised clustering algorithm based on compact seeds is proposed as the experimental data shows that DSAP has a better clustering effect.The unsupervised and semi-supervised clustering results are used to analyze the potential patterns of marine data.展开更多
Fuzzy C-means (FCM) is simple and widely used for complex data pattern recognition and image analyses. However, selecting an appropriate fuzzifier (m) is crucial in identifying an optimal number of patterns and achiev...Fuzzy C-means (FCM) is simple and widely used for complex data pattern recognition and image analyses. However, selecting an appropriate fuzzifier (m) is crucial in identifying an optimal number of patterns and achieving higher clustering accuracy, which few studies have investigated. Built upon two existing methods on selecting fuzzifier, we developed an integrated fuzzifier evaluation and selection algorithm and tested it using real datasets. Our findings indicate that the consistent optimal number of clusters can be learnt from testing different fuzzifiers for each dataset and the fuzzifier with the lowest value for this consistency should be selected for clustering. Our evaluation also shows that the fuzzifier impacts the clustering accuracy. For longitudinal data with missing values, m = 2 could be an empirical rule to start fuzzy clustering, and the best clustering accuracy was achieved for tested data, especially using our multiple-imputation based fuzzy clustering.展开更多
With the rapid development of WLAN( Wireless Local Area Network) technology,an important target of indoor positioning systems is to improve the positioning accuracy while reducing the online computation.In this paper,...With the rapid development of WLAN( Wireless Local Area Network) technology,an important target of indoor positioning systems is to improve the positioning accuracy while reducing the online computation.In this paper,it proposes a novel fingerprint positioning algorithm known as semi-supervised affinity propagation clustering based on distance function constraints. We show that by employing affinity propagation techniques,it is able to use a fractional labeled data to adjust similarity matrix of signal space to cluster reference points with high accuracy. The semi-supervised APC uses a combination of machine learning,clustering analysis and fingerprinting algorithm. By collecting data and testing our algorithm in a realistic indoor WLAN environment,the experimental results indicate that the proposed algorithm can improve positioning accuracy while reduce the online localization computation,as compared with the widely used K nearest neighbor and maximum likelihood estimation algorithms.展开更多
Clustering analysis is one of the main concerns in data mining.A common approach to the clustering process is to bring together points that are close to each other and separate points that are away from each other.The...Clustering analysis is one of the main concerns in data mining.A common approach to the clustering process is to bring together points that are close to each other and separate points that are away from each other.Therefore,measuring the distance between sample points is crucial to the effectiveness of clustering.Filtering features by label information and mea-suring the distance between samples by these features is a common supervised learning method to reconstruct distance metric.However,in many application scenarios,it is very expensive to obtain a large number of labeled samples.In this paper,to solve the clustering problem in the few supervised sample and high data dimensionality scenarios,a novel semi-supervised clustering algorithm is proposed by designing an improved prototype network that attempts to reconstruct the distance metric in the sample space with a small amount of pairwise supervised information,such as Must-Link and Cannot-Link,and then cluster the data in the new metric space.The core idea is to make the similar ones closer and the dissimilar ones further away through embedding mapping.Extensive experiments on both real-world and synthetic datasets show the effectiveness of this algorithm.Average clustering metrics on various datasets improved by 8%compared to the comparison algorithm.展开更多
A clustering algorithm for semi-supervised affinity propagation based on layered combination is proposed in this paper in light of existing flaws. To improve accuracy of the algorithm,it introduces the idea of layered...A clustering algorithm for semi-supervised affinity propagation based on layered combination is proposed in this paper in light of existing flaws. To improve accuracy of the algorithm,it introduces the idea of layered combination, divides an affinity propagation clustering( APC) process into several hierarchies evenly,draws samples from data of each hierarchy according to weight,and executes semi-supervised learning through construction of pairwise constraints and use of submanifold label mapping,weighting and combining clustering results of all hierarchies by combined promotion. It is shown by theoretical analysis and experimental result that clustering accuracy and computation complexity of the semi-supervised affinity propagation clustering algorithm based on layered combination( SAP-LC algorithm) have been greatly improved.展开更多
In the face of a growing number of large-scale data sets, affinity propagation clustering algorithm to calculate the process required to build the similarity matrix, will bring huge storage and computation. Therefore,...In the face of a growing number of large-scale data sets, affinity propagation clustering algorithm to calculate the process required to build the similarity matrix, will bring huge storage and computation. Therefore, this paper proposes an improved affinity propagation clustering algorithm. First, add the subtraction clustering, using the density value of the data points to obtain the point of initial clusters. Then, calculate the similarity distance between the initial cluster points, and reference the idea of semi-supervised clustering, adding pairs restriction information, structure sparse similarity matrix. Finally, the cluster representative points conduct AP clustering until a suitable cluster division.Experimental results show that the algorithm allows the calculation is greatly reduced, the similarity matrix storage capacity is also reduced, and better than the original algorithm on the clustering effect and processing speed.展开更多
To improve the accuracy of text clustering, fuzzy c-means clustering based on topic concept sub-space (TCS2FCM) is introduced for classifying texts. Five evaluation functions are combined to extract key phrases. Con...To improve the accuracy of text clustering, fuzzy c-means clustering based on topic concept sub-space (TCS2FCM) is introduced for classifying texts. Five evaluation functions are combined to extract key phrases. Concept phrases, as well as the descriptions of final clusters, are presented using WordNet origin from key phrases. Initial centers and membership matrix are the most important factors affecting clustering performance. Orthogonal concept topic sub-spaces are built with the topic concept phrases representing topics of the texts and the initialization of centers and the membership matrix depend on the concept vectors in sub-spaces. The results show that, different from random initialization of traditional fuzzy c-means clustering, the initialization related to text content contributions can improve clustering precision.展开更多
A novel model of fuzzy clustering, i.e. an allied fuzzy c means (AFCM) model is proposed based on the combination of advantages of fuzzy c means (FCM) and possibilistic c means (PCM) clustering. PCM is sensitive...A novel model of fuzzy clustering, i.e. an allied fuzzy c means (AFCM) model is proposed based on the combination of advantages of fuzzy c means (FCM) and possibilistic c means (PCM) clustering. PCM is sensitive to initializations and often generates coincident clusters. AFCM overcomes this shortcoming and it is an ex tension of PCM. Membership and typicality values can be simultaneously produced in AFCM. Experimental re- suits show that noise data can be well processed, coincident clusters are avoided and clustering accuracy is better.展开更多
Aimed to the characters of pests forecast such as fuzziness, correlation, nonlinear and real-time as well as decline of generalization capacity of neural network in prediction with few observations, a method of pests ...Aimed to the characters of pests forecast such as fuzziness, correlation, nonlinear and real-time as well as decline of generalization capacity of neural network in prediction with few observations, a method of pests forecasting using the method of neural network based on fuzzy clustering was proposed in this experiment. The simulation results demonstrated that the method was simple and practical and could forecast pests fast and accurately, particularly, the method could obtain good results with few samples and samples correlation.展开更多
Due to the limitation and hesitation in one's knowledge, the membership degree of an element to a given set usually has a few different values, in which the conventional fuzzy sets are invalid. Hesitant fuzzy sets ar...Due to the limitation and hesitation in one's knowledge, the membership degree of an element to a given set usually has a few different values, in which the conventional fuzzy sets are invalid. Hesitant fuzzy sets are a powerful tool to treat this case. The present paper focuses on investigating the clustering technique for hesitant fuzzy sets based on the K-means clustering algorithm which takes the results of hierarchical clustering as the initial clusters. Finally, two examples demonstrate the validity of our algorithm.展开更多
Intuitionistic fuzzy sets(IFSs) are useful means to describe and deal with vague and uncertain data.An intuitionistic fuzzy C-means algorithm to cluster IFSs is developed.In each stage of the intuitionistic fuzzy C-me...Intuitionistic fuzzy sets(IFSs) are useful means to describe and deal with vague and uncertain data.An intuitionistic fuzzy C-means algorithm to cluster IFSs is developed.In each stage of the intuitionistic fuzzy C-means method the seeds are modified,and for each IFS a membership degree to each of the clusters is estimated.In the end of the algorithm,all the given IFSs are clustered according to the estimated membership degrees.Furthermore,the algorithm is extended for clustering interval-valued intuitionistic fuzzy sets(IVIFSs).Finally,the developed algorithms are illustrated through conducting experiments on both the real-world and simulated data sets.展开更多
High fidelity analysis are utilized in modern engineering design optimization problems which involve expensive black-box models.For computation-intensive engineering design problems,efficient global optimization metho...High fidelity analysis are utilized in modern engineering design optimization problems which involve expensive black-box models.For computation-intensive engineering design problems,efficient global optimization methods must be developed to relieve the computational burden.A new metamodel-based global optimization method using fuzzy clustering for design space reduction(MGO-FCR) is presented.The uniformly distributed initial sample points are generated by Latin hypercube design to construct the radial basis function metamodel,whose accuracy is improved with increasing number of sample points gradually.Fuzzy c-mean method and Gath-Geva clustering method are applied to divide the design space into several small interesting cluster spaces for low and high dimensional problems respectively.Modeling efficiency and accuracy are directly related to the design space,so unconcerned spaces are eliminated by the proposed reduction principle and two pseudo reduction algorithms.The reduction principle is developed to determine whether the current design space should be reduced and which space is eliminated.The first pseudo reduction algorithm improves the speed of clustering,while the second pseudo reduction algorithm ensures the design space to be reduced.Through several numerical benchmark functions,comparative studies with adaptive response surface method,approximated unimodal region elimination method and mode-pursuing sampling are carried out.The optimization results reveal that this method captures the real global optimum for all the numerical benchmark functions.And the number of function evaluations show that the efficiency of this method is favorable especially for high dimensional problems.Based on this global design optimization method,a design optimization of a lifting surface in high speed flow is carried out and this method saves about 10 h compared with genetic algorithms.This method possesses favorable performance on efficiency,robustness and capability of global convergence and gives a new optimization strategy for engineering design optimization problems involving expensive black box models.展开更多
Based on Multi-Masking Empirical Mode Decomposition (MMEMD) and fuzzy c-means (FCM) clustering, a new method of wind turbine bearing fault diagnosis FCM-MMEMD is proposed, which can determine the fault accurately and ...Based on Multi-Masking Empirical Mode Decomposition (MMEMD) and fuzzy c-means (FCM) clustering, a new method of wind turbine bearing fault diagnosis FCM-MMEMD is proposed, which can determine the fault accurately and timely. First, FCM clustering is employed to classify the data into different clusters, which helps to estimate whether there is a fault and how many fault types there are. If fault signals exist, the fault vibration signals are then demodulated and decomposed into different frequency bands by MMEMD in order to be analyzed further. In order to overcome the mode mixing defect of empirical mode decomposition (EMD), a novel method called MMEMD is proposed. It is an improvement to masking empirical mode decomposition (MEMD). By adding multi-masking signals to the signals to be decomposed in different levels, it can restrain low-frequency components from mixing in highfrequency components effectively in the sifting process and then suppress the mode mixing. It has the advantages of easy implementation and strong ability of suppressing modal mixing. The fault type is determined by Hilbert envelope finally. The results of simulation signal decomposition showed the high performance of MMEMD. Experiments of bearing fault diagnosis in wind turbine bearing fault diagnosis proved the validity and high accuracy of the new method.展开更多
Reduced order models(ROMs) based on the snapshots on the CFD high-fidelity simulations have been paid great attention recently due to their capability of capturing the features of the complex geometries and flow con...Reduced order models(ROMs) based on the snapshots on the CFD high-fidelity simulations have been paid great attention recently due to their capability of capturing the features of the complex geometries and flow configurations. To improve the efficiency and precision of the ROMs, it is indispensable to add extra sampling points to the initial snapshots, since the number of sampling points to achieve an adequately accurate ROM is generally unknown in prior, but a large number of initial sampling points reduces the parsimony of the ROMs. A fuzzy-clustering-based adding-point strategy is proposed and the fuzzy clustering acts an indicator of the region in which the precision of ROMs is relatively low. The proposed method is applied to construct the ROMs for the benchmark mathematical examples and a numerical example of hypersonic aerothermodynamics prediction for a typical control surface. The proposed method can achieve a 34.5% improvement on the efficiency than the estimated mean squared error prediction algorithm and shows same-level prediction accuracy.展开更多
High fidelity analysis models,which are beneficial to improving the design quality,have been more and more widely utilized in the modern engineering design optimization problems.However,the high fidelity analysis mode...High fidelity analysis models,which are beneficial to improving the design quality,have been more and more widely utilized in the modern engineering design optimization problems.However,the high fidelity analysis models are so computationally expensive that the time required in design optimization is usually unacceptable.In order to improve the efficiency of optimization involving high fidelity analysis models,the optimization efficiency can be upgraded through applying surrogates to approximate the computationally expensive models,which can greately reduce the computation time.An efficient heuristic global optimization method using adaptive radial basis function(RBF) based on fuzzy clustering(ARFC) is proposed.In this method,a novel algorithm of maximin Latin hypercube design using successive local enumeration(SLE) is employed to obtain sample points with good performance in both space-filling and projective uniformity properties,which does a great deal of good to metamodels accuracy.RBF method is adopted for constructing the metamodels,and with the increasing the number of sample points the approximation accuracy of RBF is gradually enhanced.The fuzzy c-means clustering method is applied to identify the reduced attractive regions in the original design space.The numerical benchmark examples are used for validating the performance of ARFC.The results demonstrates that for most application examples the global optima are effectively obtained and comparison with adaptive response surface method(ARSM) proves that the proposed method can intuitively capture promising design regions and can efficiently identify the global or near-global design optimum.This method improves the efficiency and global convergence of the optimization problems,and gives a new optimization strategy for engineering design optimization problems involving computationally expensive models.展开更多
Intuitionistic fuzzy set (IFS) is a set of 2-tuple arguments, each of which is characterized by a membership degree and a nonmembership degree. The generalized form of IFS is interval-valued intuitionistic fuzzy set...Intuitionistic fuzzy set (IFS) is a set of 2-tuple arguments, each of which is characterized by a membership degree and a nonmembership degree. The generalized form of IFS is interval-valued intuitionistic fuzzy set (IVIFS), whose components are intervals rather than exact numbers. IFSs and IVIFSs have been found to be very useful to describe vagueness and uncertainty. However, it seems that little attention has been focused on the clustering analysis of IFSs and IVIFSs. An intuitionistic fuzzy hierarchical algorithm is introduced for clustering IFSs, which is based on the traditional hierarchical clustering procedure, the intuitionistic fuzzy aggregation operator, and the basic distance measures between IFSs: the Hamming distance, normalized Hamming, weighted Hamming, the Euclidean distance, the normalized Euclidean distance, and the weighted Euclidean distance. Subsequently, the algorithm is extended for clustering IVIFSs. Finally the algorithm and its extended form are applied to the classifications of building materials and enterprises respectively.展开更多
基金This research is funded by Graduate University of Science and Technology under grant number GUST.STS.DT2020-TT01。
文摘Clustering is a crucial method for deciphering data structure and producing new information.Due to its significance in revealing fundamental connections between the human brain and events,it is essential to utilize clustering for cognitive research.Dealing with noisy data caused by inaccurate synthesis from several sources or misleading data production processes is one of the most intriguing clustering difficulties.Noisy data can lead to incorrect object recognition and inference.This research aims to innovate a novel clustering approach,named Picture-Neutrosophic Trusted Safe Semi-Supervised Fuzzy Clustering(PNTS3FCM),to solve the clustering problem with noisy data using neutral and refusal degrees in the definition of Picture Fuzzy Set(PFS)and Neutrosophic Set(NS).Our contribution is to propose a new optimization model with four essential components:clustering,outlier removal,safe semi-supervised fuzzy clustering and partitioning with labeled and unlabeled data.The effectiveness and flexibility of the proposed technique are estimated and compared with the state-of-art methods,standard Picture fuzzy clustering(FC-PFS)and Confidence-weighted safe semi-supervised clustering(CS3FCM)on benchmark UCI datasets.The experimental results show that our method is better at least 10/15 datasets than the compared methods in terms of clustering quality and computational time.
基金supported in part by the National Natural Science Foundation of China under Grant 62171203in part by the Jiangsu Province“333 Project”High-Level Talent Cultivation Subsidized Project+2 种基金in part by the SuzhouKey Supporting Subjects for Health Informatics under Grant SZFCXK202147in part by the Changshu Science and Technology Program under Grants CS202015 and CS202246in part by Changshu Key Laboratory of Medical Artificial Intelligence and Big Data under Grants CYZ202301 and CS202314.
文摘In this paper,we introduce a novel Multi-scale and Auto-tuned Semi-supervised Deep Subspace Clustering(MAS-DSC)algorithm,aimed at addressing the challenges of deep subspace clustering in high-dimensional real-world data,particularly in the field of medical imaging.Traditional deep subspace clustering algorithms,which are mostly unsupervised,are limited in their ability to effectively utilize the inherent prior knowledge in medical images.Our MAS-DSC algorithm incorporates a semi-supervised learning framework that uses a small amount of labeled data to guide the clustering process,thereby enhancing the discriminative power of the feature representations.Additionally,the multi-scale feature extraction mechanism is designed to adapt to the complexity of medical imaging data,resulting in more accurate clustering performance.To address the difficulty of hyperparameter selection in deep subspace clustering,this paper employs a Bayesian optimization algorithm for adaptive tuning of hyperparameters related to subspace clustering,prior knowledge constraints,and model loss weights.Extensive experiments on standard clustering datasets,including ORL,Coil20,and Coil100,validate the effectiveness of the MAS-DSC algorithm.The results show that with its multi-scale network structure and Bayesian hyperparameter optimization,MAS-DSC achieves excellent clustering results on these datasets.Furthermore,tests on a brain tumor dataset demonstrate the robustness of the algorithm and its ability to leverage prior knowledge for efficient feature extraction and enhanced clustering performance within a semi-supervised learning framework.
基金This paper is supported by State Grid Gansu Electric Power Company Science and Technology Project(20220515003).
文摘To solve the problems of a few optical fibre line fault samples and the inefficiency of manual communication optical fibre fault diagnosis,this paper proposes a communication optical fibre fault diagnosis model based on variational modal decomposition(VMD),fuzzy entropy(FE)and fuzzy clustering(FC).Firstly,based on the OTDR curve data collected in the field,VMD is used to extract the different modal components(IMF)of the original signal and calculate the fuzzy entropy(FE)values of different components to characterize the subtle differences between them.The fuzzy entropy of each curve is used as the feature vector,which in turn constructs the communication optical fibre feature vector matrix,and the fuzzy clustering algorithm is used to achieve fault diagnosis of faulty optical fibre.The VMD-FE combination can extract subtle differences in features,and the fuzzy clustering algorithm does not require sample training.The experimental results show that the model in this paper has high accuracy and is relevant to the maintenance of communication optical fibre when compared with existing feature extraction models and traditional machine learning models.
文摘Wireless sensor networks(WSN)gather information and sense information samples in a certain region and communicate these readings to a base station(BS).Energy efficiency is considered a major design issue in the WSNs,and can be addressed using clustering and routing techniques.Information is sent from the source to the BS via routing procedures.However,these routing protocols must ensure that packets are delivered securely,guaranteeing that neither adversaries nor unauthentic individuals have access to the sent information.Secure data transfer is intended to protect the data from illegal access,damage,or disruption.Thus,in the proposed model,secure data transmission is developed in an energy-effective manner.A low-energy adaptive clustering hierarchy(LEACH)is developed to efficiently transfer the data.For the intrusion detection systems(IDS),Fuzzy logic and artificial neural networks(ANNs)are proposed.Initially,the nodes were randomly placed in the network and initialized to gather information.To ensure fair energy dissipation between the nodes,LEACH randomly chooses cluster heads(CHs)and allocates this role to the various nodes based on a round-robin management mechanism.The intrusion-detection procedure was then utilized to determine whether intruders were present in the network.Within the WSN,a Fuzzy interference rule was utilized to distinguish the malicious nodes from legal nodes.Subsequently,an ANN was employed to distinguish the harmful nodes from suspicious nodes.The effectiveness of the proposed approach was validated using metrics that attained 97%accuracy,97%specificity,and 97%sensitivity of 95%.Thus,it was proved that the LEACH and Fuzzy-based IDS approaches are the best choices for securing data transmission in an energy-efficient manner.
基金This work was supported in part by the National Natural Science Foundation of China(51679105,61872160,51809112)“Thirteenth Five Plan”Science and Technology Project of Education Department,Jilin Province(JJKH20200990KJ).
文摘Semi-supervised clustering improves learning performance as long as it uses a small number of labeled samples to assist un-tagged samples for learning.This paper implements and compares unsupervised and semi-supervised clustering analysis of BOA-Argo ocean text data.Unsupervised K-Means and Affinity Propagation(AP)are two classical clustering algorithms.The Election-AP algorithm is proposed to handle the final cluster number in AP clustering as it has proved to be difficult to control in a suitable range.Semi-supervised samples thermocline data in the BOA-Argo dataset according to the thermocline standard definition,and use this data for semi-supervised cluster analysis.Several semi-supervised clustering algorithms were chosen for comparison of learning performance:Constrained-K-Means,Seeded-K-Means,SAP(Semi-supervised Affinity Propagation),LSAP(Loose Seed AP)and CSAP(Compact Seed AP).In order to adapt the single label,this paper improves the above algorithms to SCKM(improved Constrained-K-Means),SSKM(improved Seeded-K-Means),and SSAP(improved Semi-supervised Affinity Propagationg)to perform semi-supervised clustering analysis on the data.A DSAP(Double Seed AP)semi-supervised clustering algorithm based on compact seeds is proposed as the experimental data shows that DSAP has a better clustering effect.The unsupervised and semi-supervised clustering results are used to analyze the potential patterns of marine data.
文摘Fuzzy C-means (FCM) is simple and widely used for complex data pattern recognition and image analyses. However, selecting an appropriate fuzzifier (m) is crucial in identifying an optimal number of patterns and achieving higher clustering accuracy, which few studies have investigated. Built upon two existing methods on selecting fuzzifier, we developed an integrated fuzzifier evaluation and selection algorithm and tested it using real datasets. Our findings indicate that the consistent optimal number of clusters can be learnt from testing different fuzzifiers for each dataset and the fuzzifier with the lowest value for this consistency should be selected for clustering. Our evaluation also shows that the fuzzifier impacts the clustering accuracy. For longitudinal data with missing values, m = 2 could be an empirical rule to start fuzzy clustering, and the best clustering accuracy was achieved for tested data, especially using our multiple-imputation based fuzzy clustering.
基金Sponsored by the National Natural Science Foundation of China(Grant No.61101122 and 61071105)
文摘With the rapid development of WLAN( Wireless Local Area Network) technology,an important target of indoor positioning systems is to improve the positioning accuracy while reducing the online computation.In this paper,it proposes a novel fingerprint positioning algorithm known as semi-supervised affinity propagation clustering based on distance function constraints. We show that by employing affinity propagation techniques,it is able to use a fractional labeled data to adjust similarity matrix of signal space to cluster reference points with high accuracy. The semi-supervised APC uses a combination of machine learning,clustering analysis and fingerprinting algorithm. By collecting data and testing our algorithm in a realistic indoor WLAN environment,the experimental results indicate that the proposed algorithm can improve positioning accuracy while reduce the online localization computation,as compared with the widely used K nearest neighbor and maximum likelihood estimation algorithms.
文摘Clustering analysis is one of the main concerns in data mining.A common approach to the clustering process is to bring together points that are close to each other and separate points that are away from each other.Therefore,measuring the distance between sample points is crucial to the effectiveness of clustering.Filtering features by label information and mea-suring the distance between samples by these features is a common supervised learning method to reconstruct distance metric.However,in many application scenarios,it is very expensive to obtain a large number of labeled samples.In this paper,to solve the clustering problem in the few supervised sample and high data dimensionality scenarios,a novel semi-supervised clustering algorithm is proposed by designing an improved prototype network that attempts to reconstruct the distance metric in the sample space with a small amount of pairwise supervised information,such as Must-Link and Cannot-Link,and then cluster the data in the new metric space.The core idea is to make the similar ones closer and the dissimilar ones further away through embedding mapping.Extensive experiments on both real-world and synthetic datasets show the effectiveness of this algorithm.Average clustering metrics on various datasets improved by 8%compared to the comparison algorithm.
基金the Science and Technology Research Program of Zhejiang Province,China(No.2011C21036)Projects in Science and Technology of Ningbo Municipal,China(No.2012B82003)+1 种基金Shanghai Natural Science Foundation,China(No.10ZR1400100)the National Undergraduate Training Programs for Innovation and Entrepreneurship,China(No.201410876011)
文摘A clustering algorithm for semi-supervised affinity propagation based on layered combination is proposed in this paper in light of existing flaws. To improve accuracy of the algorithm,it introduces the idea of layered combination, divides an affinity propagation clustering( APC) process into several hierarchies evenly,draws samples from data of each hierarchy according to weight,and executes semi-supervised learning through construction of pairwise constraints and use of submanifold label mapping,weighting and combining clustering results of all hierarchies by combined promotion. It is shown by theoretical analysis and experimental result that clustering accuracy and computation complexity of the semi-supervised affinity propagation clustering algorithm based on layered combination( SAP-LC algorithm) have been greatly improved.
基金This research has been partially supported by the national natural science foundation of China (51175169) and the national science and technology support program (2012BAF02B01).
文摘In the face of a growing number of large-scale data sets, affinity propagation clustering algorithm to calculate the process required to build the similarity matrix, will bring huge storage and computation. Therefore, this paper proposes an improved affinity propagation clustering algorithm. First, add the subtraction clustering, using the density value of the data points to obtain the point of initial clusters. Then, calculate the similarity distance between the initial cluster points, and reference the idea of semi-supervised clustering, adding pairs restriction information, structure sparse similarity matrix. Finally, the cluster representative points conduct AP clustering until a suitable cluster division.Experimental results show that the algorithm allows the calculation is greatly reduced, the similarity matrix storage capacity is also reduced, and better than the original algorithm on the clustering effect and processing speed.
基金The National Natural Science Foundation of China(No60672056)Open Fund of MOE-MS Key Laboratory of Multime-dia Computing and Communication(No06120809)
文摘To improve the accuracy of text clustering, fuzzy c-means clustering based on topic concept sub-space (TCS2FCM) is introduced for classifying texts. Five evaluation functions are combined to extract key phrases. Concept phrases, as well as the descriptions of final clusters, are presented using WordNet origin from key phrases. Initial centers and membership matrix are the most important factors affecting clustering performance. Orthogonal concept topic sub-spaces are built with the topic concept phrases representing topics of the texts and the initialization of centers and the membership matrix depend on the concept vectors in sub-spaces. The results show that, different from random initialization of traditional fuzzy c-means clustering, the initialization related to text content contributions can improve clustering precision.
文摘A novel model of fuzzy clustering, i.e. an allied fuzzy c means (AFCM) model is proposed based on the combination of advantages of fuzzy c means (FCM) and possibilistic c means (PCM) clustering. PCM is sensitive to initializations and often generates coincident clusters. AFCM overcomes this shortcoming and it is an ex tension of PCM. Membership and typicality values can be simultaneously produced in AFCM. Experimental re- suits show that noise data can be well processed, coincident clusters are avoided and clustering accuracy is better.
基金Supported by Guangxi Science Research and Technology Explora-tion Plan Project(0815001-10)~~
文摘Aimed to the characters of pests forecast such as fuzziness, correlation, nonlinear and real-time as well as decline of generalization capacity of neural network in prediction with few observations, a method of pests forecasting using the method of neural network based on fuzzy clustering was proposed in this experiment. The simulation results demonstrated that the method was simple and practical and could forecast pests fast and accurately, particularly, the method could obtain good results with few samples and samples correlation.
基金Supported by the National Natural Science Foundation of China(61273209)
文摘Due to the limitation and hesitation in one's knowledge, the membership degree of an element to a given set usually has a few different values, in which the conventional fuzzy sets are invalid. Hesitant fuzzy sets are a powerful tool to treat this case. The present paper focuses on investigating the clustering technique for hesitant fuzzy sets based on the K-means clustering algorithm which takes the results of hierarchical clustering as the initial clusters. Finally, two examples demonstrate the validity of our algorithm.
基金supported by the National Natural Science Foundation of China for Distinguished Young Scholars(70625005)
文摘Intuitionistic fuzzy sets(IFSs) are useful means to describe and deal with vague and uncertain data.An intuitionistic fuzzy C-means algorithm to cluster IFSs is developed.In each stage of the intuitionistic fuzzy C-means method the seeds are modified,and for each IFS a membership degree to each of the clusters is estimated.In the end of the algorithm,all the given IFSs are clustered according to the estimated membership degrees.Furthermore,the algorithm is extended for clustering interval-valued intuitionistic fuzzy sets(IVIFSs).Finally,the developed algorithms are illustrated through conducting experiments on both the real-world and simulated data sets.
基金supported by National Natural Science Foundation of China(Grant No.51105040)Aeronautic Science Foundation of China(Grant No.2011ZA72003)Excellent Young Scholars Research Fund of Beijing Institute of Technology(Grant No.2010Y0102)
文摘High fidelity analysis are utilized in modern engineering design optimization problems which involve expensive black-box models.For computation-intensive engineering design problems,efficient global optimization methods must be developed to relieve the computational burden.A new metamodel-based global optimization method using fuzzy clustering for design space reduction(MGO-FCR) is presented.The uniformly distributed initial sample points are generated by Latin hypercube design to construct the radial basis function metamodel,whose accuracy is improved with increasing number of sample points gradually.Fuzzy c-mean method and Gath-Geva clustering method are applied to divide the design space into several small interesting cluster spaces for low and high dimensional problems respectively.Modeling efficiency and accuracy are directly related to the design space,so unconcerned spaces are eliminated by the proposed reduction principle and two pseudo reduction algorithms.The reduction principle is developed to determine whether the current design space should be reduced and which space is eliminated.The first pseudo reduction algorithm improves the speed of clustering,while the second pseudo reduction algorithm ensures the design space to be reduced.Through several numerical benchmark functions,comparative studies with adaptive response surface method,approximated unimodal region elimination method and mode-pursuing sampling are carried out.The optimization results reveal that this method captures the real global optimum for all the numerical benchmark functions.And the number of function evaluations show that the efficiency of this method is favorable especially for high dimensional problems.Based on this global design optimization method,a design optimization of a lifting surface in high speed flow is carried out and this method saves about 10 h compared with genetic algorithms.This method possesses favorable performance on efficiency,robustness and capability of global convergence and gives a new optimization strategy for engineering design optimization problems involving expensive black box models.
基金Supported by National Key R&D Projects(Grant No.2018YFB0905500)National Natural Science Foundation of China(Grant No.51875498)+1 种基金Hebei Provincial Natural Science Foundation of China(Grant Nos.E2018203439,E2018203339,F2016203496)Key Scientific Research Projects Plan of Henan Higher Education Institutions(Grant No.19B460001)
文摘Based on Multi-Masking Empirical Mode Decomposition (MMEMD) and fuzzy c-means (FCM) clustering, a new method of wind turbine bearing fault diagnosis FCM-MMEMD is proposed, which can determine the fault accurately and timely. First, FCM clustering is employed to classify the data into different clusters, which helps to estimate whether there is a fault and how many fault types there are. If fault signals exist, the fault vibration signals are then demodulated and decomposed into different frequency bands by MMEMD in order to be analyzed further. In order to overcome the mode mixing defect of empirical mode decomposition (EMD), a novel method called MMEMD is proposed. It is an improvement to masking empirical mode decomposition (MEMD). By adding multi-masking signals to the signals to be decomposed in different levels, it can restrain low-frequency components from mixing in highfrequency components effectively in the sifting process and then suppress the mode mixing. It has the advantages of easy implementation and strong ability of suppressing modal mixing. The fault type is determined by Hilbert envelope finally. The results of simulation signal decomposition showed the high performance of MMEMD. Experiments of bearing fault diagnosis in wind turbine bearing fault diagnosis proved the validity and high accuracy of the new method.
基金Supported by National Natural Science Foundation of China(Grant No.11372036)
文摘Reduced order models(ROMs) based on the snapshots on the CFD high-fidelity simulations have been paid great attention recently due to their capability of capturing the features of the complex geometries and flow configurations. To improve the efficiency and precision of the ROMs, it is indispensable to add extra sampling points to the initial snapshots, since the number of sampling points to achieve an adequately accurate ROM is generally unknown in prior, but a large number of initial sampling points reduces the parsimony of the ROMs. A fuzzy-clustering-based adding-point strategy is proposed and the fuzzy clustering acts an indicator of the region in which the precision of ROMs is relatively low. The proposed method is applied to construct the ROMs for the benchmark mathematical examples and a numerical example of hypersonic aerothermodynamics prediction for a typical control surface. The proposed method can achieve a 34.5% improvement on the efficiency than the estimated mean squared error prediction algorithm and shows same-level prediction accuracy.
基金supported by National Natural Science Foundation of China (Grant Nos. 50875024,51105040)Excellent Young Scholars Research Fund of Beijing Institute of Technology,China (Grant No.2010Y0102)Defense Creative Research Group Foundation of China(Grant No. GFTD0803)
文摘High fidelity analysis models,which are beneficial to improving the design quality,have been more and more widely utilized in the modern engineering design optimization problems.However,the high fidelity analysis models are so computationally expensive that the time required in design optimization is usually unacceptable.In order to improve the efficiency of optimization involving high fidelity analysis models,the optimization efficiency can be upgraded through applying surrogates to approximate the computationally expensive models,which can greately reduce the computation time.An efficient heuristic global optimization method using adaptive radial basis function(RBF) based on fuzzy clustering(ARFC) is proposed.In this method,a novel algorithm of maximin Latin hypercube design using successive local enumeration(SLE) is employed to obtain sample points with good performance in both space-filling and projective uniformity properties,which does a great deal of good to metamodels accuracy.RBF method is adopted for constructing the metamodels,and with the increasing the number of sample points the approximation accuracy of RBF is gradually enhanced.The fuzzy c-means clustering method is applied to identify the reduced attractive regions in the original design space.The numerical benchmark examples are used for validating the performance of ARFC.The results demonstrates that for most application examples the global optima are effectively obtained and comparison with adaptive response surface method(ARSM) proves that the proposed method can intuitively capture promising design regions and can efficiently identify the global or near-global design optimum.This method improves the efficiency and global convergence of the optimization problems,and gives a new optimization strategy for engineering design optimization problems involving computationally expensive models.
基金supported by the National Natural Science Foundation of China (70571087)the National Science Fund for Distinguished Young Scholars of China (70625005)
文摘Intuitionistic fuzzy set (IFS) is a set of 2-tuple arguments, each of which is characterized by a membership degree and a nonmembership degree. The generalized form of IFS is interval-valued intuitionistic fuzzy set (IVIFS), whose components are intervals rather than exact numbers. IFSs and IVIFSs have been found to be very useful to describe vagueness and uncertainty. However, it seems that little attention has been focused on the clustering analysis of IFSs and IVIFSs. An intuitionistic fuzzy hierarchical algorithm is introduced for clustering IFSs, which is based on the traditional hierarchical clustering procedure, the intuitionistic fuzzy aggregation operator, and the basic distance measures between IFSs: the Hamming distance, normalized Hamming, weighted Hamming, the Euclidean distance, the normalized Euclidean distance, and the weighted Euclidean distance. Subsequently, the algorithm is extended for clustering IVIFSs. Finally the algorithm and its extended form are applied to the classifications of building materials and enterprises respectively.