In this paper, we introduce a novel Multi-scale and Auto-tuned Semi-supervised Deep Subspace Clustering (MAS-DSC) algorithm, aimed at addressing the challenges of deep subspace clustering in high-dimensional real-world data, particularly in the field of medical imaging. Traditional deep subspace clustering algorithms, which are mostly unsupervised, are limited in their ability to exploit the prior knowledge inherent in medical images. Our MAS-DSC algorithm incorporates a semi-supervised learning framework that uses a small amount of labeled data to guide the clustering process, thereby enhancing the discriminative power of the feature representations. Additionally, the multi-scale feature extraction mechanism is designed to adapt to the complexity of medical imaging data, resulting in more accurate clustering. To address the difficulty of hyperparameter selection in deep subspace clustering, this paper employs a Bayesian optimization algorithm for adaptive tuning of hyperparameters related to subspace clustering, prior knowledge constraints, and model loss weights. Extensive experiments on standard clustering datasets, including ORL, Coil20, and Coil100, validate the effectiveness of the MAS-DSC algorithm. The results show that, with its multi-scale network structure and Bayesian hyperparameter optimization, MAS-DSC achieves excellent clustering results on these datasets. Furthermore, tests on a brain tumor dataset demonstrate the robustness of the algorithm and its ability to leverage prior knowledge for efficient feature extraction and enhanced clustering performance within a semi-supervised learning framework.
Active learning in semi-supervised classification involves introducing additional labels for unlabelled data to improve the accuracy of the underlying classifier. A challenge is to identify which points to label to best improve performance while limiting the number of new labels. "Model Change" active learning quantifies the change incurred in the classifier by introducing the additional label(s). We pair this idea with graph-based semi-supervised learning (SSL) methods that use the spectrum of the graph Laplacian matrix, which can be truncated to avoid prohibitively large computational and storage costs. We consider a family of convex loss functions for which the acquisition function can be efficiently approximated using the Laplace approximation of the posterior distribution. We show a variety of multiclass examples that illustrate improved performance over the prior state of the art.
Offboard active decoys (OADs) can effectively jam monopulse radars. However, for missiles approaching from a particular direction and distance, the OAD must be placed at a specific location, which imposes strict requirements on timing and deployment. To improve the response speed and jamming effect, a cluster of OADs based on an unmanned surface vehicle (USV) is proposed. The formation of the cluster determines the effectiveness of jamming. First, based on the mechanism of OAD jamming, critical conditions are identified, and a method for assessing the jamming effect is proposed. Then, for the optimization of the cluster formation, a mathematical model is built, and a multi-tribe adaptive particle swarm optimization algorithm based on a mutation strategy and the Metropolis criterion (3M-APSO) is designed. Finally, the formation optimization problem is solved and analyzed using the 3M-APSO algorithm under specific scenarios. The results show that the improved algorithm converges faster and performs better than the standard Adaptive-PSO algorithm. Compared with a single OAD, the optimal formation of the USV-OAD cluster effectively fills the blind area and maximizes the use of jamming resources.
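The Metropolis criterion named in the abstract above can be sketched as a small acceptance rule. This is a generic illustration, not the authors' 3M-APSO implementation: the function name `metropolis_accept`, the `temperature` parameter, and the assumption that lower cost is better are all hypothetical choices made for the example.

```python
import math
import random


def metropolis_accept(current_cost, candidate_cost, temperature, rng=random):
    """Decide whether to move to a candidate solution (lower cost is better).

    Assumption: this is the textbook Metropolis rule, not the exact variant
    used in the paper. Improvements are always accepted; a worse candidate
    is accepted with probability exp(-delta / temperature).
    """
    if candidate_cost <= current_cost:
        return True
    if temperature <= 0:
        # With no "thermal" budget left, never accept a worse candidate.
        return False
    delta = candidate_cost - current_cost
    return rng.random() < math.exp(-delta / temperature)
```

In a swarm optimizer, a rule of this kind is typically applied when a particle proposes a new position, so that occasional uphill moves help the search escape local optima while the temperature is still high.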
For the growing number of large-scale data sets, the affinity propagation (AP) clustering algorithm must build a full similarity matrix, which incurs huge storage and computation costs. Therefore, this paper proposes an improved affinity propagation clustering algorithm. First, subtractive clustering is added, using the density values of the data points to obtain the initial cluster points. Then, the similarity distances between the initial cluster points are calculated and, borrowing the idea of semi-supervised clustering, pairwise constraint information is added to construct a sparse similarity matrix. Finally, AP clustering is run on the cluster representative points until a suitable cluster division is reached. Experimental results show that the algorithm greatly reduces the amount of computation and the storage required for the similarity matrix, and outperforms the original algorithm in both clustering quality and processing speed.
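The first step described in the abstract above, picking initial cluster points from density values, can be illustrated with a minimal subtractive-clustering sketch. It follows the commonly cited formulation (a Gaussian density per point, then subtraction of the selected peak's influence); the function name, the `radius` and `stop_ratio` parameters, and the 1.5 ratio between the two radii are assumptions for illustration, not values taken from the paper.

```python
import math


def subtractive_centers(points, radius=1.0, stop_ratio=0.15):
    """Return indices of candidate cluster centers chosen by density.

    Assumptions (not from the paper): Gaussian density with neighborhood
    radius `radius`, a suppression radius 1.5 times larger, and a stop rule
    that ends the search once the best remaining density falls below
    `stop_ratio` times the first peak.
    """
    alpha = 4.0 / radius ** 2           # 1 / (r_a / 2)^2
    beta = 4.0 / (1.5 * radius) ** 2    # 1 / (r_b / 2)^2 with r_b = 1.5 * r_a

    def sq_dist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))

    # Density of each point: high when many other points lie close to it.
    density = [
        sum(math.exp(-alpha * sq_dist(p, q)) for q in points) for p in points
    ]
    first_peak = max(density)
    centers = []
    while True:
        best = max(range(len(points)), key=density.__getitem__)
        peak = density[best]
        if peak < stop_ratio * first_peak:
            break
        centers.append(best)
        # Subtract the chosen center's influence so its neighbors are
        # unlikely to be picked as further centers.
        for i, p in enumerate(points):
            density[i] -= peak * math.exp(-beta * sq_dist(p, points[best]))
    return centers
```

The returned indices would then serve as the reduced set of representative points on which a much smaller similarity matrix is built.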
Clustering is a crucial method for deciphering data structure and producing new information. Because of its significance in revealing fundamental connections between the human brain and events, clustering is essential for cognitive research. Dealing with noisy data, caused by inaccurate synthesis from several sources or by misleading data production processes, is one of the most intriguing clustering difficulties. Noisy data can lead to incorrect object recognition and inference. This research develops a novel clustering approach, named Picture-Neutrosophic Trusted Safe Semi-Supervised Fuzzy Clustering (PNTS3FCM), to solve the clustering problem with noisy data using the neutral and refusal degrees in the definitions of the Picture Fuzzy Set (PFS) and Neutrosophic Set (NS). Our contribution is a new optimization model with four essential components: clustering, outlier removal, safe semi-supervised fuzzy clustering, and partitioning with labeled and unlabeled data. The effectiveness and flexibility of the proposed technique are evaluated and compared with state-of-the-art methods, namely standard Picture fuzzy clustering (FC-PFS) and Confidence-weighted safe semi-supervised clustering (CS3FCM), on benchmark UCI datasets. The experimental results show that our method outperforms the compared methods on at least 10 of 15 datasets in terms of clustering quality and computational time.
Clustering analysis is one of the main concerns in data mining. A common approach to clustering is to bring together points that are close to each other and separate points that are far from each other. Therefore, measuring the distance between sample points is crucial to the effectiveness of clustering. Filtering features by label information and measuring the distance between samples with these features is a common supervised learning method for reconstructing the distance metric. However, in many application scenarios, it is very expensive to obtain a large number of labeled samples. In this paper, to solve the clustering problem in scenarios with few supervised samples and high data dimensionality, a novel semi-supervised clustering algorithm is proposed. It uses an improved prototype network that attempts to reconstruct the distance metric in the sample space with a small amount of pairwise supervised information, such as Must-Link and Cannot-Link constraints, and then clusters the data in the new metric space. The core idea is to make similar samples closer and dissimilar samples further apart through an embedding mapping. Extensive experiments on both real-world and synthetic datasets show the effectiveness of this algorithm. Average clustering metrics on various datasets improved by 8% compared to the comparison algorithm.
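The core idea stated in the abstract above (pull Must-Link pairs together, push Cannot-Link pairs apart in the embedding space) can be made concrete with a small loss function. The abstract does not give the loss actually used, so the contrastive-style form below, the name `pairwise_constraint_loss`, and the `margin` parameter are assumptions chosen for illustration.

```python
import math


def pairwise_constraint_loss(embeddings, must_link, cannot_link, margin=1.0):
    """Average contrastive-style penalty over pairwise constraints.

    Assumption: this is a generic contrastive loss, not the loss from the
    paper. Must-Link pairs are penalized by their squared distance;
    Cannot-Link pairs are penalized only when closer than `margin`.
    """
    def dist(i, j):
        return math.dist(embeddings[i], embeddings[j])

    pull = sum(dist(i, j) ** 2 for i, j in must_link)
    push = sum(max(0.0, margin - dist(i, j)) ** 2 for i, j in cannot_link)
    n_pairs = len(must_link) + len(cannot_link)
    return (pull + push) / n_pairs if n_pairs else 0.0


# Usage: indices 0 and 1 coincide, index 2 is far away.
points = [(0.0, 0.0), (0.0, 0.0), (3.0, 4.0)]
good = pairwise_constraint_loss(points, must_link=[(0, 1)], cannot_link=[(0, 2)])
bad = pairwise_constraint_loss(points, must_link=[(0, 2)], cannot_link=[(0, 1)])
```

An embedding network trained to minimize such a loss yields a metric space in which an ordinary clustering step can then be run; here `good` is zero because both constraints are already satisfied, while `bad` is positive.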
The majority of big data analytics applied to transportation datasets suffer from being too domain-specific; that is, they draw conclusions for a dataset based on analytics on the same dataset. As a result, models trained on one domain (e.g. taxi data) apply poorly to a different domain (e.g. Uber data). To achieve accurate analyses on a new domain, substantial amounts of data must be available, which limits practical applications. To remedy this, we propose to use semi-supervised and active learning of big data to accomplish the domain adaptation task: selectively choosing a small number of datapoints from a new domain while achieving performance comparable to using all the datapoints. We choose the New York City (NYC) transportation data of taxi and Uber as our dataset, simulating different domains with 90% as the source data domain for training and the remaining 10% as the target data domain for evaluation. We propose semi-supervised and active learning strategies and apply them to the source domain for selecting datapoints. Experimental results show that our adaptation achieves performance comparable to using all datapoints while using only a fraction of them, substantially reducing the amount of data required. Our approach has two major advantages: it can make accurate analytics and predictions when big datasets are not available, and, even if big datasets are available, it chooses the most informative datapoints out of the dataset, making the process much more efficient without having to process huge amounts of data.
Microbial population and enzyme activities are significant indicators of soil strength. Soil microbial dynamics characterize microbial population and enzyme activities. The present study explores the development of efficient predictive modeling systems for the estimation of specific soil microbial dynamics, such as rock phosphate solubilization, bacterial population, and ACC-deaminase activity. More specifically, optimized subtractive clustering (SC) and Wang and Mendel's (WM) fuzzy inference systems (FIS) have been implemented with the objective of achieving the best estimation accuracy of microbial dynamics. Experimental measurements were performed in a controlled pot experiment using minimal salt media with rock phosphate as the sole carbon source, inoculated with phosphate-solubilizing microorganisms, in order to estimate the rock phosphate solubilization potential of selected strains. Three experimental parameters (temperature, pH, and incubation period) have been used as inputs to the SC-FIS and WM-FIS. The SC-FIS performed better than the WM-FIS in the estimation of phosphate solubilization and bacterial population, with a maximum coefficient of determination (R^2 = 0.9988) in the estimation of these microbial dynamics.
Semi-supervised clustering improves learning performance by using a small number of labeled samples to assist the learning of unlabeled samples. This paper implements and compares unsupervised and semi-supervised clustering analysis of BOA-Argo ocean text data. Unsupervised K-Means and Affinity Propagation (AP) are two classical clustering algorithms. The Election-AP algorithm is proposed to handle the final cluster number in AP clustering, which has proved difficult to keep within a suitable range. Semi-supervised samples are drawn from the thermocline data in the BOA-Argo dataset according to the standard thermocline definition, and these data are used for semi-supervised cluster analysis. Several semi-supervised clustering algorithms were chosen for comparison of learning performance: Constrained-K-Means, Seeded-K-Means, SAP (Semi-supervised Affinity Propagation), LSAP (Loose Seed AP), and CSAP (Compact Seed AP). To adapt to the single label, this paper improves the above algorithms to SCKM (improved Constrained-K-Means), SSKM (improved Seeded-K-Means), and SSAP (improved Semi-supervised Affinity Propagation) to perform semi-supervised clustering analysis on the data. A DSAP (Double Seed AP) semi-supervised clustering algorithm based on compact seeds is proposed, and the experimental data show that DSAP has a better clustering effect. The unsupervised and semi-supervised clustering results are used to analyze the potential patterns of marine data.
With the rapid development of WLAN (Wireless Local Area Network) technology, an important target of indoor positioning systems is to improve the positioning accuracy while reducing the online computation. In this paper, we propose a novel fingerprint positioning algorithm known as semi-supervised affinity propagation clustering (APC) based on distance function constraints. We show that, by employing affinity propagation techniques, the algorithm can use a small fraction of labeled data to adjust the similarity matrix of the signal space and cluster reference points with high accuracy. The semi-supervised APC combines machine learning, clustering analysis, and a fingerprinting algorithm. Based on data collected and tests run in a realistic indoor WLAN environment, the experimental results indicate that the proposed algorithm can improve positioning accuracy while reducing the online localization computation, compared with the widely used K nearest neighbor and maximum likelihood estimation algorithms.
A clustering algorithm for semi-supervised affinity propagation based on layered combination is proposed in this paper in light of existing flaws. To improve the accuracy of the algorithm, it introduces the idea of layered combination: it divides an affinity propagation clustering (APC) process evenly into several hierarchies, draws samples from the data of each hierarchy according to weight, executes semi-supervised learning through the construction of pairwise constraints and the use of submanifold label mapping, and then weights and combines the clustering results of all hierarchies by combined promotion. Theoretical analysis and experimental results show that the clustering accuracy and computational complexity of the semi-supervised affinity propagation clustering algorithm based on layered combination (SAP-LC algorithm) are greatly improved.
There are rich emergent phase behaviors in non-equilibrium active systems. Flocking and clustering are two representative dynamic phases. The relationship between the two phases is still unclear. Herein, we numerically investigate the evolution of flocking and clustering in a system consisting of self-propelled particles with active reorientation. We consider the interplay between flocking and clustering phases with different initial configurations, and observe a domain in the steady-state order parameter phase diagrams that is sensitive to the choice of initial configurations. Specifically, by tuning the initial degree of polar ordering, either a more ordered flocking state or a disordered clustering state can be observed in the steady state. These results suggest ways to manipulate the emergent behaviors and collective motions of an active system, and are qualitatively different from the emergence of a new bi-stable regime observed in aligned active particles due to an explicit attraction [New J. Phys. 14, 073033 (2012)].
The classification of the Northeast China Cold Vortex (NCCV) activity paths is an important way to analyze its characteristics in detail. Based on the daily precipitation data of the northeastern China (NEC) region, and the six-hourly atmospheric circulation field and temperature field data of ERA-Interim, the NCCV processes during the early summer (June) seasons from 1979 to 2018 were objectively identified. Then, the NCCV processes were classified using a machine learning method (k-means) according to the characteristic parameters of the activity path information. The rationality of the classification results was verified from two aspects: (1) the atmospheric circulation configuration of the NCCV on various paths; and (2) its influences on the climate conditions in the NEC. The results showed that the activity paths of the NCCV could be divided into four types according to characteristics such as the generation origin, movement direction, and movement velocity of the NCCV. These included the generation-eastward movement type in the east of the Mongolia Plateau (eastward movement type, or type A); the generation-southeast long-distance movement type in the upstream of the Lena River (southeast long-distance movement type, or type B); the generation-eastward less-movement type near Lake Baikal (eastward less-movement type, or type C); and the generation-southward less-movement type in eastern Siberia (southward less-movement type, or type D). There were obvious differences in the atmospheric circulation configuration and the climate impact of the NCCV on the four types of paths, which indicated that the classification results were reasonable.
The volcanic cluster in Arshan, Inner Mongolia, is located in the west of the middle section of the Da Hinggan Mountains. There are more than forty Cenozoic volcanoes, among which the Yanshan Volcano and Gaoshan Volcano are active ones in the broad sense, with basaltic central vents. Arshan is a newly found active volcanic region in the Chinese continent. The volcanoes are perfectly preserved and composed of cinder cones, pyroclastic sheets, and lava flows. Their cones are grand: the Gaoshan cone is about 362 m high, and the depth of the Yanshan crater is about 140 m. The pyroclastic sheet is mainly made up of scoria, and the distribution area of scoria with a thickness of more than 1 m is about 27 km^2. There are two carbonized-wood sites in the pyroclastic sheet, and the ^14C datings indicate ages of 1990 ± 100 a B.P. and 1900 ± 70 a B.P., which are rectified by dendrodating. Basaltic lava flows are uncovered, and they change from pahoehoe in the early stage to aa in the later stage. There are many perfect fumarolic cones, fumarolic dishes, and lava tumuli in the front zones. The spread of the lava flow is controlled by the local topography; its main body flowed northwestwards, covering the Holocene river and swamp deposits and blocking the Halahahe River and its branches to create six lava-dammed lakes. For these distinguishing features, the Arshan volcanic cluster could be called another natural "Volcano Museum".
The challenges posed by energy and environmental issues have forced mankind to explore and utilize unconventional energy sources. It is imperative to convert the abundant coalbed gas (CBG) into high value-added products, i.e., to convert methane from CBG selectively and efficiently. Methane activation, known as the "holy grail", poses a challenge to the design and development of catalysts. The structural complexity of the active metal on the carrier is of particular concern. In this work, we have studied the nucleation and growth of small Co clusters (up to Co_6) on the CeO_2(110) surface using density functional theory, from which a stable loaded Co/CeO_2(110) structure was selected to investigate the methane activation mechanism. Despite the relatively small size of the selected Co clusters, the obtained Co_x/CeO_2(110) exhibits interesting properties. The optimized Co_5/CeO_2(110) structure was selected as the optimal structure for studying the activation mechanism of methane because of its competitive electronic structure, adsorption energy, and binding energy. The energy barriers for the stepwise dissociation of methane to form CH_3^*, CH_2^*, CH^*, and C^* radical fragments are 0.44, 0.55, 0.31, and 1.20 eV, respectively, indicating that CH^* dissociative dehydrogenation is the rate-determining step for the system under investigation. This fundamental study of metal-support interactions based on Co growth on the CeO_2(110) surface contributes to the understanding of Co/CeO_2 catalysts with promising catalytic behavior. It provides theoretical guidance for designing the optimal Co/CeO_2 catalyst for tailored catalytic reactions.
In this paper, CiteSpace, a bibliometrics software package, was used to analyze research papers published on the Web of Science that are relevant to biological models and effluent quality prediction in the activated sludge process in wastewater treatment. By means of a trend map, a keyword knowledge map, and a co-citation knowledge map, a specific visualization analysis and identification of the authors, institutions, and regions was carried out. Furthermore, the topics and hotspots of water quality prediction in the activated sludge process were determined through literature co-citation-based cluster analysis and literature citation burst analysis. These not only reflect the historical evolution to a certain extent, but also provide direction and insight into the knowledge structure of water quality prediction and the activated sludge process for future research.
A new kind of hydrazone (I) diastereoisomers was prepared from enantiomeric hydrazide (II) and chiral cluster (III), and was characterized by HMBC. Unfortunately, the mixture could not be separated into pure diastereoisomers. This could be a direction for separating racemic chiral clusters.
Human Activity Recognition (HAR) is an important way for lower limb exoskeleton robots to implement human-computer collaboration with users. Most existing methods in this field focus on a simple scenario, recognizing activities for specific users, which does not consider the individual differences among users and cannot adapt to new users. To improve the generalization ability of the HAR model, this paper proposes a novel method that combines theories from transfer learning and active learning to mitigate the cross-subject issue, so that lower limb exoskeleton robots can be used in more complex scenarios. First, a neural network based on convolutional neural networks (CNN) is designed, which can extract temporal and spatial features from sensor signals collected from different parts of the human body. It can recognize human activities with high accuracy after being trained on labeled data. Second, to improve the cross-subject adaptation ability of the pre-trained model, we design a cross-subject HAR algorithm based on sparse interrogation and label propagation. In leave-one-subject-out validation against existing methods on two widely used public datasets, our method achieves average accuracies of 91.77% on DSAD and 80.97% on PAMAP2. The experimental results demonstrate the potential of implementing cross-subject HAR for lower limb exoskeleton robots.
Based on the earthquake data of 11 active intraplate fault zones of the Chinese mainland, we have studied the earthquake recurrence behaviors on entire active fault zones and their relations to those on individual fault segments. The results show that earthquake recurrence on entire active fault zones, each of which is made up of multiple segments, displays three types of behavior: clustering behavior, random behavior, and poor quasi-periodic behavior. The major one is the sparse clustering behavior, in whose recurrence process clusters (active periods) and gaps (quiescent periods) alternate to varying degrees. The recurrence intervals within and between clusters, the durations of individual clusters, and the number and strength of earthquakes in each cluster are all variable. The recurrence process is non-linear; there is neither strength-time dependence nor time-strength dependence. However, the earthquake recurrence processes on individual fault segments are much simpler, and mainly display either quasi-periodic or time-predictable behavior. This study further finds that the temporal clustering in the earthquake recurrence process on entire fault zones is mainly caused by rupture 'contagion' across different fault segments within relatively short periods of time. Along active fault zones, the degree and orientation of rupture 'contagion' may vary with different seismic cycles, and the 'contagion' seems to be able to jump over unbroken 'gaps' on the fault zones.
In this review, the history and outlook of gas-phase CO_2 activation using single electrons, metal atoms, clusters (mainly metal hydride clusters), and molecules are discussed on both the experimental and theoretical fronts. Although the development of bulk solid-state materials for the activation and conversion of CO_2 into value-added products has enjoyed great success in the past several decades, this review focuses only on gas-phase studies, because isolated, well-defined gas-phase systems are ideally suited for high-resolution experiments using state-of-the-art spectrometric and spectroscopic techniques, and for simulations employing modern quantum theoretical methods. The unmatched complementarity and comparability of experiment and theory in gas-phase investigations bear enormous potential for providing insights into the reactions of CO_2 activation at the atomic level. In all of these examples, the reduction and bending of the inert neutral CO_2 molecule is the critical step, determined by the frontier orbitals of the reaction participants. Based on the results and outlook summarized in this review, we anticipate that studies of gas-phase CO_2 activation will be an avenue rich with opportunities for the rational design of novel catalysts based on knowledge obtained at the atomic level.
Funding (MAS-DSC study): Supported in part by the National Natural Science Foundation of China under Grant 62171203; in part by the Jiangsu Province "333 Project" High-Level Talent Cultivation Subsidized Project; in part by the Suzhou Key Supporting Subjects for Health Informatics under Grant SZFCXK202147; in part by the Changshu Science and Technology Program under Grants CS202015 and CS202246; and in part by the Changshu Key Laboratory of Medical Artificial Intelligence and Big Data under Grants CYZ202301 and CS202314.
Funding (graph-based active learning study): Supported by the DOD National Defense Science and Engineering Graduate (NDSEG) Research Fellowship and by the NGA under Contract No. HM04762110003.
Funding: The National Natural Science Foundation of China (Grant No. 62101579).
Abstract: Offboard active decoys (OADs) can effectively jam monopulse radars. However, for missiles approaching from a particular direction and distance, the OAD should be placed at a specific location, posing high requirements for timing and deployment. To improve the response speed and jamming effect, a cluster of OADs based on an unmanned surface vehicle (USV) is proposed. The formation of the cluster determines the effectiveness of jamming. First, based on the mechanism of OAD jamming, critical conditions are identified, and a method for assessing the jamming effect is proposed. Then, for the optimization of the cluster formation, a mathematical model is built, and a multi-tribe adaptive particle swarm optimization algorithm based on mutation strategy and Metropolis criterion (3M-APSO) is designed. Finally, the formation optimization problem is solved and analyzed using the 3M-APSO algorithm under specific scenarios. The results show that the improved algorithm has a faster convergence rate and superior performance compared to the standard Adaptive-PSO algorithm. Compared with a single OAD, the optimal formation of the USV-OAD cluster effectively fills the blind area and maximizes the use of jamming resources.
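The abstract above names the Metropolis criterion as one ingredient of 3M-APSO but does not say how it is applied. The following is only a minimal sketch of the generic Metropolis acceptance rule for a minimization problem, not the authors' implementation; the function name `metropolis_accept`, its parameters, and the use of a scalar `temperature` are assumptions made for illustration.

```python
import math
import random


def metropolis_accept(current_cost, candidate_cost, temperature, rng=random):
    """Generic Metropolis rule for minimization (illustrative, not 3M-APSO itself).

    A candidate that is no worse is always accepted. A worse candidate is
    accepted with probability exp(-(candidate_cost - current_cost) / temperature),
    which lets a search occasionally escape a local optimum.
    """
    if candidate_cost <= current_cost:
        return True
    if temperature <= 0:
        # With no "thermal" budget left, never accept a worse candidate.
        return False
    acceptance_probability = math.exp(-(candidate_cost - current_cost) / temperature)
    return rng.random() < acceptance_probability
```

In a particle-swarm setting, a rule of this kind could decide whether a particle keeps a mutated position that is slightly worse than its current one; how the temperature would be scheduled is left open here.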
Funding: This research has been partially supported by the National Natural Science Foundation of China (51175169) and the National Science and Technology Support Program (2012BAF02B01).
Abstract: With the growing number of large-scale data sets, the similarity matrix that the affinity propagation (AP) clustering algorithm must build incurs huge storage and computation costs. Therefore, this paper proposes an improved affinity propagation clustering algorithm. First, subtractive clustering is added, using the density values of the data points to obtain the initial cluster points. Then, the similarity distances between the initial cluster points are calculated and, borrowing the idea of semi-supervised clustering, pairwise constraint information is added to construct a sparse similarity matrix. Finally, AP clustering is run on the cluster representative points until a suitable cluster partition is reached. Experimental results show that the algorithm greatly reduces the amount of computation and the storage required for the similarity matrix, and that it outperforms the original algorithm in both clustering quality and processing speed.
Funding: This research is funded by Graduate University of Science and Technology under grant number GUST.STS.DT2020-TT01.
Abstract: Clustering is a crucial method for deciphering data structure and producing new information. Due to its significance in revealing fundamental connections between the human brain and events, it is essential to utilize clustering for cognitive research. Dealing with noisy data caused by inaccurate synthesis from several sources or misleading data production processes is one of the most intriguing clustering difficulties. Noisy data can lead to incorrect object recognition and inference. This research proposes a novel clustering approach, named Picture-Neutrosophic Trusted Safe Semi-Supervised Fuzzy Clustering (PNTS3FCM), to solve the clustering problem with noisy data using the neutral and refusal degrees in the definitions of the Picture Fuzzy Set (PFS) and Neutrosophic Set (NS). Our contribution is a new optimization model with four essential components: clustering, outlier removal, safe semi-supervised fuzzy clustering, and partitioning with labeled and unlabeled data. The effectiveness and flexibility of the proposed technique are evaluated and compared with state-of-the-art methods, standard Picture fuzzy clustering (FC-PFS) and Confidence-weighted safe semi-supervised clustering (CS3FCM), on benchmark UCI datasets. The experimental results show that our method is better than the compared methods on at least 10 of the 15 datasets in terms of clustering quality and computational time.
Abstract: Clustering analysis is one of the main concerns in data mining. A common approach to the clustering process is to bring together points that are close to each other and separate points that are far from each other. Therefore, measuring the distance between sample points is crucial to the effectiveness of clustering. Filtering features by label information and measuring the distance between samples by these features is a common supervised learning method for reconstructing the distance metric. However, in many application scenarios, it is very expensive to obtain a large number of labeled samples. In this paper, to solve the clustering problem in scenarios with few supervised samples and high data dimensionality, a novel semi-supervised clustering algorithm is proposed. It uses an improved prototype network that attempts to reconstruct the distance metric in the sample space with a small amount of pairwise supervised information, such as Must-Link and Cannot-Link constraints, and then clusters the data in the new metric space. The core idea is to bring similar samples closer together and push dissimilar ones further apart through an embedding mapping. Extensive experiments on both real-world and synthetic datasets show the effectiveness of this algorithm. Average clustering metrics on various datasets improved by 8% compared to the comparison algorithm.
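The idea of pulling Must-Link pairs together and pushing Cannot-Link pairs apart in an embedding space can be illustrated with a generic contrastive-style penalty. This is a sketch under stated assumptions, not the loss used in the paper: the function name `pairwise_constraint_loss`, the squared-distance form, and the `margin` parameter are all hypothetical choices.

```python
import math


def pairwise_constraint_loss(embeddings, must_link, cannot_link, margin=1.0):
    """Contrastive-style penalty over pairwise constraints (illustrative only).

    embeddings  : list of equal-length coordinate tuples, one per sample
    must_link   : list of (i, j) index pairs that should end up close
    cannot_link : list of (i, j) index pairs that should end up at least
                  `margin` apart
    """
    loss = 0.0
    for i, j in must_link:
        # Must-Link pairs are penalized by their squared distance.
        loss += math.dist(embeddings[i], embeddings[j]) ** 2
    for i, j in cannot_link:
        # Cannot-Link pairs are penalized only while closer than the margin.
        shortfall = margin - math.dist(embeddings[i], embeddings[j])
        loss += max(0.0, shortfall) ** 2
    return loss
```

An embedding network trained to reduce a penalty of this kind would satisfy more of the constraints, after which an ordinary clustering algorithm can be run in the learned space.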
Abstract: The majority of big data analytics applied to transportation datasets suffer from being too domain-specific; that is, they draw conclusions for a dataset based on analytics on the same dataset. As a result, models trained on one domain (e.g. taxi data) apply poorly to a different domain (e.g. Uber data). To achieve accurate analyses on a new domain, substantial amounts of data must be available, which limits practical applications. To remedy this, we propose to use semi-supervised and active learning of big data to accomplish the domain adaptation task: selectively choosing a small number of datapoints from a new domain while achieving performance comparable to using all the datapoints. We choose the New York City (NYC) transportation data of taxi and Uber as our dataset, simulating different domains with 90% as the source data domain for training and the remaining 10% as the target data domain for evaluation. We propose semi-supervised and active learning strategies and apply them to the source domain for selecting datapoints. Experimental results show that our adaptation achieves performance comparable to using all datapoints while using only a fraction of them, substantially reducing the amount of data required. Our approach has two major advantages: it can make accurate analytics and predictions when big datasets are not available, and even if big datasets are available, it chooses the most informative datapoints out of the dataset, making the process much more efficient without having to process huge amounts of data.
Abstract: Microbial population and enzyme activities are significant indicators of soil strength. Soil microbial dynamics characterize microbial population and enzyme activities. The present study explores the development of efficient predictive modeling systems for the estimation of specific soil microbial dynamics, such as rock phosphate solubilization, bacterial population, and ACC-deaminase activity. More specifically, optimized subtractive clustering (SC) and Wang and Mendel's (WM) fuzzy inference systems (FIS) have been implemented with the objective of achieving the best estimation accuracy of microbial dynamics. Experimental measurements were performed in a controlled pot experiment using minimal salt media with rock phosphate as sole carbon source, inoculated with phosphate-solubilizing microorganisms, in order to estimate the rock phosphate solubilization potential of selected strains. Three experimental parameters, namely temperature, pH, and incubation period, have been used as inputs to SC-FIS and WM-FIS. SC-FIS performed better than WM-FIS in the estimation of phosphate solubilization and bacterial population, with a maximum coefficient of determination (R^2 = 0.9988) in the estimation of the aforementioned microbial dynamics.
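The coefficient of determination quoted above is conventionally computed as R^2 = 1 - SS_res / SS_tot. The sketch below shows that standard formula only; the function name `r_squared` is hypothetical, and the source does not state which software or exact definition the authors used.

```python
def r_squared(observed, predicted):
    """Coefficient of determination, R^2 = 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("observed and predicted must be non-empty and equal in length")
    mean_observed = sum(observed) / len(observed)
    # Residual sum of squares: error left unexplained by the model.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: spread of the observations around their mean.
    ss_tot = sum((o - mean_observed) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observed values are equal")
    return 1.0 - ss_res / ss_tot
```

A value close to 1 means the model's estimates track the measured values closely, while a value of 0 means the model does no better than always predicting the mean.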
Funding: This work was supported in part by the National Natural Science Foundation of China (51679105, 61872160, 51809112) and the “Thirteenth Five Plan” Science and Technology Project of Education Department, Jilin Province (JJKH20200990KJ).
Abstract: Semi-supervised clustering improves learning performance by using a small number of labeled samples to assist the learning of unlabeled samples. This paper implements and compares unsupervised and semi-supervised clustering analysis of BOA-Argo ocean text data. Unsupervised K-Means and Affinity Propagation (AP) are two classical clustering algorithms. The Election-AP algorithm is proposed to handle the final cluster number in AP clustering, which has proved difficult to keep within a suitable range. Thermocline data in the BOA-Argo dataset are labeled according to the standard thermocline definition, and these labeled samples are used for semi-supervised cluster analysis. Several semi-supervised clustering algorithms were chosen for comparison of learning performance: Constrained-K-Means, Seeded-K-Means, SAP (Semi-supervised Affinity Propagation), LSAP (Loose Seed AP), and CSAP (Compact Seed AP). In order to adapt to the single label, this paper improves the above algorithms into SCKM (improved Constrained-K-Means), SSKM (improved Seeded-K-Means), and SSAP (improved Semi-supervised Affinity Propagation) to perform semi-supervised clustering analysis on the data. A DSAP (Double Seed AP) semi-supervised clustering algorithm based on compact seeds is proposed, and the experimental data show that DSAP has a better clustering effect. The unsupervised and semi-supervised clustering results are used to analyze the potential patterns of marine data.
Funding: Sponsored by the National Natural Science Foundation of China (Grant Nos. 61101122 and 61071105).
Abstract: With the rapid development of WLAN (Wireless Local Area Network) technology, an important target of indoor positioning systems is to improve the positioning accuracy while reducing the online computation. In this paper, we propose a novel fingerprint positioning algorithm known as semi-supervised affinity propagation clustering based on distance function constraints. We show that, by employing affinity propagation techniques, it is possible to use a small fraction of labeled data to adjust the similarity matrix of the signal space and cluster reference points with high accuracy. The semi-supervised APC uses a combination of machine learning, clustering analysis, and fingerprinting algorithms. Based on data collected and tests run in a realistic indoor WLAN environment, the experimental results indicate that the proposed algorithm can improve positioning accuracy while reducing the online localization computation, as compared with the widely used K nearest neighbor and maximum likelihood estimation algorithms.
Funding: The Science and Technology Research Program of Zhejiang Province, China (No. 2011C21036); Projects in Science and Technology of Ningbo Municipal, China (No. 2012B82003); Shanghai Natural Science Foundation, China (No. 10ZR1400100); the National Undergraduate Training Programs for Innovation and Entrepreneurship, China (No. 201410876011).
Abstract: In light of existing flaws, this paper proposes a semi-supervised affinity propagation clustering algorithm based on layered combination. To improve the accuracy of the algorithm, it introduces the idea of layered combination: it divides an affinity propagation clustering (APC) process evenly into several hierarchies, draws samples from the data of each hierarchy according to weight, executes semi-supervised learning through the construction of pairwise constraints and the use of submanifold label mapping, and then weights and combines the clustering results of all hierarchies by combined promotion. Theoretical analysis and experimental results show that the clustering accuracy and computational complexity of the semi-supervised affinity propagation clustering algorithm based on layered combination (SAP-LC algorithm) have been greatly improved.
Funding: Support from the Beijing Computational Science Research Center; supported by the National Natural Science Foundation of China (Grant Nos. U2230402, 11975050, 11735005, and 11904320).
Abstract: There are rich emergent phase behaviors in non-equilibrium active systems. Flocking and clustering are two representative dynamic phases. The relationship between the two phases is still unclear. Herein, we numerically investigate the evolution of flocking and clustering in a system consisting of self-propelled particles with active reorientation. We consider the interplay between flocking and clustering phases with different initial configurations, and observe a domain in the steady-state order parameter phase diagrams that is sensitive to the choice of initial configuration. Specifically, by tuning the initial degree of polar ordering, either a more ordered flocking state or a disordered clustering state can be observed in the steady state. These results offer insight into manipulating emergent behaviors and collective motions of an active system, and are qualitatively different from the emergence of a new bi-stable regime observed in aligned active particles due to an explicit attraction [New J. Phys. 14, 073033 (2012)].
Funding: This research was jointly supported by the National Natural Science Foundation of China (Grant No. 42005037); the Liaoning Provincial Natural Science Foundation Project (PhD Start-up Research Fund 2019-BS-214); the Special Scientific Research Project for the Forecaster (Grant No. CMAYBY2018-018); a Key Technical Project of Liaoning Meteorological Bureau (Grant No. LNGJ201903); the National Key Research and Development Project (Grant No. 2018YFC1505601); and the Open Foundation Project of the Institute of Atmospheric Environment, China Meteorological Administration (Grant Nos. 2020SYIAE08 and 2020SYIAEZD5).
Abstract: The classification of the Northeast China Cold Vortex (NCCV) activity paths is an important way to analyze its characteristics in detail. Based on the daily precipitation data of the northeastern China (NEC) region, and the six-hourly atmospheric circulation field and temperature field data of ERA-Interim, the NCCV processes during the early summer (June) seasons from 1979 to 2018 were objectively identified. Then, the NCCV processes were classified using a machine learning method (k-means) according to the characteristic parameters of the activity path information. The rationality of the classification results was verified from two aspects: (1) the atmospheric circulation configuration of the NCCV on various paths; and (2) its influences on the climate conditions in the NEC. The results showed that the activity paths of the NCCV could be divided into four types according to such characteristics as the generation origin, movement direction, and movement velocity of the NCCV. These included the generation-eastward movement type in the east of the Mongolia Plateau (eastward movement type, or type A); the generation-southeast long-distance movement type in the upstream of the Lena River (southeast long-distance movement type, or type B); the generation-eastward less-movement type near Lake Baikal (eastward less-movement type, or type C); and the generation-southward less-movement type in eastern Siberia (southward less-movement type, or type D). There were obvious differences in the atmospheric circulation configuration and the climate impact of the NCCV on the four above-mentioned types of paths, which indicated that the classification results were reasonable.
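The k-means step described above groups each NCCV process by a vector of path parameters. The following is a minimal sketch of plain Lloyd-style k-means, not the study's code: the function name `kmeans`, the choice of caller-supplied starting centroids, and any feature layout (for example origin longitude, origin latitude, movement speed) are assumptions for illustration.

```python
import math


def kmeans(points, initial_centroids, max_iter=100):
    """Plain k-means (Lloyd's algorithm) on tuples of numeric features.

    points            : list of equal-length numeric tuples, one per path
    initial_centroids : list of k starting centroids (same length as points' tuples)
    Returns (labels, centroids).
    """
    centroids = [tuple(c) for c in initial_centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for k, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centroids.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                new_centroids.append(old)  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return labels, centroids
```

With four starting centroids, a routine like this would return one of four labels per path, which could then be compared against the circulation and climate composites, as the abstract describes for types A to D. Features on different scales would normally be standardized first.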
Abstract: The volcanic cluster in Arshan, Inner Mongolia, is located in the west of the middle section of the Da Hinggan Mountains. There are more than forty Cenozoic volcanoes, among which the Yanshan Volcano and Gaoshan Volcano are active in the broad sense and are basaltic central vents. Arshan is a newly found active volcanic region in the Chinese continent. The volcanoes are perfectly preserved and composed of cinder cones, pyroclastic sheets, and lava flows. Their cones are grand: the Gaoshan cone is about 362 m high, and the Yanshan crater is about 140 m deep. The pyroclastic sheet is mainly made up of scoria, and the area covered by scoria more than 1 m thick is about 27 km^2. There are two carbonized-wood sites in the pyroclastic sheet, and ^14C dating indicates ages of 1990 ± 100 a B.P. and 1900 ± 70 a B.P., which are corrected by dendrodating. Basaltic lava flows are exposed, and they change from pahoehoe in the early stage to aa in the later stage. There are many perfect fumarolic cones, fumarolic dishes, and lava tumuli in the front zones. The spread of the lava flow is controlled by the local topography; its main body flowed northwestwards, covering the Holocene river and swamp deposits and blocking the Halahahe river and its branches to create six lava-dammed lakes. Given these distinguishing features, the Arshan volcanic cluster could be called another natural “Volcano Museum”.
Funding: National Natural Science Foundation of China (52174279); Analysis and Testing Foundation of Kunming University of Science and Technology (2022M20202202138); Yunnan Fundamental Research Projects (202301AU070027).
Abstract: The challenges posed by energy and environmental issues have forced mankind to explore and utilize unconventional energy sources. It is imperative to convert the abundant coalbed gas (CBG) into high value-added products, i.e., to selectively and efficiently convert the methane in CBG. Methane activation, known as the “holy grail”, poses a challenge to the design and development of catalysts. The structural complexity of the active metal on the carrier is of particular concern. In this work, we have studied the nucleation growth of small Co clusters (up to Co_(6)) on the surface of CeO_(2)(110) using density functional theory, from which a stable loaded Co/CeO_(2)(110) structure was selected to investigate the methane activation mechanism. Despite the relatively small size of the selected Co clusters, the obtained Co_(x)/CeO_(2)(110) exhibits interesting properties. The optimized Co_(5)/CeO_(2)(110) structure was selected as the optimal structure to study the activation mechanism of methane due to its competitive electronic structure, adsorption energy, and binding energy. The energy barriers for the stepwise dissociation of methane to form CH3^(*), CH2^(*), CH^(*), and C^(*) radical fragments are 0.44, 0.55, 0.31, and 1.20 eV, respectively, indicating that CH^(*) dissociative dehydrogenation is the rate-determining step for the system under investigation here. This fundamental study of metal-support interactions based on Co growth on the CeO_(2)(110) surface contributes to the understanding of the essence of Co/CeO_(2) catalysts with promising catalytic behavior. It provides theoretical guidance for better designing the optimal Co/CeO_(2) catalyst for tailored catalytic reactions.
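The rate-determining step named above follows from the four reported barriers: the step with the highest barrier is the slowest. The sketch below makes that reasoning explicit using only the barrier values from the abstract. The step labels, the function names, and the Arrhenius-type comparison (which assumes equal pre-exponential factors) are illustrative assumptions, not part of the study.

```python
import math

# Barriers (eV) for forming each fragment, as reported in the abstract.
# The "A -> B + H*" labels are an illustrative way of naming each step.
BARRIERS_EV = {
    "CH4* -> CH3* + H*": 0.44,
    "CH3* -> CH2* + H*": 0.55,
    "CH2* -> CH* + H*": 0.31,
    "CH* -> C* + H*": 1.20,
}

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def rate_determining_step(barriers):
    """Return the label of the step with the highest energy barrier."""
    return max(barriers, key=barriers.get)


def relative_rate(barrier_ev, reference_ev, temperature_k):
    """Arrhenius-type rate ratio k(barrier) / k(reference).

    Assumes both steps share the same pre-exponential factor, which is a
    simplification made here for illustration.
    """
    return math.exp(-(barrier_ev - reference_ev) / (BOLTZMANN_EV_PER_K * temperature_k))
```

Calling `rate_determining_step(BARRIERS_EV)` picks the CH* dehydrogenation step, in line with the abstract's conclusion; `relative_rate` only indicates how strongly a higher barrier would slow a step at a chosen temperature under the stated simplification.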
Abstract: In this paper, CiteSpace, a bibliometrics software package, was adopted to collect research papers published on the Web of Science that are relevant to biological models and effluent quality prediction in the activated sludge process in wastewater treatment. By means of a trend map, a keyword knowledge map, and a co-citation knowledge map, a visual analysis and identification of the authors, institutions, and regions was carried out. Furthermore, the topics and hotspots of water quality prediction in the activated sludge process were determined through literature-co-citation-based cluster analysis and literature citation burst analysis, which not only reflected the historical evolution to a certain extent, but also provided direction and insight into the knowledge structure of water quality prediction and the activated sludge process for future research.
Funding: We are grateful to the National Natural Science Foundation of China for the financial support of this work.
Abstract: A new kind of hydrazone (I) diastereoisomers was prepared from enantiomeric hydrazide (II) and chiral cluster (III), and was characterized by HMBC. Unfortunately, the mixture could not be separated into pure diastereoisomers. This could be a direction for separating racemic chiral clusters.
Abstract: Human Activity Recognition (HAR) is an important way for lower limb exoskeleton robots to implement human-computer collaboration with users. Most of the existing methods in this field focus on a simple scenario of recognizing activities for specific users, which does not consider the individual differences among users and cannot adapt to new users. In order to improve the generalization ability of the HAR model, this paper proposes a novel method that combines theories from transfer learning and active learning to mitigate the cross-subject issue, so that lower limb exoskeleton robots can be used in more complex scenarios. First, a neural network based on convolutional neural networks (CNN) is designed, which can extract temporal and spatial features from sensor signals collected from different parts of the human body. It can recognize human activities with high accuracy after being trained on labeled data. Second, in order to improve the cross-subject adaptation ability of the pre-trained model, we design a cross-subject HAR algorithm based on sparse interrogation and label propagation. In leave-one-subject-out validation against existing methods on two widely used public datasets, our method achieves average accuracies of 91.77% on DSAD and 80.97% on PAMAP2, respectively. The experimental results demonstrate the potential of implementing cross-subject HAR for lower limb exoskeleton robots.
Funding: Chinese Joint Seismological Science Foundation (95-07-423).
Abstract: Based on the earthquake data of 11 active intraplate fault zones of the Chinese mainland, we have studied the earthquake recurrence behaviors on entire active fault zones and their relations to those on individual fault-segments. The results show that earthquake recurrence on entire active fault zones, each of which is made up of multiple segments, displays three types of behavior, i.e., the clustering behavior, the random behavior, and the poor quasi-periodic behavior. The major one is the sparse clustering behavior, in which clusters (active periods) and gaps (quiescent periods) alternate to varying degrees. The recurrence intervals within and between clusters, the durations of individual clusters, and the earthquake number and strength of every cluster are all variable. The recurrence process is non-linear, with neither strength-time dependence nor time-strength dependence. However, the earthquake recurrence processes on individual fault-segments are much simpler, and mainly display either quasi-periodic or time-predictable behaviors. This study further finds that the temporal clustering in the earthquake recurrence process on entire fault zones is mainly caused by rupture 'contagion' on different fault-segments within relatively short periods of time. Along active fault zones, the degree and orientation of rupture 'contagion' may vary with different seismic cycles, and the 'contagion' seems to be able to jump over unbroken 'gaps' on the fault zones.
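The three recurrence behaviors listed above are often told apart by the coefficient of variation (COV) of the recurrence intervals, where a Poisson-like random process has COV near 1. The abstract does not say which statistic the authors used, so the sketch below is an assumption throughout: the function names, the COV criterion itself, and the `tolerance` band around 1 are all hypothetical.

```python
import statistics


def recurrence_intervals(event_years):
    """Intervals between consecutive events, from a list of event years."""
    ordered = sorted(event_years)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]


def coefficient_of_variation(intervals):
    """Population standard deviation of the intervals divided by their mean."""
    mean_interval = statistics.fmean(intervals)
    if mean_interval == 0:
        raise ValueError("mean recurrence interval is zero")
    return statistics.pstdev(intervals) / mean_interval


def classify_recurrence(event_years, tolerance=0.2):
    """Label a sequence by its interval COV (illustrative thresholds only).

    COV well below 1 -> "quasi-periodic"; near 1 -> "random";
    well above 1 -> "clustered".
    """
    cov = coefficient_of_variation(recurrence_intervals(event_years))
    if cov < 1.0 - tolerance:
        return "quasi-periodic"
    if cov > 1.0 + tolerance:
        return "clustered"
    return "random"
```

For example, events at years 0, 1, 2, 100, 101, and 102 give intervals of 1, 1, 98, 1, and 1, whose large spread relative to the mean marks the sequence as clustered, mirroring the alternation of active and quiescent periods described in the abstract.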
Funding: National Key R&D Program of China (2018YFE0115000); the National Natural Science Foundation of China (22003027 and 22174073); the NSF of Tianjin City (19JCYBJC19600); the Frontiers Science Center for New Organic Matter of Nankai University (63181206); supported by the Air Force Office of Scientific Research (AFOSR) under grant number FA9550-19-1-0077 (KHB).
Abstract: In this review, the history and outlook of gas-phase CO_(2) activation using single electrons, metal atoms, clusters (mainly metal hydride clusters), and molecules are discussed on both the experimental and theoretical fronts. Although the development of bulk solid-state materials for the activation and conversion of CO_(2) into value-added products has enjoyed great success in the past several decades, this review focuses only on gas-phase studies, because isolated, well-defined gas-phase systems are ideally suited for high-resolution experiments using state-of-the-art spectrometric and spectroscopic techniques, and for simulations employing modern quantum theoretical methods. The unmatched complementarity and comparability of experiment and theory in gas-phase investigations bear enormous potential for providing insights into the reactions of CO_(2) activation at the atomic level. In all of these examples, the reduction and bending of the inert neutral CO_(2) molecule is the critical step, determined by the frontier orbitals of the reaction participants. Based on the results and outlook summarized in this review, we anticipate that studies of gas-phase CO_(2) activation will be an avenue rich with opportunities for the rational design of novel catalysts based on the knowledge obtained at the atomic level.