With the successive application of deep learning(DL)in classification tasks,the DL-based modulation classification method has become the preference for its state-of-the-art performance.Nevertheless,once the DL recogni...With the successive application of deep learning(DL)in classification tasks,the DL-based modulation classification method has become the preference for its state-of-the-art performance.Nevertheless,once the DL recognition model is pre-trained with fixed classes,the pre-trained model tends to predict incorrect results when identifying incremental classes.Moreover,the incremental classes are usually emergent without label information or only a few labeled samples of incremental classes can be obtained.In this context,we propose a graphbased semi-supervised approach to address the fewshot classes-incremental(FSCI)modulation classification problem.Our proposed method is a twostage learning method,specifically,a warm-up model is trained for classifying old classes and incremental classes,where the unlabeled samples of incremental classes are uniformly labeled with the same label to alleviate the damage of the class imbalance problem.Then the warm-up model is regarded as a feature extractor for constructing a similar graph to connect labeled samples and unlabeled samples,and the label propagation algorithm is adopted to propagate the label information from labeled nodes to unlabeled nodes in the graph to achieve the purpose of incremental classes recognition.Simulation results prove that the proposed method is superior to other finetuning methods and retrain methods.展开更多
Online target maneuver recognition is an important prerequisite for air combat situation recognition and maneuver decision-making.Conventional target maneuver recognition methods adopt mainly supervised learning metho...Online target maneuver recognition is an important prerequisite for air combat situation recognition and maneuver decision-making.Conventional target maneuver recognition methods adopt mainly supervised learning methods and assume that many sample labels are available.However,in real-world applications,manual sample labeling is often time-consuming and laborious.In addition,airborne sensors collecting target maneuver trajectory information in data streams often cannot process information in real time.To solve these problems,in this paper,an air combat target maneuver recognition model based on an online ensemble semi-supervised classification framework based on online learning,ensemble learning,semi-supervised learning,and Tri-training algorithm,abbreviated as Online Ensemble Semi-supervised Classification Framework(OESCF),is proposed.The framework is divided into four parts:basic classifier offline training stage,online recognition model initialization stage,target maneuver online recognition stage,and online model update stage.Firstly,based on the improved Tri-training algorithm and the fusion decision filtering strategy combined with disagreement,basic classifiers are trained offline by making full use of labeled and unlabeled sample data.Secondly,the dynamic density clustering algorithm of the target maneuver is performed,statistical information of each cluster is calculated,and a set of micro-clusters is obtained to initialize the online recognition model.Thirdly,the ensemble K-Nearest Neighbor(KNN)-based learning method is used to recognize the incoming target maneuver trajectory instances.Finally,to further improve the accuracy and adaptability of the model under the condition of high dynamic air combat,the parameters of the model are updated online using error-driven representation learning,exponential decay function and basic classifier obtained in the offline training stage.The experimental results on several University of California Irvine(UCI)datasets and real air combat target maneuver trajectory data validate the effectiveness of the proposed method in comparison with other semi-supervised models and supervised models,and the results show that the proposed model achieves higher classification accuracy.展开更多
Sentiment classification is a useful tool to classify reviews about sentiments and attitudes towards a product or service.Existing studies heavily rely on sentiment classification methods that require fully annotated ...Sentiment classification is a useful tool to classify reviews about sentiments and attitudes towards a product or service.Existing studies heavily rely on sentiment classification methods that require fully annotated inputs.However,there is limited labelled text available,making the acquirement process of the fully annotated input costly and labour-intensive.Lately,semi-supervised methods emerge as they require only partially labelled input but perform comparably to supervised methods.Nevertheless,some works reported that the performance of the semi-supervised model degraded after adding unlabelled instances into training.Literature also shows that not all unlabelled instances are equally useful;thus identifying the informative unlabelled instances is beneficial in training a semi-supervised model.To achieve this,an informative score is proposed and incorporated into semisupervised sentiment classification.The evaluation is performed on a semisupervised method without an informative score and with an informative score.By using the informative score in the instance selection strategy to identify informative unlabelled instances,semi-supervised models perform better compared to models that do not incorporate informative scores into their training.Although the performance of semi-supervised models incorporated with an informative score is not able to surpass the supervised models,the results are still found promising as the differences in performance are subtle with a small difference of 2%to 5%,but the number of labelled instances used is greatly reduced from100%to 40%.The best finding of the proposed instance selection strategy is achieved when incorporating an informative score with a baseline confidence score at a 0.5:0.5 ratio using only 40%labelled data.展开更多
Malaria is a lethal disease responsible for thousands of deaths worldwide every year.Manual methods of malaria diagnosis are timeconsuming that require a great deal of human expertise and efforts.Computerbased automat...Malaria is a lethal disease responsible for thousands of deaths worldwide every year.Manual methods of malaria diagnosis are timeconsuming that require a great deal of human expertise and efforts.Computerbased automated diagnosis of diseases is progressively becoming popular.Although deep learning models show high performance in the medical field,it demands a large volume of data for training which is hard to acquire for medical problems.Similarly,labeling of medical images can be done with the help of medical experts only.Several recent studies have utilized deep learning models to develop efficient malaria diagnostic system,which showed promising results.However,the most common problem with these models is that they need a large amount of data for training.This paper presents a computer-aided malaria diagnosis system that combines a semi-supervised generative adversarial network and transfer learning.The proposed model is trained in a semi-supervised manner and requires less training data than conventional deep learning models.Performance of the proposed model is evaluated on a publicly available dataset of blood smear images(with malariainfected and normal class)and achieved a classification accuracy of 96.6%.展开更多
Through semi-supervised learning and knowledge inheritance,a novel Takagi-Sugeno-Kang(TSK)fuzzy system framework is proposed for epilepsy data classification in this study.The new method is based on the maximum mean d...Through semi-supervised learning and knowledge inheritance,a novel Takagi-Sugeno-Kang(TSK)fuzzy system framework is proposed for epilepsy data classification in this study.The new method is based on the maximum mean discrepancy(MMD)method and TSK fuzzy system,as a basic model for the classification of epilepsy data.First,formedical data,the interpretability of TSK fuzzy systems can ensure that the prediction results are traceable and safe.Second,in view of the deviation in the data distribution between the real source domain and the target domain,MMD is used to measure the distance between different data distributions.The objective function is constructed according to the MMD distance,and the distribution distance of different datasets is minimized to find the similar characteristics of different datasets.We introduce semi-supervised learning to further explore the relationship between data.Based on the MMD method,a semi-supervised learning(SSL)-MMD method is constructed by using pseudo-tags to realize the data distribution alignment of the same category.In addition,the idea of knowledge dissemination is used to learn pseudo-tags as additional data features.Finally,for epilepsy classification,the cross-domain TSK fuzzy system uses the cross-entropy function as the objective function and adopts the back-propagation strategy to optimize the parameters.The experimental results show that the new method can process complex epilepsy data and identify whether patients have epilepsy.展开更多
Non-collaborative radio transmitter recognition is a significant but challenging issue, since it is hard or costly to obtain labeled training data samples. In order to make effective use of the unlabeled samples which...Non-collaborative radio transmitter recognition is a significant but challenging issue, since it is hard or costly to obtain labeled training data samples. In order to make effective use of the unlabeled samples which can be obtained much easier, a novel semi-supervised classification method named Elastic Sparsity Regularized Support Vector Machine (ESRSVM) is proposed for radio transmitter classification. ESRSVM first constructs an elastic-net graph over data samples to capture the robust and natural discriminating information and then incorporate the information into the manifold learning framework by an elastic sparsity regularization term. Experimental results on 10 GMSK modulated Automatic Identification System radios and 15 FM walkie-talkie radios show that ESRSVM achieves obviously better performance than KNN and SVM, which use only labeled samples for classification, and also outperforms semi-supervised classifier LapSVM based on manifold regularization.展开更多
Recently, there have been some attempts of Transformer in 3D point cloud classification. In order to reduce computations, most existing methods focus on local spatial attention,but ignore their content and fail to est...Recently, there have been some attempts of Transformer in 3D point cloud classification. In order to reduce computations, most existing methods focus on local spatial attention,but ignore their content and fail to establish relationships between distant but relevant points. To overcome the limitation of local spatial attention, we propose a point content-based Transformer architecture, called PointConT for short. It exploits the locality of points in the feature space(content-based), which clusters the sampled points with similar features into the same class and computes the self-attention within each class, thus enabling an effective trade-off between capturing long-range dependencies and computational complexity. We further introduce an inception feature aggregator for point cloud classification, which uses parallel structures to aggregate high-frequency and low-frequency information in each branch separately. Extensive experiments show that our PointConT model achieves a remarkable performance on point cloud shape classification. Especially, our method exhibits 90.3% Top-1 accuracy on the hardest setting of ScanObjectN N. Source code of this paper is available at https://github.com/yahuiliu99/PointC onT.展开更多
When building a classification model,the scenario where the samples of one class are significantly more than those of the other class is called data imbalance.Data imbalance causes the trained classification model to ...When building a classification model,the scenario where the samples of one class are significantly more than those of the other class is called data imbalance.Data imbalance causes the trained classification model to be in favor of the majority class(usually defined as the negative class),which may do harm to the accuracy of the minority class(usually defined as the positive class),and then lead to poor overall performance of the model.A method called MSHR-FCSSVM for solving imbalanced data classification is proposed in this article,which is based on a new hybrid resampling approach(MSHR)and a new fine cost-sensitive support vector machine(CS-SVM)classifier(FCSSVM).The MSHR measures the separability of each negative sample through its Silhouette value calculated by Mahalanobis distance between samples,based on which,the so-called pseudo-negative samples are screened out to generate new positive samples(over-sampling step)through linear interpolation and are deleted finally(under-sampling step).This approach replaces pseudo-negative samples with generated new positive samples one by one to clear up the inter-class overlap on the borderline,without changing the overall scale of the dataset.The FCSSVM is an improved version of the traditional CS-SVM.It considers influences of both the imbalance of sample number and the class distribution on classification simultaneously,and through finely tuning the class cost weights by using the efficient optimization algorithm based on the physical phenomenon of rime-ice(RIME)algorithm with cross-validation accuracy as the fitness function to accurately adjust the classification borderline.To verify the effectiveness of the proposed method,a series of experiments are carried out based on 20 imbalanced datasets including both mildly and extremely imbalanced datasets.The experimental results show that the MSHR-FCSSVM method performs better than the methods for comparison in most cases,and both the MSHR and the FCSSVM played significant roles.展开更多
The complex sand-casting process combined with the interactions between process parameters makes it difficult to control the casting quality,resulting in a high scrap rate.A strategy based on a data-driven model was p...The complex sand-casting process combined with the interactions between process parameters makes it difficult to control the casting quality,resulting in a high scrap rate.A strategy based on a data-driven model was proposed to reduce casting defects and improve production efficiency,which includes the random forest(RF)classification model,the feature importance analysis,and the process parameters optimization with Monte Carlo simulation.The collected data includes four types of defects and corresponding process parameters were used to construct the RF model.Classification results show a recall rate above 90% for all categories.The Gini Index was used to assess the importance of the process parameters in the formation of various defects in the RF model.Finally,the classification model was applied to different production conditions for quality prediction.In the case of process parameters optimization for gas porosity defects,this model serves as an experimental process in the Monte Carlo method to estimate a better temperature distribution.The prediction model,when applied to the factory,greatly improved the efficiency of defect detection.Results show that the scrap rate decreased from 10.16% to 6.68%.展开更多
The network of Himalayan roadways and highways connects some remote regions of valleys or hill slopes,which is vital for India’s socio-economic growth.Due to natural and artificial factors,frequency of slope instabil...The network of Himalayan roadways and highways connects some remote regions of valleys or hill slopes,which is vital for India’s socio-economic growth.Due to natural and artificial factors,frequency of slope instabilities along the networks has been increasing over last few decades.Assessment of stability of natural and artificial slopes due to construction of these connecting road networks is significant in safely executing these roads throughout the year.Several rock mass classification methods are generally used to assess the strength and deformability of rock mass.This study assesses slope stability along the NH-1A of Ramban district of North Western Himalayas.Various structurally and non-structurally controlled rock mass classification systems have been applied to assess the stability conditions of 14 slopes.For evaluating the stability of these slopes,kinematic analysis was performed along with geological strength index(GSI),rock mass rating(RMR),continuous slope mass rating(CoSMR),slope mass rating(SMR),and Q-slope in the present study.The SMR gives three slopes as completely unstable while CoSMR suggests four slopes as completely unstable.The stability of all slopes was also analyzed using a design chart under dynamic and static conditions by slope stability rating(SSR)for the factor of safety(FoS)of 1.2 and 1 respectively.Q-slope with probability of failure(PoF)1%gives two slopes as stable slopes.Stable slope angle has been determined based on the Q-slope safe angle equation and SSR design chart based on the FoS.The value ranges given by different empirical classifications were RMR(37-74),GSI(27.3-58.5),SMR(11-59),and CoSMR(3.39-74.56).Good relationship was found among RMR&SSR and RMR&GSI with correlation coefficient(R 2)value of 0.815 and 0.6866,respectively.Lastly,a comparative stability of all these slopes based on the above classification has been performed to identify the most critical slope along this road.展开更多
Background: Cavernous transformation of the portal vein(CTPV) due to portal vein obstruction is a rare vascular anomaly defined as the formation of multiple collateral vessels in the hepatic hilum. This study aimed to...Background: Cavernous transformation of the portal vein(CTPV) due to portal vein obstruction is a rare vascular anomaly defined as the formation of multiple collateral vessels in the hepatic hilum. This study aimed to investigate the imaging features of intrahepatic portal vein in adult patients with CTPV and establish the relationship between the manifestations of intrahepatic portal vein and the progression of CTPV. Methods: We retrospectively analyzed 14 CTPV patients in Beijing Tsinghua Changgung Hospital. All patients underwent both direct portal venography(DPV) and computed tomography angiography(CTA) to reveal the manifestations of the portal venous system. The vessels measured included the left portal vein(LPV), right portal vein(RPV), main portal vein(MPV) and the portal vein bifurcation(PVB). Results: Nine males and 5 females, with a median age of 40.5 years, were included in the study. No significant difference was found in the diameters of the LPV or RPV measured by DPV and CTA. The visualization in terms of LPV, RPV and PVB measured by DPV was higher than that by CTA. There was a significant association between LPV/RPV and PVB/MPV in term of visibility revealed with DPV( P = 0.01), while this association was not observed with CTA. According to the imaging features of the portal vein measured by DPV, CTPV was classified into three categories to facilitate the diagnosis and treatment. Conclusions: DPV was more accurate than CTA for revealing the course of the intrahepatic portal vein in patients with CTPV. The classification of CTPV, that originated from the imaging features of the portal vein revealed by DPV, may provide a new perspective for the diagnosis and treatment of CTPV.展开更多
In this study,our aim is to address the problem of gene selection by proposing a hybrid bio-inspired evolutionary algorithm that combines Grey Wolf Optimization(GWO)with Harris Hawks Optimization(HHO)for feature selec...In this study,our aim is to address the problem of gene selection by proposing a hybrid bio-inspired evolutionary algorithm that combines Grey Wolf Optimization(GWO)with Harris Hawks Optimization(HHO)for feature selection.Themotivation for utilizingGWOandHHOstems fromtheir bio-inspired nature and their demonstrated success in optimization problems.We aimto leverage the strengths of these algorithms to enhance the effectiveness of feature selection in microarray-based cancer classification.We selected leave-one-out cross-validation(LOOCV)to evaluate the performance of both two widely used classifiers,k-nearest neighbors(KNN)and support vector machine(SVM),on high-dimensional cancer microarray data.The proposed method is extensively tested on six publicly available cancer microarray datasets,and a comprehensive comparison with recently published methods is conducted.Our hybrid algorithm demonstrates its effectiveness in improving classification performance,Surpassing alternative approaches in terms of precision.The outcomes confirm the capability of our method to substantially improve both the precision and efficiency of cancer classification,thereby advancing the development ofmore efficient treatment strategies.The proposed hybridmethod offers a promising solution to the gene selection problem in microarray-based cancer classification.It improves the accuracy and efficiency of cancer diagnosis and treatment,and its superior performance compared to other methods highlights its potential applicability in realworld cancer classification tasks.By harnessing the complementary search mechanisms of GWO and HHO,we leverage their bio-inspired behavior to identify informative genes relevant to cancer diagnosis and treatment.展开更多
In general,data contain noises which come from faulty instruments,flawed measurements or faulty communication.Learning with data in the context of classification or regression is inevitably affected by noises in the d...In general,data contain noises which come from faulty instruments,flawed measurements or faulty communication.Learning with data in the context of classification or regression is inevitably affected by noises in the data.In order to remove or greatly reduce the impact of noises,we introduce the ideas of fuzzy membership functions and the Laplacian twin support vector machine(Lap-TSVM).A formulation of the linear intuitionistic fuzzy Laplacian twin support vector machine(IFLap-TSVM)is presented.Moreover,we extend the linear IFLap-TSVM to the nonlinear case by kernel function.The proposed IFLap-TSVM resolves the negative impact of noises and outliers by using fuzzy membership functions and is a more accurate reasonable classi-fier by using the geometric distribution information of labeled data and unlabeled data based on manifold regularization.Experiments with constructed artificial datasets,several UCI benchmark datasets and MNIST dataset show that the IFLap-TSVM has better classification accuracy than other state-of-the-art twin support vector machine(TSVM),intuitionistic fuzzy twin support vector machine(IFTSVM)and Lap-TSVM.展开更多
Purpose:Many science,technology and innovation(STI)resources are attached with several different labels.To assign automatically the resulting labels to an interested instance,many approaches with good performance on t...Purpose:Many science,technology and innovation(STI)resources are attached with several different labels.To assign automatically the resulting labels to an interested instance,many approaches with good performance on the benchmark datasets have been proposed for multi-label classification task in the literature.Furthermore,several open-source tools implementing these approaches have also been developed.However,the characteristics of real-world multi-label patent and publication datasets are not completely in line with those of benchmark ones.Therefore,the main purpose of this paper is to evaluate comprehensively seven multi-label classification methods on real-world datasets.Research limitations:Three real-world datasets differ in the following aspects:statement,data quality,and purposes.Additionally,open-source tools designed for multi-label classification also have intrinsic differences in their approaches for data processing and feature selection,which in turn impacts the performance of a multi-label classification approach.In the near future,we will enhance experimental precision and reinforce the validity of conclusions by employing more rigorous control over variables through introducing expanded parameter settings.Practical implications:The observed Macro F1 and Micro F1 scores on real-world datasets typically fall short of those achieved on benchmark datasets,underscoring the complexity of real-world multi-label classification tasks.Approaches leveraging deep learning techniques offer promising solutions by accommodating the hierarchical relationships and interdependencies among labels.With ongoing enhancements in deep learning algorithms and large-scale models,it is expected that the efficacy of multi-label classification tasks will be significantly improved,reaching a level of practical utility in the foreseeable future.Originality/value:(1)Seven multi-label classification methods are comprehensively compared on three real-world datasets.(2)The TextCNN and TextRCNN models perform better on small-scale datasets with more complex hierarchical structure of labels and more balanced document-label distribution.(3)The MLkNN method works better on the larger-scale dataset with more unbalanced document-label distribution.展开更多
Hybrid precoding is considered as a promising low-cost technique for millimeter wave(mm-wave)massive Multi-Input Multi-Output(MIMO)systems.In this work,referring to the time-varying propagation circumstances,with semi...Hybrid precoding is considered as a promising low-cost technique for millimeter wave(mm-wave)massive Multi-Input Multi-Output(MIMO)systems.In this work,referring to the time-varying propagation circumstances,with semi-supervised Incremental Learning(IL),we propose an online hybrid beamforming scheme.Firstly,given the constraint of constant modulus on analog beamformer and combiner,we propose a new broadnetwork-based structure for the design model of hybrid beamforming.Compared with the existing network structure,the proposed network structure can achieve better transmission performance and lower complexity.Moreover,to enhance the efficiency of IL further,by combining the semi-supervised graph with IL,we propose a hybrid beamforming scheme based on chunk-by-chunk semi-supervised learning,where only few transmissions are required to calculate the label and all other unlabelled transmissions would also be put into a training data chunk.Unlike the existing single-by-single approach where transmissions during the model update are not taken into the consideration of model update,all transmissions,even the ones during the model update,would make contributions to model update in the proposed method.During the model update,the amount of unlabelled transmissions is very large and they also carry some information,the prediction performance can be enhanced to some extent by these unlabelled channel data.Simulation results demonstrate the spectral efficiency of the proposed method outperforms that of the existing single-by-single approach.Besides,we prove the general complexity of the proposed method is lower than that of the existing approach and give the condition under which its absolute complexity outperforms that of the existing approach.展开更多
Timely inspection of defects on the surfaces of wind turbine blades can effectively prevent unpredictable accidents.To this end,this study proposes a semi-supervised object-detection network based on You Only Looking ...Timely inspection of defects on the surfaces of wind turbine blades can effectively prevent unpredictable accidents.To this end,this study proposes a semi-supervised object-detection network based on You Only Looking Once version 4(YOLOv4).A semi-supervised structure comprising a generative adversarial network(GAN)was designed to overcome the difficulty in obtaining sufficient samples and sample labeling.In a GAN,the generator is realized by an encoder-decoder network,where the backbone of the encoder is YOLOv4 and the decoder comprises inverse convolutional layers.Partial features from the generator are passed to the defect detection network.Deploying several unlabeled images can significantly improve the generalization and recognition capabilities of defect-detection models.The small-scale object detection capacity of the network can be improved by enhancing essential features in the feature map by adding the concurrent spatial and channel squeeze and excitation(scSE)attention module to the three parts of the YOLOv4 network.A balancing improvement was made to the loss function of YOLOv4 to overcome the imbalance problem of the defective species.The results for both the single-and multi-category defect datasets show that the improved model can make good use of the features of the unlabeled images.The accuracy of wind turbine blade defect detection also has a significant advantage over classical object detection algorithms,including faster R-CNN and DETR.展开更多
With the rapid development of Internet of Things(IoT)technology,IoT systems have been widely applied in health-care,transportation,home,and other fields.However,with the continuous expansion of the scale and increasin...With the rapid development of Internet of Things(IoT)technology,IoT systems have been widely applied in health-care,transportation,home,and other fields.However,with the continuous expansion of the scale and increasing complexity of IoT systems,the stability and security issues of IoT systems have become increasingly prominent.Thus,it is crucial to detect anomalies in the collected IoT time series from various sensors.Recently,deep learning models have been leveraged for IoT anomaly detection.However,owing to the challenges associated with data labeling,most IoT anomaly detection methods resort to unsupervised learning techniques.Nevertheless,the absence of accurate abnormal information in unsupervised learning methods limits their performance.To address these problems,we propose AS-GCN-MTM,an adaptive structural Graph Convolutional Networks(GCN)-based framework using a mean-teacher mechanism(AS-GCN-MTM)for anomaly identification.It performs better than unsupervised methods using only a small amount of labeled data.Mean Teachers is an effective semi-supervised learning method that utilizes unlabeled data for training to improve the generalization ability and performance of the model.However,the dependencies between data are often unknown in time series data.To solve this problem,we designed a graph structure adaptive learning layer based on neural networks,which can automatically learn the graph structure from time series data.It not only better captures the relationships between nodes but also enhances the model’s performance by augmenting key data.Experiments have demonstrated that our method improves the baseline model with the highest F1 value by 10.4%,36.1%,and 5.6%,respectively,on three real datasets with a 10%data labeling rate.展开更多
Active learning in semi-supervised classification involves introducing additional labels for unlabelled data to improve the accuracy of the underlying classifier.A challenge is to identify which points to label to bes...Active learning in semi-supervised classification involves introducing additional labels for unlabelled data to improve the accuracy of the underlying classifier.A challenge is to identify which points to label to best improve performance while limiting the number of new labels."Model Change"active learning quantifies the resulting change incurred in the classifier by introducing the additional label(s).We pair this idea with graph-based semi-supervised learning(SSL)methods,that use the spectrum of the graph Laplacian matrix,which can be truncated to avoid prohibitively large computational and storage costs.We consider a family of convex loss functions for which the acquisition function can be efficiently approximated using the Laplace approximation of the posterior distribution.We show a variety of multiclass examples that illustrate improved performance over prior state-of-art.展开更多
基金supported in part by the National Natural Science Foundation of China under Grant No.62171334,No.11973077 and No.12003061。
文摘With the successive application of deep learning(DL)in classification tasks,the DL-based modulation classification method has become the preference for its state-of-the-art performance.Nevertheless,once the DL recognition model is pre-trained with fixed classes,the pre-trained model tends to predict incorrect results when identifying incremental classes.Moreover,the incremental classes are usually emergent without label information or only a few labeled samples of incremental classes can be obtained.In this context,we propose a graphbased semi-supervised approach to address the fewshot classes-incremental(FSCI)modulation classification problem.Our proposed method is a twostage learning method,specifically,a warm-up model is trained for classifying old classes and incremental classes,where the unlabeled samples of incremental classes are uniformly labeled with the same label to alleviate the damage of the class imbalance problem.Then the warm-up model is regarded as a feature extractor for constructing a similar graph to connect labeled samples and unlabeled samples,and the label propagation algorithm is adopted to propagate the label information from labeled nodes to unlabeled nodes in the graph to achieve the purpose of incremental classes recognition.Simulation results prove that the proposed method is superior to other finetuning methods and retrain methods.
基金the support received from the Excellent Doctoral Dissertation Fund of Air Force Engineering University,China.
文摘Online target maneuver recognition is an important prerequisite for air combat situation recognition and maneuver decision-making.Conventional target maneuver recognition methods adopt mainly supervised learning methods and assume that many sample labels are available.However,in real-world applications,manual sample labeling is often time-consuming and laborious.In addition,airborne sensors collecting target maneuver trajectory information in data streams often cannot process information in real time.To solve these problems,in this paper,an air combat target maneuver recognition model based on an online ensemble semi-supervised classification framework based on online learning,ensemble learning,semi-supervised learning,and Tri-training algorithm,abbreviated as Online Ensemble Semi-supervised Classification Framework(OESCF),is proposed.The framework is divided into four parts:basic classifier offline training stage,online recognition model initialization stage,target maneuver online recognition stage,and online model update stage.Firstly,based on the improved Tri-training algorithm and the fusion decision filtering strategy combined with disagreement,basic classifiers are trained offline by making full use of labeled and unlabeled sample data.Secondly,the dynamic density clustering algorithm of the target maneuver is performed,statistical information of each cluster is calculated,and a set of micro-clusters is obtained to initialize the online recognition model.Thirdly,the ensemble K-Nearest Neighbor(KNN)-based learning method is used to recognize the incoming target maneuver trajectory instances.Finally,to further improve the accuracy and adaptability of the model under the condition of high dynamic air combat,the parameters of the model are updated online using error-driven representation learning,exponential decay function and basic classifier obtained in the offline training stage.The experimental results on several University of California Irvine(UCI)datasets and real air combat target maneuver trajectory data validate the effectiveness of the proposed method in comparison with other semi-supervised models and supervised models,and the results show that the proposed model achieves higher classification accuracy.
基金This research is supported by Fundamental Research Grant Scheme(FRGS),Ministry of Education Malaysia(MOE)under the project code,FRGS/1/2018/ICT02/USM/02/9 titled,Automated Big Data Annotation for Training Semi-Supervised Deep Learning Model in Sentiment Classification.
文摘Sentiment classification is a useful tool to classify reviews about sentiments and attitudes towards a product or service.Existing studies heavily rely on sentiment classification methods that require fully annotated inputs.However,there is limited labelled text available,making the acquirement process of the fully annotated input costly and labour-intensive.Lately,semi-supervised methods emerge as they require only partially labelled input but perform comparably to supervised methods.Nevertheless,some works reported that the performance of the semi-supervised model degraded after adding unlabelled instances into training.Literature also shows that not all unlabelled instances are equally useful;thus identifying the informative unlabelled instances is beneficial in training a semi-supervised model.To achieve this,an informative score is proposed and incorporated into semisupervised sentiment classification.The evaluation is performed on a semisupervised method without an informative score and with an informative score.By using the informative score in the instance selection strategy to identify informative unlabelled instances,semi-supervised models perform better compared to models that do not incorporate informative scores into their training.Although the performance of semi-supervised models incorporated with an informative score is not able to surpass the supervised models,the results are still found promising as the differences in performance are subtle with a small difference of 2%to 5%,but the number of labelled instances used is greatly reduced from100%to 40%.The best finding of the proposed instance selection strategy is achieved when incorporating an informative score with a baseline confidence score at a 0.5:0.5 ratio using only 40%labelled data.
基金The publication of this article is funded by the Qatar National Library.
文摘Malaria is a lethal disease responsible for thousands of deaths worldwide every year.Manual methods of malaria diagnosis are timeconsuming that require a great deal of human expertise and efforts.Computerbased automated diagnosis of diseases is progressively becoming popular.Although deep learning models show high performance in the medical field,it demands a large volume of data for training which is hard to acquire for medical problems.Similarly,labeling of medical images can be done with the help of medical experts only.Several recent studies have utilized deep learning models to develop efficient malaria diagnostic system,which showed promising results.However,the most common problem with these models is that they need a large amount of data for training.This paper presents a computer-aided malaria diagnosis system that combines a semi-supervised generative adversarial network and transfer learning.The proposed model is trained in a semi-supervised manner and requires less training data than conventional deep learning models.Performance of the proposed model is evaluated on a publicly available dataset of blood smear images(with malariainfected and normal class)and achieved a classification accuracy of 96.6%.
基金supported by the Fifth Key Project of Jiangsu Vocational Education Teaching Reform Research under Grant ZZZ13in part by the Science and Technology Project of Changzhou City under Grant CE20215032.
文摘Through semi-supervised learning and knowledge inheritance,a novel Takagi-Sugeno-Kang(TSK)fuzzy system framework is proposed for epilepsy data classification in this study.The new method is based on the maximum mean discrepancy(MMD)method and TSK fuzzy system,as a basic model for the classification of epilepsy data.First,formedical data,the interpretability of TSK fuzzy systems can ensure that the prediction results are traceable and safe.Second,in view of the deviation in the data distribution between the real source domain and the target domain,MMD is used to measure the distance between different data distributions.The objective function is constructed according to the MMD distance,and the distribution distance of different datasets is minimized to find the similar characteristics of different datasets.We introduce semi-supervised learning to further explore the relationship between data.Based on the MMD method,a semi-supervised learning(SSL)-MMD method is constructed by using pseudo-tags to realize the data distribution alignment of the same category.In addition,the idea of knowledge dissemination is used to learn pseudo-tags as additional data features.Finally,for epilepsy classification,the cross-domain TSK fuzzy system uses the cross-entropy function as the objective function and adopts the back-propagation strategy to optimize the parameters.The experimental results show that the new method can process complex epilepsy data and identify whether patients have epilepsy.
基金Supported by the Hi-Tech Research and Development Program of China (No. 2009AAJ130)
文摘Non-collaborative radio transmitter recognition is a significant but challenging issue, since it is hard or costly to obtain labeled training data samples. In order to make effective use of the unlabeled samples which can be obtained much easier, a novel semi-supervised classification method named Elastic Sparsity Regularized Support Vector Machine (ESRSVM) is proposed for radio transmitter classification. ESRSVM first constructs an elastic-net graph over data samples to capture the robust and natural discriminating information and then incorporate the information into the manifold learning framework by an elastic sparsity regularization term. Experimental results on 10 GMSK modulated Automatic Identification System radios and 15 FM walkie-talkie radios show that ESRSVM achieves obviously better performance than KNN and SVM, which use only labeled samples for classification, and also outperforms semi-supervised classifier LapSVM based on manifold regularization.
基金supported in part by the Nationa Natural Science Foundation of China (61876011)the National Key Research and Development Program of China (2022YFB4703700)+1 种基金the Key Research and Development Program 2020 of Guangzhou (202007050002)the Key-Area Research and Development Program of Guangdong Province (2020B090921003)。
文摘Recently, there have been some attempts of Transformer in 3D point cloud classification. In order to reduce computations, most existing methods focus on local spatial attention,but ignore their content and fail to establish relationships between distant but relevant points. To overcome the limitation of local spatial attention, we propose a point content-based Transformer architecture, called PointConT for short. It exploits the locality of points in the feature space(content-based), which clusters the sampled points with similar features into the same class and computes the self-attention within each class, thus enabling an effective trade-off between capturing long-range dependencies and computational complexity. We further introduce an inception feature aggregator for point cloud classification, which uses parallel structures to aggregate high-frequency and low-frequency information in each branch separately. Extensive experiments show that our PointConT model achieves a remarkable performance on point cloud shape classification. Especially, our method exhibits 90.3% Top-1 accuracy on the hardest setting of ScanObjectN N. Source code of this paper is available at https://github.com/yahuiliu99/PointC onT.
基金supported by the Yunnan Major Scientific and Technological Projects(Grant No.202302AD080001)the National Natural Science Foundation,China(No.52065033).
文摘When building a classification model,the scenario where the samples of one class are significantly more than those of the other class is called data imbalance.Data imbalance causes the trained classification model to be in favor of the majority class(usually defined as the negative class),which may do harm to the accuracy of the minority class(usually defined as the positive class),and then lead to poor overall performance of the model.A method called MSHR-FCSSVM for solving imbalanced data classification is proposed in this article,which is based on a new hybrid resampling approach(MSHR)and a new fine cost-sensitive support vector machine(CS-SVM)classifier(FCSSVM).The MSHR measures the separability of each negative sample through its Silhouette value calculated by Mahalanobis distance between samples,based on which,the so-called pseudo-negative samples are screened out to generate new positive samples(over-sampling step)through linear interpolation and are deleted finally(under-sampling step).This approach replaces pseudo-negative samples with generated new positive samples one by one to clear up the inter-class overlap on the borderline,without changing the overall scale of the dataset.The FCSSVM is an improved version of the traditional CS-SVM.It considers influences of both the imbalance of sample number and the class distribution on classification simultaneously,and through finely tuning the class cost weights by using the efficient optimization algorithm based on the physical phenomenon of rime-ice(RIME)algorithm with cross-validation accuracy as the fitness function to accurately adjust the classification borderline.To verify the effectiveness of the proposed method,a series of experiments are carried out based on 20 imbalanced datasets including both mildly and extremely imbalanced datasets.The experimental results show that the MSHR-FCSSVM method performs better than the methods for comparison in most cases,and both the MSHR and the FCSSVM played significant roles.
基金financially supported by the National Key Research and Development Program of China(2022YFB3706800,2020YFB1710100)the National Natural Science Foundation of China(51821001,52090042,52074183)。
文摘The complex sand-casting process combined with the interactions between process parameters makes it difficult to control the casting quality,resulting in a high scrap rate.A strategy based on a data-driven model was proposed to reduce casting defects and improve production efficiency,which includes the random forest(RF)classification model,the feature importance analysis,and the process parameters optimization with Monte Carlo simulation.The collected data includes four types of defects and corresponding process parameters were used to construct the RF model.Classification results show a recall rate above 90% for all categories.The Gini Index was used to assess the importance of the process parameters in the formation of various defects in the RF model.Finally,the classification model was applied to different production conditions for quality prediction.In the case of process parameters optimization for gas porosity defects,this model serves as an experimental process in the Monte Carlo method to estimate a better temperature distribution.The prediction model,when applied to the factory,greatly improved the efficiency of defect detection.Results show that the scrap rate decreased from 10.16% to 6.68%.
文摘The network of Himalayan roadways and highways connects some remote regions of valleys or hill slopes,which is vital for India’s socio-economic growth.Due to natural and artificial factors,frequency of slope instabilities along the networks has been increasing over last few decades.Assessment of stability of natural and artificial slopes due to construction of these connecting road networks is significant in safely executing these roads throughout the year.Several rock mass classification methods are generally used to assess the strength and deformability of rock mass.This study assesses slope stability along the NH-1A of Ramban district of North Western Himalayas.Various structurally and non-structurally controlled rock mass classification systems have been applied to assess the stability conditions of 14 slopes.For evaluating the stability of these slopes,kinematic analysis was performed along with geological strength index(GSI),rock mass rating(RMR),continuous slope mass rating(CoSMR),slope mass rating(SMR),and Q-slope in the present study.The SMR gives three slopes as completely unstable while CoSMR suggests four slopes as completely unstable.The stability of all slopes was also analyzed using a design chart under dynamic and static conditions by slope stability rating(SSR)for the factor of safety(FoS)of 1.2 and 1 respectively.Q-slope with probability of failure(PoF)1%gives two slopes as stable slopes.Stable slope angle has been determined based on the Q-slope safe angle equation and SSR design chart based on the FoS.The value ranges given by different empirical classifications were RMR(37-74),GSI(27.3-58.5),SMR(11-59),and CoSMR(3.39-74.56).Good relationship was found among RMR&SSR and RMR&GSI with correlation coefficient(R 2)value of 0.815 and 0.6866,respectively.Lastly,a comparative stability of all these slopes based on the above classification has been performed to identify the most critical slope along this road.
文摘Background: Cavernous transformation of the portal vein(CTPV) due to portal vein obstruction is a rare vascular anomaly defined as the formation of multiple collateral vessels in the hepatic hilum. This study aimed to investigate the imaging features of intrahepatic portal vein in adult patients with CTPV and establish the relationship between the manifestations of intrahepatic portal vein and the progression of CTPV. Methods: We retrospectively analyzed 14 CTPV patients in Beijing Tsinghua Changgung Hospital. All patients underwent both direct portal venography(DPV) and computed tomography angiography(CTA) to reveal the manifestations of the portal venous system. The vessels measured included the left portal vein(LPV), right portal vein(RPV), main portal vein(MPV) and the portal vein bifurcation(PVB). Results: Nine males and 5 females, with a median age of 40.5 years, were included in the study. No significant difference was found in the diameters of the LPV or RPV measured by DPV and CTA. The visualization in terms of LPV, RPV and PVB measured by DPV was higher than that by CTA. There was a significant association between LPV/RPV and PVB/MPV in term of visibility revealed with DPV( P = 0.01), while this association was not observed with CTA. According to the imaging features of the portal vein measured by DPV, CTPV was classified into three categories to facilitate the diagnosis and treatment. Conclusions: DPV was more accurate than CTA for revealing the course of the intrahepatic portal vein in patients with CTPV. The classification of CTPV, that originated from the imaging features of the portal vein revealed by DPV, may provide a new perspective for the diagnosis and treatment of CTPV.
基金the Deputyship for Research and Innovation,“Ministry of Education”in Saudi Arabia for funding this research(IFKSUOR3-014-3).
文摘In this study,our aim is to address the problem of gene selection by proposing a hybrid bio-inspired evolutionary algorithm that combines Grey Wolf Optimization(GWO)with Harris Hawks Optimization(HHO)for feature selection.Themotivation for utilizingGWOandHHOstems fromtheir bio-inspired nature and their demonstrated success in optimization problems.We aimto leverage the strengths of these algorithms to enhance the effectiveness of feature selection in microarray-based cancer classification.We selected leave-one-out cross-validation(LOOCV)to evaluate the performance of both two widely used classifiers,k-nearest neighbors(KNN)and support vector machine(SVM),on high-dimensional cancer microarray data.The proposed method is extensively tested on six publicly available cancer microarray datasets,and a comprehensive comparison with recently published methods is conducted.Our hybrid algorithm demonstrates its effectiveness in improving classification performance,Surpassing alternative approaches in terms of precision.The outcomes confirm the capability of our method to substantially improve both the precision and efficiency of cancer classification,thereby advancing the development ofmore efficient treatment strategies.The proposed hybridmethod offers a promising solution to the gene selection problem in microarray-based cancer classification.It improves the accuracy and efficiency of cancer diagnosis and treatment,and its superior performance compared to other methods highlights its potential applicability in realworld cancer classification tasks.By harnessing the complementary search mechanisms of GWO and HHO,we leverage their bio-inspired behavior to identify informative genes relevant to cancer diagnosis and treatment.
基金This work was supported by the National Natural Science Foundation of China(No.11771275)The second author thanks the partially support of Dutch Research Council(No.040.11.724).
文摘In general,data contain noises which come from faulty instruments,flawed measurements or faulty communication.Learning with data in the context of classification or regression is inevitably affected by noises in the data.In order to remove or greatly reduce the impact of noises,we introduce the ideas of fuzzy membership functions and the Laplacian twin support vector machine(Lap-TSVM).A formulation of the linear intuitionistic fuzzy Laplacian twin support vector machine(IFLap-TSVM)is presented.Moreover,we extend the linear IFLap-TSVM to the nonlinear case by kernel function.The proposed IFLap-TSVM resolves the negative impact of noises and outliers by using fuzzy membership functions and is a more accurate reasonable classi-fier by using the geometric distribution information of labeled data and unlabeled data based on manifold regularization.Experiments with constructed artificial datasets,several UCI benchmark datasets and MNIST dataset show that the IFLap-TSVM has better classification accuracy than other state-of-the-art twin support vector machine(TSVM),intuitionistic fuzzy twin support vector machine(IFTSVM)and Lap-TSVM.
基金the Natural Science Foundation of China(Grant Numbers 72074014 and 72004012).
文摘Purpose:Many science,technology and innovation(STI)resources are attached with several different labels.To assign automatically the resulting labels to an interested instance,many approaches with good performance on the benchmark datasets have been proposed for multi-label classification task in the literature.Furthermore,several open-source tools implementing these approaches have also been developed.However,the characteristics of real-world multi-label patent and publication datasets are not completely in line with those of benchmark ones.Therefore,the main purpose of this paper is to evaluate comprehensively seven multi-label classification methods on real-world datasets.Research limitations:Three real-world datasets differ in the following aspects:statement,data quality,and purposes.Additionally,open-source tools designed for multi-label classification also have intrinsic differences in their approaches for data processing and feature selection,which in turn impacts the performance of a multi-label classification approach.In the near future,we will enhance experimental precision and reinforce the validity of conclusions by employing more rigorous control over variables through introducing expanded parameter settings.Practical implications:The observed Macro F1 and Micro F1 scores on real-world datasets typically fall short of those achieved on benchmark datasets,underscoring the complexity of real-world multi-label classification tasks.Approaches leveraging deep learning techniques offer promising solutions by accommodating the hierarchical relationships and interdependencies among labels.With ongoing enhancements in deep learning algorithms and large-scale models,it is expected that the efficacy of multi-label classification tasks will be significantly improved,reaching a level of practical utility in the foreseeable future.Originality/value:(1)Seven multi-label classification methods are comprehensively compared on three real-world datasets.(2)The TextCNN and TextRCNN models perform better on small-scale datasets with more complex hierarchical structure of labels and more balanced document-label distribution.(3)The MLkNN method works better on the larger-scale dataset with more unbalanced document-label distribution.
基金supported by the National Science Foundation of China under Grant No.62101467.
文摘Hybrid precoding is considered as a promising low-cost technique for millimeter wave(mm-wave)massive Multi-Input Multi-Output(MIMO)systems.In this work,referring to the time-varying propagation circumstances,with semi-supervised Incremental Learning(IL),we propose an online hybrid beamforming scheme.Firstly,given the constraint of constant modulus on analog beamformer and combiner,we propose a new broadnetwork-based structure for the design model of hybrid beamforming.Compared with the existing network structure,the proposed network structure can achieve better transmission performance and lower complexity.Moreover,to enhance the efficiency of IL further,by combining the semi-supervised graph with IL,we propose a hybrid beamforming scheme based on chunk-by-chunk semi-supervised learning,where only few transmissions are required to calculate the label and all other unlabelled transmissions would also be put into a training data chunk.Unlike the existing single-by-single approach where transmissions during the model update are not taken into the consideration of model update,all transmissions,even the ones during the model update,would make contributions to model update in the proposed method.During the model update,the amount of unlabelled transmissions is very large and they also carry some information,the prediction performance can be enhanced to some extent by these unlabelled channel data.Simulation results demonstrate the spectral efficiency of the proposed method outperforms that of the existing single-by-single approach.Besides,we prove the general complexity of the proposed method is lower than that of the existing approach and give the condition under which its absolute complexity outperforms that of the existing approach.
基金supported in part by the National Natural Science Foundation of China under grants 62202044 and 62372039Scientific and Technological Innovation Foundation of Foshan under grant BK22BF009+3 种基金Excellent Youth Team Project for the Central Universities under grant FRF-EYIT-23-01Fundamental Research Funds for the Central Universities under grants 06500103 and 06500078Guangdong Basic and Applied Basic Research Foundation under grant 2022A1515240044Beijing Natural Science Foundation under grant 4232040.
文摘Timely inspection of defects on the surfaces of wind turbine blades can effectively prevent unpredictable accidents.To this end,this study proposes a semi-supervised object-detection network based on You Only Looking Once version 4(YOLOv4).A semi-supervised structure comprising a generative adversarial network(GAN)was designed to overcome the difficulty in obtaining sufficient samples and sample labeling.In a GAN,the generator is realized by an encoder-decoder network,where the backbone of the encoder is YOLOv4 and the decoder comprises inverse convolutional layers.Partial features from the generator are passed to the defect detection network.Deploying several unlabeled images can significantly improve the generalization and recognition capabilities of defect-detection models.The small-scale object detection capacity of the network can be improved by enhancing essential features in the feature map by adding the concurrent spatial and channel squeeze and excitation(scSE)attention module to the three parts of the YOLOv4 network.A balancing improvement was made to the loss function of YOLOv4 to overcome the imbalance problem of the defective species.The results for both the single-and multi-category defect datasets show that the improved model can make good use of the features of the unlabeled images.The accuracy of wind turbine blade defect detection also has a significant advantage over classical object detection algorithms,including faster R-CNN and DETR.
基金This research is partially supported by the National Natural Science Foundation of China under Grant No.62376043Science and Technology Program of Sichuan Province under Grant Nos.2020JDRC0067,2023JDRC0087,and 24NSFTD0025.
文摘With the rapid development of Internet of Things(IoT)technology,IoT systems have been widely applied in health-care,transportation,home,and other fields.However,with the continuous expansion of the scale and increasing complexity of IoT systems,the stability and security issues of IoT systems have become increasingly prominent.Thus,it is crucial to detect anomalies in the collected IoT time series from various sensors.Recently,deep learning models have been leveraged for IoT anomaly detection.However,owing to the challenges associated with data labeling,most IoT anomaly detection methods resort to unsupervised learning techniques.Nevertheless,the absence of accurate abnormal information in unsupervised learning methods limits their performance.To address these problems,we propose AS-GCN-MTM,an adaptive structural Graph Convolutional Networks(GCN)-based framework using a mean-teacher mechanism(AS-GCN-MTM)for anomaly identification.It performs better than unsupervised methods using only a small amount of labeled data.Mean Teachers is an effective semi-supervised learning method that utilizes unlabeled data for training to improve the generalization ability and performance of the model.However,the dependencies between data are often unknown in time series data.To solve this problem,we designed a graph structure adaptive learning layer based on neural networks,which can automatically learn the graph structure from time series data.It not only better captures the relationships between nodes but also enhances the model’s performance by augmenting key data.Experiments have demonstrated that our method improves the baseline model with the highest F1 value by 10.4%,36.1%,and 5.6%,respectively,on three real datasets with a 10%data labeling rate.
基金supported by the DOD National Defense Science and Engineering Graduate(NDSEG)Research Fellowshipsupported by the NGA under Contract No.HM04762110003.
文摘Active learning in semi-supervised classification involves introducing additional labels for unlabelled data to improve the accuracy of the underlying classifier.A challenge is to identify which points to label to best improve performance while limiting the number of new labels."Model Change"active learning quantifies the resulting change incurred in the classifier by introducing the additional label(s).We pair this idea with graph-based semi-supervised learning(SSL)methods,that use the spectrum of the graph Laplacian matrix,which can be truncated to avoid prohibitively large computational and storage costs.We consider a family of convex loss functions for which the acquisition function can be efficiently approximated using the Laplace approximation of the posterior distribution.We show a variety of multiclass examples that illustrate improved performance over prior state-of-art.