The aim of this paper is to broaden the application of Stochastic Configuration Networks (SCN) in the semi-supervised domain by utilizing the unlabeled data that is common in daily life. Doing so can enhance the classification accuracy of decentralized SCN algorithms while effectively protecting user privacy. To this end, we propose a decentralized semi-supervised learning algorithm for SCN, called DMT-SCN, which introduces teacher and student models and combines them with the idea of consistency regularization to improve the response speed of model iterations. To reduce the possible negative impact of unlabeled data on the model, we deliberately change the way noise is added to the unlabeled data. Simulation results show that the algorithm can effectively utilize unlabeled data to improve the classification accuracy of SCN training and is robust under different ground simulation environments.
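The teacher-student idea with consistency regularization mentioned in the abstract above can be illustrated with a small sketch. The abstract does not give the update rule, so the exponential-moving-average teacher and the squared-error consistency term below are assumptions borrowed from the generic mean-teacher scheme, and the names `ema_update`, `consistency_loss`, and `decay` are hypothetical.

```python
def ema_update(teacher, student, decay=0.99):
    """Move each teacher weight toward the matching student weight.

    Assumption: the teacher is an exponential moving average of the student,
    as in the generic mean-teacher scheme (not confirmed for DMT-SCN).
    """
    return [decay * t + (1.0 - decay) * s for t, s in zip(teacher, student)]


def consistency_loss(student_probs, teacher_probs):
    """Mean squared difference between two class-probability vectors."""
    n = len(student_probs)
    return sum((s - t) ** 2 for s, t in zip(student_probs, teacher_probs)) / n


# Toy weights and predictions, purely illustrative.
teacher = [0.0, 1.0]
student = [1.0, 0.0]
teacher = ema_update(teacher, student, decay=0.9)

# Predictions of both models on the same unlabeled sample under different noise.
loss = consistency_loss([0.7, 0.3], [0.6, 0.4])
```

In such a scheme the consistency term is added to the supervised loss, so unlabeled samples contribute to training only through the agreement between the two models.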
Using resting-state functional magnetic resonance imaging (fMRI) to assist in identifying brain diseases has great potential. Graph-based models have been widely used for this task, where the graph represents the similarity between patients or between brain regions of interest. In these models, constructing high-quality graphs is of paramount importance. Researchers have proposed various methods for constructing graphs from different perspectives, among which the simplest and most popular is Pearson Correlation (PC). Although existing methods have achieved significant results, the graphs are usually fixed once constructed and are generally handled separately from the downstream task. Such a separation may result in neither the constructed graph nor the extracted features being ideal. To solve this problem, we use the graph-optimized locality preserving projection algorithm to learn the features and the population graph simultaneously, aiming at higher identification accuracy through task-dependent automatic optimization of the graph. At the same time, we incorporate supervised information to enable more flexible modelling. Specifically, the proposed method first uses PC to construct a graph as the initial feature for each subject. Then, the projection matrix and the graph are iteratively optimized through graph-optimized locality preserving projections based on semi-supervised learning, which fully employs the knowledge in various transformation spaces. Finally, the obtained projection matrix is applied to construct the subject-level graph, and classification is performed using support vector machines. To verify the effectiveness of the proposed method, we conduct experiments to distinguish subjects with mild cognitive impairment (MCI) and autism spectrum disorder (ASD) from normal controls (NCs); the results show that the classification performance of our method is better than that of the baseline method.
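The first step described in this abstract, building a per-subject graph from Pearson correlations between regional signals, can be sketched as follows. This is only an illustration of the PC step, not the authors' pipeline; the function names and the toy `roi_signals` values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length signals.

    Note: a constant signal has zero variance and would cause a division by zero.
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def pc_graph(series):
    """Dense weighted adjacency matrix: entry (i, j) is the PC of regions i and j."""
    k = len(series)
    return [[pearson(series[i], series[j]) for j in range(k)] for i in range(k)]


# Three toy "brain region" time series (illustrative values only).
roi_signals = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],
    [4.0, 3.0, 2.0, 1.0],
]
graph = pc_graph(roi_signals)
```

A graph built this way is fixed once computed, which is the limitation the abstract says the iterative graph optimization is meant to address.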
Traditional expert-designed branching rules in branch-and-bound (B&B) are static and often fail to adapt to diverse and evolving problem instances. Crafting these rules is labor-intensive and may not scale well to complex problems. Given the frequent need to solve varied combinatorial optimization problems, leveraging statistical learning to auto-tune B&B algorithms for specific problem classes becomes attractive. This paper proposes a graph pointer network model to learn the branching rules. Graph features, global features, and historical features are designed to represent the solver state. The graph neural network processes the graph features, while the pointer mechanism assimilates the global and historical features to finally determine the variable on which to branch. The model is trained to imitate the expert strong branching rule using a tailored top-k Kullback-Leibler divergence loss function. Experiments on a series of benchmark problems demonstrate that the proposed approach significantly outperforms the widely used expert-designed branching rules. It also outperforms state-of-the-art machine-learning-based branch-and-bound methods in terms of solving speed and search tree size on all the test instances. In addition, the model can generalize to unseen instances and scale to larger instances.
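The abstract names a top-k Kullback-Leibler divergence loss for imitating strong branching but does not define it. One plausible reading, shown here purely as an assumption, keeps the k candidate variables the expert scores highest, renormalizes expert and model scores over those candidates with a softmax, and computes KL(expert || model). The names `topk_kl`, `expert`, and the toy scores are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def topk_kl(expert_scores, model_scores, k):
    """KL(expert || model) restricted to the expert's k best candidates.

    Assumption: this is one possible definition, not the paper's stated loss.
    """
    top = sorted(range(len(expert_scores)),
                 key=lambda i: expert_scores[i], reverse=True)[:k]
    p = softmax([expert_scores[i] for i in top])
    q = softmax([model_scores[i] for i in top])
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


expert = [3.0, 1.0, 0.5, -2.0]                       # strong-branching scores (toy)
same = topk_kl(expert, expert, k=2)                  # model agrees with expert
diff = topk_kl(expert, [0.0, 2.0, 0.5, 3.0], k=2)    # model ranks differently
```

Restricting the loss to the top candidates concentrates the training signal on the variables that matter for the branching decision instead of on the long tail of poor candidates.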
Wheat is a critical crop, extensively consumed worldwide, and enhancing its production is essential to meet escalating demand. Diseases such as stem rust, leaf rust, yellow rust, and tan spot significantly diminish wheat yield, making the early and precise identification of these diseases vital for effective disease management. With advancements in deep learning algorithms, researchers have proposed many methods for the automated detection of disease pathogens; however, accurately detecting multiple disease pathogens simultaneously remains a challenge. This challenge arises from the scarcity of RGB images for multiple diseases, class imbalance in existing public datasets, and the difficulty of extracting features that discriminate between multiple classes of disease pathogens. In this research, a novel method based on Transfer Generative Adversarial Networks is proposed for augmenting existing data, thereby overcoming the problems of class imbalance and data scarcity. This study proposes a customized Vision Transformer (ViT) architecture, where the feature vector is obtained by concatenating features extracted from the custom ViT and from Graph Neural Networks. This paper also proposes a Model Agnostic Meta Learning (MAML)-based ensemble classifier for accurate classification. The proposed model, validated on public datasets for wheat disease pathogen classification, achieved a test accuracy of 99.20% and an F1-score of 97.95%. Compared with existing state-of-the-art methods, the proposed model performs better in terms of accuracy, F1-score, and the number of disease pathogens detected. In the future, more diseases can be included for detection, along with other modalities such as pests and weeds.
Radio frequency fingerprinting (RFF) is a remarkable lightweight authentication scheme that supports rapid and scalable identification in Internet of Things (IoT) systems. Deep learning (DL) is a critical enabler of RFF identification because it leverages hardware-level features. However, traditional supervised learning methods require a huge number of labeled training samples. Therefore, establishing a high-performance supervised learning model with few labels in practical applications is still challenging. To address this issue, we propose a novel RFF semi-supervised learning (RFFSSL) model that can obtain better performance with few meta labels. Specifically, the proposed RFFSSL model consists of a teacher-student network, in which the student network learns from the pseudo labels predicted by the teacher. Then, the output of the student model is exploited to improve the performance of the teacher on the labeled data. Furthermore, a comprehensive evaluation of the accuracy is conducted. We obtain about 50 GB of raw signal data from real long-term evolution (LTE) mobile phones, which is used to evaluate various models. Experimental results demonstrate that the proposed RFFSSL scheme can achieve up to 97% experimental testing accuracy in a noisy environment with only 10% labeled samples when the number of training samples equals 2700.
Intrusion detection involves identifying unauthorized network activity and recognizing whether the data constitute an abnormal network transmission. Recent research has focused on using semi-supervised learning mechanisms to identify abnormal network traffic, in order to deal with the labeled and unlabeled data found in industry. However, training on and classifying network traffic in real time pose challenges, as they can lead to degradation of the overall dataset and difficulties in preventing attacks. Additionally, existing semi-supervised learning research may lack a comprehensive analysis of experimental results. This paper proposes XA-GANomaly, a novel technique for explainable adaptive semi-supervised learning that uses GANomaly, an image anomaly detection model, and dynamically trains on small subsets to address these issues. First, this research introduces a deep neural network (DNN)-based GANomaly for semi-supervised learning. Second, this paper presents the proposed adaptive algorithm for the DNN-based GANomaly, which is validated with four subsets of the adaptive dataset. Finally, this study demonstrates a monitoring system that incorporates three explainable techniques (Shapley additive explanations, reconstruction error visualization, and t-distributed stochastic neighbor embedding) to respond effectively to attacks on traffic data at each stage: feature engineering, semi-supervised learning, and adaptive learning. Compared to other single-class classification techniques, the proposed DNN-based GANomaly achieves higher scores on the Network Security Laboratory-Knowledge Discovery in Databases and UNSW-NB15 datasets, by 13% and 8% in F1-score and by 4.17% and 11.51% in accuracy, respectively. Furthermore, experiments with the proposed adaptive learning reveal mostly improved results over the initial values. An analysis and monitoring system based on the combination of the three explainable methodologies is also described. Thus, the proposed method has the potential to be applied in practical industry settings, and future research will explore handling unbalanced real-time datasets in various scenarios.
In the upcoming large-scale Internet of Things (IoT), it is increasingly challenging to defend against malicious traffic, due to the heterogeneity of IoT devices and the diversity of IoT communication protocols. In this paper, we propose a semi-supervised learning-based approach to detect malicious traffic at the access side. It overcomes the resource-bottleneck problem of traditional malicious traffic defenders, which are deployed at the victim side, and it is also free of labeled traffic data in model training. Specifically, we design a coarse-grained behavior model of IoT devices by self-supervised learning with unlabeled traffic data. Then, we fine-tune this model to improve its accuracy in malicious traffic detection by adopting a transfer learning method that uses a small amount of labeled data. Experimental results show that our method can achieve an accuracy of 99.52% and an F1-score of 99.52% with only 1% of the labeled training data on the CICDDoS2019 dataset. Moreover, our method outperforms state-of-the-art supervised learning-based methods in terms of accuracy, precision, recall, and F1-score with 1% of the training data.
Through semi-supervised learning and knowledge inheritance, a novel Takagi-Sugeno-Kang (TSK) fuzzy system framework is proposed for epilepsy data classification in this study. The new method uses the maximum mean discrepancy (MMD) method and the TSK fuzzy system as the basic model for the classification of epilepsy data. First, for medical data, the interpretability of TSK fuzzy systems can ensure that the prediction results are traceable and safe. Second, in view of the deviation in data distribution between the real source domain and the target domain, MMD is used to measure the distance between different data distributions. The objective function is constructed according to the MMD distance, and the distribution distance between different datasets is minimized to find their similar characteristics. We introduce semi-supervised learning to further explore the relationships in the data. Based on the MMD method, a semi-supervised learning (SSL)-MMD method is constructed that uses pseudo-labels to align the data distributions of the same category. In addition, the idea of knowledge dissemination is used to learn pseudo-labels as additional data features. Finally, for epilepsy classification, the cross-domain TSK fuzzy system uses the cross-entropy function as the objective function and adopts the back-propagation strategy to optimize the parameters. The experimental results show that the new method can process complex epilepsy data and identify whether patients have epilepsy.
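The distribution distance that this abstract minimizes, the maximum mean discrepancy, can be sketched in a few lines. The abstract does not state the kernel or the estimator, so the Gaussian (RBF) kernel, the biased squared-MMD estimator, and the names `rbf`, `mmd2`, and `gamma` below are assumptions made for illustration.

```python
import math


def rbf(x, y, gamma=1.0):
    """Gaussian kernel between two feature vectors (kernel choice is an assumption)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def mmd2(source, target, gamma=1.0):
    """Biased estimate of the squared MMD between two sample sets."""
    def mean_k(A, B):
        return sum(rbf(a, b, gamma) for a in A for b in B) / (len(A) * len(B))
    return (mean_k(source, source) + mean_k(target, target)
            - 2.0 * mean_k(source, target))


# Toy source domain and two candidate target domains (illustrative values).
src = [[0.0, 0.0], [0.1, 0.0]]
near = [[0.0, 0.1], [0.1, 0.1]]
far = [[3.0, 3.0], [3.1, 3.0]]

d_near = mmd2(src, near)
d_far = mmd2(src, far)
```

A smaller value means the two sample sets look more alike under the kernel, which is why minimizing this quantity pulls the source and target feature distributions together.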
Malaria is a lethal disease responsible for thousands of deaths worldwide every year. Manual methods of malaria diagnosis are time-consuming and require a great deal of human expertise and effort. Computer-based automated diagnosis of diseases is becoming progressively more popular. Although deep learning models show high performance in the medical field, they demand a large volume of training data, which is hard to acquire for medical problems. Similarly, medical images can be labeled only with the help of medical experts. Several recent studies have used deep learning models to develop efficient malaria diagnostic systems, with promising results. However, the most common problem with these models is that they need a large amount of data for training. This paper presents a computer-aided malaria diagnosis system that combines a semi-supervised generative adversarial network and transfer learning. The proposed model is trained in a semi-supervised manner and requires less training data than conventional deep learning models. The proposed model is evaluated on a publicly available dataset of blood smear images (with malaria-infected and normal classes) and achieves a classification accuracy of 96.6%.
Recent state-of-the-art semi-supervised learning (SSL) methods usually use data augmentations as core components. Such methods, however, are limited to simple transformations, such as augmentations under the instance's naive representations or augmentations under the instance's semantic representations. To tackle this problem, we offer a unique insight into data augmentations and propose a novel data-augmentation-based semi-supervised learning method, called Attentive Neighborhood Feature Augmentation (ANFA). The motivation of our method lies in the observation that the relationship between a given feature and its neighborhood may contribute to constructing more reliable transformations for the data, and further help the classifier distinguish ambiguous features in low-density regions. Specifically, we first project the labeled and unlabeled data points into an embedding space and then construct a neighbor graph that serves as a similarity measure based on similar representations in the embedding space. Then, we employ an attention mechanism to transform the target features into augmented ones based on the neighbor graph. Finally, we formulate a novel semi-supervised loss by encouraging the predictions for interpolations of augmented features to be consistent with the corresponding interpolations of the predictions for the target features. We carried out experiments on the SVHN and CIFAR-10 benchmark datasets, and the experimental results demonstrate that our method outperforms the state-of-the-art methods when the number of labeled examples is limited.
Software testing courses are characterized by strong practicality, comprehensiveness, and diversity. Because students differ and personalized solutions must be designed for their specific requirements, the design of existing software testing courses fails to meet the demand for personalized learning. Knowledge graphs, with their rich semantics and good visualization effects, have a wide range of applications in the field of education. In response to the current problem that software testing courses fail to meet the need for personalized learning, this paper offers a learning path recommendation method based on knowledge graphs to provide personalized learning paths for students.
Online review platforms are becoming increasingly popular, encouraging dishonest merchants and service providers to deceive customers by creating fake reviews for their goods or services. Using Sybil accounts, bot farms, and purchased real accounts, unethical actors demonize rivals and advertise their own goods. For years, most academic and industry efforts have been aimed at detecting fake or fraudulent product and service evaluations. The primary hurdle to identifying fraudulent reviews is the lack of a reliable means of distinguishing them from real ones. This paper adopts a semi-supervised machine learning method to detect fake reviews on any website, among other things. Online reviews are classified using a semi-supervised approach (PU-learning), since labeled data is scarce and the reviews are dynamic. Then, classification is performed using the machine learning techniques Support Vector Machine (SVM) and Naive Bayes. The performance of the suggested system is compared with standard works, and the experimental findings are assessed using several evaluation metrics.
In recent years, deep learning methods have developed rapidly and found application in many fields, including natural language processing. In the field of aspect-level sentiment analysis, deep learning methods can also greatly improve the performance of models. However, previous studies did not take into account the relationship between user feature extraction and contextual terms. To address this issue, we combine data feature extraction and deep learning to develop an aspect-level sentiment analysis method. Specifically, we design user comment feature extraction (UCFE) to distill salient features from users' historical comments and transform them into representative user feature vectors. Then, the aspect-sentence graph convolutional neural network (ASGCN) is used to incorporate innovative techniques for calculating adjacency matrices; meanwhile, ASGCN emphasizes capturing nuanced semantics within the relationships among aspect words and syntactic dependency types. Afterward, three embedding methods are devised to embed the user feature vector into the ASGCN model. Empirical validation verifies the effectiveness of these models, which consistently surpass conventional benchmarks and reaffirm the indispensable role of deep learning in advancing sentiment analysis methodologies.
Contrastive self-supervised representation learning on attributed graph networks with Graph Neural Networks has attracted considerable research interest recently. However, two challenges remain. First, most real-world systems are multi-relational: entities are linked by different types of relations, and each relation is a view of the graph network. Second, the rich multi-scale information (structure-level and feature-level) of the graph network can be seen as self-supervised signals, which are not fully exploited. A novel contrastive self-supervised representation learning framework on attributed multiplex graph networks with multi-scale information (named CoLM^(2)S) is presented in this study. It mainly contains two components: intra-relation contrastive learning and inter-relation contrastive learning. Specifically, the contrastive self-supervised representation learning framework on attributed single-layer graph networks with multi-scale information (CoLMS), which uses a graph convolutional network as the encoder to capture intra-relation information with multi-scale structure-level and feature-level self-supervised signals, is introduced first. The structure-level information includes the edge structure and the sub-graph structure, and the feature-level information represents the output of different graph convolutional layers. Second, according to the consensus assumption among inter-relations, the CoLM^(2)S framework is proposed to jointly learn the various graph relations in an attributed multiplex graph network to achieve a global consensus node embedding. The proposed method can fully distil the graph information. Extensive experiments on unsupervised node clustering and graph visualisation tasks demonstrate the effectiveness of our methods, which outperform existing competitive baselines.
Cross-project software defect prediction (CPDP) aims to enhance defect prediction in target projects with limited or no historical data by leveraging information from related source projects. Existing CPDP approaches rely on static metrics or dynamic syntactic features, which have shown limited effectiveness in CPDP due to their inability to capture higher-level system properties that are important for CPDP, such as complex design patterns, relationships between multiple functions, and dependencies in different software projects. This paper introduces a novel approach, a graph-based feature learning model for CPDP (GB-CPDP), that utilizes NetworkX to extract features and learn representations of program entities from control flow graphs (CFGs) and data dependency graphs (DDGs). These graphs capture the structural and data dependencies within the source code. The proposed approach employs Node2Vec to transform CFGs and DDGs into numerical vectors and leverages Long Short-Term Memory (LSTM) networks to learn predictive models. The process involves graph construction, feature learning through graph embedding and LSTM, and defect prediction. Experimental evaluation using nine open-source Java projects from the PROMISE dataset demonstrates that GB-CPDP outperforms state-of-the-art CPDP methods in terms of F1-measure and Area Under the Curve (AUC). The results showcase the effectiveness of GB-CPDP in improving the performance of cross-project defect prediction.
With the rapid development of 5G communications, edge intelligence enables the Internet of Vehicles (IoV) to provide traffic forecasting that alleviates traffic congestion and improves users' quality of experience at the same time. To enhance forecasting performance, a novel edge-enabled probabilistic graph structure learning model (PGSLM) is proposed, which learns the graph structure and parameters from the edge sensing information and a discrete probability distribution on the edges of the traffic road network. To obtain the spatio-temporal dependencies of traffic data, the learned dynamic graphs are combined with a predefined static graph to generate the graph convolution part of the recurrent graph convolution module. During training, a new graph training loss is introduced, which is composed of the K nearest neighbor (KNN) graph constructed from the traffic feature tensors and the graph structure. Detailed experimental results show that, compared with existing models, the proposed PGSLM improves traffic prediction performance in terms of mean absolute error and root mean square error in IoV.
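The KNN graph that this abstract uses as a reference structure in its graph training loss can be built as in the sketch below. The abstract does not say which distance is used or how the loss compares the graphs, so the squared Euclidean distance, the directed 0/1 adjacency, and the names `knn_graph` and `sensors` are assumptions.

```python
def knn_graph(features, k):
    """Directed KNN adjacency: adj[i][j] = 1 if j is among the k nearest nodes to i.

    Assumption: nearness is measured by squared Euclidean distance between
    per-node feature vectors.
    """
    n = len(features)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        # Sort all other nodes by distance to node i (ties broken by index).
        dists = sorted(
            (sum((a - b) ** 2 for a, b in zip(features[i], features[j])), j)
            for j in range(n) if j != i
        )
        for _, j in dists[:k]:
            adj[i][j] = 1
    return adj


# One toy feature per road sensor (illustrative values only).
sensors = [[0.0], [1.0], [10.0], [11.0]]
adj = knn_graph(sensors, k=1)
```

A learned graph structure can then be compared against this adjacency, for example entry by entry, so that the learned edges stay close to the similarity pattern in the traffic features.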
With the popularity of online learning in educational settings, knowledge tracing (KT) plays an increasingly significant role. The task of KT is to help students learn more effectively by predicting their next mastery of knowledge based on their historical exercise sequences. Many related works have emerged in this field, such as Bayesian knowledge tracing and deep knowledge tracing methods. Despite the progress that has been made in KT, existing techniques still have the following limitations: 1) Previous studies address KT by exploring only the sparse observational data distribution, and the counterfactual data distribution has been largely ignored. 2) Current works designed for KT consider only either the entity relationships between questions and concepts or the relations between two concepts, and none of them investigates the relations among students, questions, and concepts simultaneously, leading to inaccurate student modeling. To address the above limitations, we propose a graph counterfactual augmentation method for knowledge tracing. Concretely, to consider the multiple relationships among different entities, we first unify students, questions, and concepts in graphs, and then leverage a heterogeneous graph convolutional network to conduct representation learning. To model the counterfactual world, we conduct counterfactual transformations on students' learning graphs by changing the corresponding treatments and then exploit the counterfactual outcomes in a contrastive learning framework. We conduct extensive experiments on three real-world datasets, and the experimental results demonstrate the superiority of our proposed Graph CA method compared with several state-of-the-art baselines.
Traditional malware research mainly takes recognition and detection as its breakthrough point, without focusing on propagation trends or predicting the subsequently infected nodes. The complexity of network structure, the diversity of network nodes, and the sparsity of data all pose difficulties in predicting propagation. This paper proposes a malware propagation prediction model based on representation learning and Graph Convolutional Networks (GCN) to address these problems. First, to solve the inaccurate calculation of infection intensity caused by the sparsity of node interaction behavior data in the malware propagation network, a tensor-based mechanism for mining the infection intensity among nodes is proposed, which retains the network structure information. The influence of the relationships between nodes on the infection intensity is also analyzed. Second, given the diversity and complexity of the content and structure of infected and normal nodes in the network, and considering the advantages of representation learning in data feature extraction, a corresponding representation learning method is adopted for the characteristics of infection intensity among nodes. This can efficiently calculate the relationship between entities and relations in a low-dimensional space, achieving a low-dimensional, dense, and real-valued representation of the characteristics of the propagation spatial data. We also design a new method, Tensor2vec, to learn the potential structural features of malware propagation. Finally, considering the convolution ability of GCN for non-Euclidean data, we propose a dynamic prediction model of malware propagation based on representation learning and GCN to solve the timeliness problem of the malware propagation carrier. The experimental results show that the proposed model can effectively predict the behaviors of the nodes in the network and discover the influence of different node characteristics on the malware propagation situation.
Attacks on cyberspace have been growing exponentially in recent times. Illegal penetrations and breaches are real threats to individuals and organizations. Conventional security systems are good enough to detect known threats, but they fail when it comes to Advanced Persistent Threats (APTs). APTs are targeted, more sophisticated, and very persistent, and they incorporate many evasive techniques to bypass existing defenses. Hence, there is a need for an effective defense system that can achieve complete security reliability. To address these issues, this paper proposes a novel honeypot system that tracks the anonymous behavior of APTs. The key idea of the honeypot is to leverage concepts from graph theory to detect such targeted attacks. The proposed honeypot is self-realizing and strategy-assisted; it withholds the APTs' actionable techniques and observes their behavior for analysis and modelling. The proposed graph-theory-based self-learning honeypot, using the results γ(C(n,1)), γc(C(n,1)), γsc(C(n,1)), outperforms traditional techniques by detecting APT behavior with a detection rate of 96%.
The increasing adoption of renewable energy has posed challenges for voltage regulation in power distribution networks. Grid-aware energy management, which includes the control of smart inverters and energy management systems, is a trending way to mitigate this problem. However, existing multi-agent reinforcement learning methods for grid-aware energy management have not sufficiently considered the importance of agent cooperation and the unique characteristics of the grid, which leads to limited performance. In this study, we propose a new approach, named the multi-agent hierarchical graph attention reinforcement learning framework (MAHGA), to stabilize the voltage. Specifically, under the paradigm of centralized training and decentralized execution, we model the power distribution network as a novel hierarchical graph containing the agent-level topology and the bus-level topology. Then a hierarchical graph attention model is devised to capture the complex correlations between agents. Moreover, we incorporate graph contrastive learning as an auxiliary task in the reinforcement learning process to improve representation learning from graphs. Experiments on several real-world scenarios reveal that our approach achieves the best performance and can remarkably reduce the number of voltage violations.
Abstract: This paper aims to broaden the application of the Stochastic Configuration Network (SCN) to the semi-supervised domain by exploiting the unlabeled data that is common in daily life. Doing so can improve the classification accuracy of decentralized SCN algorithms while effectively protecting user privacy. To this end, we propose a decentralized semi-supervised learning algorithm for SCN, called DMT-SCN, which introduces teacher and student models and combines them with the idea of consistency regularization to improve the response speed of model iterations. To reduce the possible negative impact of unlabeled data on the model, we deliberately change the way noise is added to the unlabeled data. Simulation results show that the algorithm can effectively use unlabeled data to improve the classification accuracy of SCN training and is robust under different ground simulation environments.
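The teacher-student idea with consistency regularization can be illustrated with a minimal sketch. It assumes a mean-teacher style update, in which the teacher weights are an exponential moving average of the student weights; the abstract does not state this detail, and the names `ema_update`, `consistency_loss`, and `decay` are hypothetical.

```python
def ema_update(teacher, student, decay=0.99):
    """Move each teacher weight toward the matching student weight (assumed EMA rule)."""
    return [decay * t + (1.0 - decay) * s for t, s in zip(teacher, student)]


def consistency_loss(student_probs, teacher_probs):
    """Mean squared difference between two class-probability vectors."""
    n = len(student_probs)
    return sum((s - t) ** 2 for s, t in zip(student_probs, teacher_probs)) / n


# Toy values: two weights and a two-class prediction on one unlabeled sample.
teacher = ema_update([0.0, 1.0], [1.0, 3.0], decay=0.5)
loss = consistency_loss([0.8, 0.2], [0.6, 0.4])
print(teacher, loss)
```

In a training loop, the consistency term would be added to the supervised loss so that the student is penalized when its prediction on a noisy unlabeled sample drifts away from the teacher's prediction.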
Abstract: Using resting-state functional magnetic resonance imaging (fMRI) to assist in identifying brain diseases has great potential. Graph-based models have been widely used for this purpose, where the graph represents the similarity between patients or between brain regions of interest. In these models, constructing high-quality graphs is of paramount importance. Researchers have proposed various methods for constructing graphs from different perspectives, among which the simplest and most popular is Pearson Correlation (PC). Although existing methods have achieved significant results, these graphs are usually fixed once they are constructed and are generally handled separately from the downstream task. Such a separation may result in neither the constructed graph nor the extracted features being ideal. To solve this problem, we use the graph-optimized locality preserving projection algorithm to extract features and learn the population graph simultaneously, aiming at higher identification accuracy through a task-dependent automatic optimization of the graph. At the same time, we incorporate supervised information to enable more flexible modelling. Specifically, the proposed method first uses PC to construct a graph as the initial feature for each subject. Then, the projection matrix and the graph are iteratively optimized through graph-optimized locality preserving projections based on semi-supervised learning, which fully employs the knowledge in various transformation spaces. Finally, the obtained projection matrix is applied to construct the subject-level graph, and classification is performed using support vector machines. To verify the effectiveness of the proposed method, we conduct experiments to distinguish subjects with mild cognitive impairment (MCI) and autism spectrum disorder (ASD) from normal controls (NCs), and the results show that the classification performance of our method is better than that of the baseline method.
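The first step, building a graph from Pearson correlations, can be sketched as follows. This is only an illustration with made-up signals; the function names `pearson` and `pc_graph` and the toy data are assumptions, not taken from the paper.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length series (undefined for constant series)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def pc_graph(series):
    """Weighted adjacency matrix whose entry (i, j) is the correlation of series i and j."""
    k = len(series)
    return [[1.0 if i == j else pearson(series[i], series[j]) for j in range(k)]
            for i in range(k)]


# Hypothetical time series for three regions of interest.
roi_signals = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],
    [4.0, 3.0, 2.0, 1.0],
]
graph = pc_graph(roi_signals)
```

Here the first two regions are perfectly correlated and the third is perfectly anti-correlated with both, so the off-diagonal entries are close to 1 or -1.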
Funding: Supported by the Open Project of Xiangjiang Laboratory (22XJ02003), the Scientific Project of the National University of Defense Technology (NUDT) (ZK21-07, 23-ZZCX-JDZ-28), the National Science Fund for Outstanding Young Scholars (62122093), and the National Natural Science Foundation of China (72071205).
Abstract: Traditional expert-designed branching rules in branch-and-bound (B&B) are static and often fail to adapt to diverse and evolving problem instances. Crafting these rules is labor-intensive and may not scale well to complex problems. Given the frequent need to solve varied combinatorial optimization problems, leveraging statistical learning to auto-tune B&B algorithms for specific problem classes becomes attractive. This paper proposes a graph pointer network model to learn the branching rules. Graph features, global features, and historical features are designed to represent the solver state. The graph neural network processes the graph features, while the pointer mechanism assimilates the global and historical features to finally determine the variable on which to branch. The model is trained to imitate the expert strong branching rule using a tailored top-k Kullback-Leibler divergence loss function. Experiments on a series of benchmark problems demonstrate that the proposed approach significantly outperforms the widely used expert-designed branching rules. It also outperforms state-of-the-art machine-learning-based branch-and-bound methods in terms of solving speed and search tree size on all test instances. In addition, the model can generalize to unseen instances and scale to larger instances.
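One possible reading of a top-k Kullback-Leibler divergence loss is sketched below. The abstract does not define the loss, so the construction here is an assumption: the candidates are restricted to the k variables that the expert scores highest, both score vectors are turned into distributions with a softmax, and the KL divergence from the expert to the model is computed. The names `softmax` and `topk_kl_loss` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def topk_kl_loss(expert_scores, model_scores, k):
    """KL(expert || model) restricted to the expert's k highest-scored candidates."""
    top = sorted(range(len(expert_scores)),
                 key=lambda i: expert_scores[i], reverse=True)[:k]
    p = softmax([expert_scores[i] for i in top])   # expert distribution
    q = softmax([model_scores[i] for i in top])    # model distribution
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


# Hypothetical strong-branching scores and model scores for four candidate variables.
expert = [3.0, 1.0, 0.5, 2.0]
model = [0.0, 1.0, 2.0, 3.0]
loss = topk_kl_loss(expert, model, k=2)
```

The loss is zero when the model ranks and weights the expert's top candidates exactly as the expert does, and grows as the two distributions diverge.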
Funding: Researchers Supporting Project Number (RSPD2024R 553), King Saud University, Riyadh, Saudi Arabia.
Abstract: Wheat is a critical crop, extensively consumed worldwide, and enhancing its production is essential to meet escalating demand. Diseases such as stem rust, leaf rust, yellow rust, and tan spot significantly diminish wheat yield, making the early and precise identification of these diseases vital for effective disease management. With advancements in deep learning algorithms, researchers have proposed many methods for the automated detection of disease pathogens; however, accurately detecting multiple disease pathogens simultaneously remains a challenge. This challenge arises from the scarcity of RGB images for multiple diseases, class imbalance in existing public datasets, and the difficulty of extracting features that discriminate between multiple classes of disease pathogens. In this research, a novel method based on Transfer Generative Adversarial Networks is proposed for augmenting existing data, thereby overcoming the problems of class imbalance and data scarcity. This study proposes a customized Vision Transformer (ViT) architecture, where the feature vector is obtained by concatenating features extracted from the custom ViT and from Graph Neural Networks. This paper also proposes a Model-Agnostic Meta-Learning (MAML) based ensemble classifier for accurate classification. The proposed model, validated on public datasets for wheat disease pathogen classification, achieved a test accuracy of 99.20% and an F1-score of 97.95%. Compared with existing state-of-the-art methods, the proposed model performs better in terms of accuracy, F1-score, and the number of disease pathogens detected. In the future, more diseases can be included for detection, along with other targets such as pests and weeds.
Funding: Supported by the Innovation Talents Promotion Program of Shaanxi Province, China (No. 2021TD08).
Abstract: Radio frequency fingerprinting (RFF) is a remarkable lightweight authentication scheme that supports rapid and scalable identification in Internet of Things (IoT) systems. Deep learning (DL) is a critical enabler of RFF identification because it leverages hardware-level features. However, traditional supervised learning methods require a huge number of labeled training samples. Therefore, establishing a high-performance supervised learning model with few labels in practical applications is still challenging. To address this issue, we propose a novel RFF semi-supervised learning (RFFSSL) model that can obtain better performance with few meta labels. Specifically, the proposed RFFSSL model consists of a teacher-student network, in which the student network learns from the pseudo labels predicted by the teacher. Then, the output of the student model is exploited to improve the performance of the teacher on the labeled data. Furthermore, a comprehensive evaluation of the accuracy is conducted. We obtain about 50 GB of raw signal data from real long-term evolution (LTE) mobile phones, which are used to evaluate various models. Experimental results demonstrate that the proposed RFFSSL scheme can achieve up to 97% testing accuracy in a noisy environment with only 10% labeled samples when the number of training samples equals 2700.
Funding: Supported by a Korea Institute for Advancement of Technology (KIAT) grant funded by the Korea Government (MOTIE) (P0008703, The Competency Development Program for Industry Specialist).
Abstract: Intrusion detection involves identifying unauthorized network activity and recognizing whether the data constitute an abnormal network transmission. Recent research has focused on using semi-supervised learning mechanisms to identify abnormal network traffic in order to deal with the labeled and unlabeled data found in industry. However, training on and classifying network traffic in real time pose challenges, as they can lead to the degradation of the overall dataset and to difficulties in preventing attacks. Additionally, existing semi-supervised learning research often lacks a comprehensive analysis of experimental results. This paper proposes XA-GANomaly, a novel technique for explainable adaptive semi-supervised learning using GANomaly, an image anomaly detection model, which dynamically trains on small subsets to address these issues. First, this research introduces a deep neural network (DNN)-based GANomaly for semi-supervised learning. Second, this paper presents the proposed adaptive algorithm for the DNN-based GANomaly, which is validated with four subsets of the adaptive dataset. Finally, this study demonstrates a monitoring system that incorporates three explainable techniques (Shapley additive explanations, reconstruction error visualization, and t-distributed stochastic neighbor embedding) to respond effectively to attacks on traffic data at each stage: feature engineering, semi-supervised learning, and adaptive learning. Compared with other single-class classification techniques, the proposed DNN-based GANomaly achieves higher scores on the Network Security Laboratory-Knowledge Discovery in Databases (NSL-KDD) and UNSW-NB15 datasets, improving F1 scores by 13% and 8% and accuracy by 4.17% and 11.51%, respectively. Furthermore, experiments with the proposed adaptive learning reveal mostly improved results over the initial values. An analysis and monitoring system based on the combination of the three explainable methodologies is also described. Thus, the proposed method has the potential to be applied in practical industry settings, and future research will explore handling unbalanced real-time datasets in various scenarios.
Funding: Supported in part by the National Key R&D Program of China under Grant 2018YFA0701601, in part by the National Natural Science Foundation of China (Grant Nos. U22A2002, 61941104, 62201605), and in part by the Tsinghua University-China Mobile Communications Group Co., Ltd. Joint Institute.
Abstract: In the upcoming large-scale Internet of Things (IoT), it is increasingly challenging to defend against malicious traffic because of the heterogeneity of IoT devices and the diversity of IoT communication protocols. In this paper, we propose a semi-supervised learning-based approach to detect malicious traffic at the access side. It overcomes the resource-bottleneck problem of traditional malicious traffic defenders, which are deployed at the victim side, and it does not depend on large amounts of labeled traffic data for model training. Specifically, we design a coarse-grained behavior model of IoT devices by self-supervised learning with unlabeled traffic data. Then, we fine-tune this model to improve its accuracy in malicious traffic detection by adopting a transfer learning method that uses a small amount of labeled data. Experimental results show that our method achieves an accuracy of 99.52% and an F1-score of 99.52% with only 1% of the labeled training data on the CICDDoS2019 dataset. Moreover, our method outperforms state-of-the-art supervised learning-based methods in terms of accuracy, precision, recall, and F1-score with 1% of the training data.
Funding: Supported by the Fifth Key Project of Jiangsu Vocational Education Teaching Reform Research under Grant ZZZ13 and in part by the Science and Technology Project of Changzhou City under Grant CE20215032.
Abstract: In this study, a novel Takagi-Sugeno-Kang (TSK) fuzzy system framework based on semi-supervised learning and knowledge inheritance is proposed for epilepsy data classification. The new method uses the maximum mean discrepancy (MMD) method and the TSK fuzzy system as a basic model for the classification of epilepsy data. First, for medical data, the interpretability of TSK fuzzy systems can ensure that the prediction results are traceable and safe. Second, in view of the deviation in data distribution between the real source domain and the target domain, MMD is used to measure the distance between different data distributions. The objective function is constructed according to the MMD distance, and the distribution distance between different datasets is minimized to find the characteristics that the datasets share. We introduce semi-supervised learning to further explore the relationships within the data. Based on the MMD method, a semi-supervised learning (SSL)-MMD method is constructed that uses pseudo-labels to align the data distributions of the same category. In addition, the idea of knowledge dissemination is used to learn pseudo-labels as additional data features. Finally, for epilepsy classification, the cross-domain TSK fuzzy system uses the cross-entropy function as the objective function and adopts the back-propagation strategy to optimize the parameters. The experimental results show that the new method can process complex epilepsy data and identify whether patients have epilepsy.
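The use of MMD as a distance between the source and target distributions can be sketched with the standard biased estimator of squared MMD. The Gaussian (RBF) kernel, the `gamma` parameter, and the toy samples are assumptions chosen for illustration; the abstract does not say which kernel the authors use.

```python
import math


def rbf(x, y, gamma=1.0):
    """Gaussian kernel between two feature vectors (assumed kernel choice)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def mmd2(source, target, gamma=1.0):
    """Biased estimate of the squared maximum mean discrepancy between two samples."""
    def mean_kernel(a, b):
        return sum(rbf(x, y, gamma) for x in a for y in b) / (len(a) * len(b))

    return (mean_kernel(source, source) + mean_kernel(target, target)
            - 2.0 * mean_kernel(source, target))


# Hypothetical two-dimensional features from a source and a target domain.
source = [[0.0, 0.0], [1.0, 0.0]]
target = [[5.0, 5.0], [6.0, 5.0]]
print(mmd2(source, source), mmd2(source, target))
```

Identical samples give a value of zero, while well-separated samples give a clearly positive value, which is what makes the quantity usable as a term to minimize in an objective function.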
Funding: The publication of this article is funded by the Qatar National Library.
Abstract: Malaria is a lethal disease responsible for thousands of deaths worldwide every year. Manual methods of malaria diagnosis are time-consuming and require a great deal of human expertise and effort. Computer-based automated diagnosis of diseases is becoming progressively more popular. Although deep learning models show high performance in the medical field, they demand a large volume of training data, which is hard to acquire for medical problems. Similarly, medical images can be labeled only with the help of medical experts. Several recent studies have used deep learning models to develop efficient malaria diagnostic systems, which showed promising results. However, the most common problem with these models is that they need a large amount of data for training. This paper presents a computer-aided malaria diagnosis system that combines a semi-supervised generative adversarial network and transfer learning. The proposed model is trained in a semi-supervised manner and requires less training data than conventional deep learning models. The performance of the proposed model is evaluated on a publicly available dataset of blood smear images (with malaria-infected and normal classes), on which it achieved a classification accuracy of 96.6%.
Funding: Supported by the National Natural Science Foundation of China (Nos. 62072127, 62002076, 61906049), the Natural Science Foundation of Guangdong Province (Nos. 2023A1515011774, 2020A1515010423), Project 6142111180404 supported by CNKLSTISS, the Science and Technology Program of Guangzhou, China (No. 202002030131), the Guangdong Basic and Applied Basic Research Fund Joint Fund Youth Fund (No. 2019A1515110213), the Open Fund Project of Fujian Provincial Key Laboratory of Information Processing and Intelligent Control (Minjiang University) (No. MJUKF-IPIC202101), and the Scientific Research Project for Guangzhou University (No. RP2022003).
Abstract: Recent state-of-the-art semi-supervised learning (SSL) methods usually use data augmentations as core components. Such methods, however, are limited to simple transformations, such as augmentations under the instance's naive representations or augmentations under the instance's semantic representations. To tackle this problem, we offer a unique insight into data augmentations and propose a novel data-augmentation-based semi-supervised learning method called Attentive Neighborhood Feature Augmentation (ANFA). The motivation of our method lies in the observation that the relationship between a given feature and its neighborhood may help construct more reliable transformations of the data and further help the classifier distinguish ambiguous features in low-density regions. Specifically, we first project the labeled and unlabeled data points into an embedding space and then construct a neighbor graph that serves as a similarity measure based on similar representations in the embedding space. Then, we employ an attention mechanism to transform the target features into augmented ones based on the neighbor graph. Finally, we formulate a novel semi-supervised loss by encouraging the predictions on interpolations of augmented features to be consistent with the corresponding interpolations of the predictions on the target features. We carried out experiments on the SVHN and CIFAR-10 benchmark datasets, and the experimental results demonstrate that our method outperforms state-of-the-art methods when the number of labeled examples is limited.
Funding: Supported by the Special Funds for Basic Research of Central Universities (D5000220240) and the Special Funds for Education and Teaching Reform in 2023 (06410-23GZ230102).
Abstract: Software testing courses are characterized by strong practicality, comprehensiveness, and diversity. Because students differ from one another and need personalized solutions for their specific requirements, the design of existing software testing courses fails to meet the demand for personalized learning. Knowledge graphs, with their rich semantics and good visualization effects, have a wide range of applications in the field of education. In response to this failure of current software testing courses to meet the need for personalized learning, this paper offers a learning path recommendation method based on knowledge graphs that provides personalized learning paths for students.
Abstract: Online review platforms are becoming increasingly popular, which encourages dishonest merchants and service providers to deceive customers by creating fake reviews for their goods or services. Using Sybil accounts, bot farms, and purchased real accounts, unscrupulous actors disparage rivals and advertise their own goods. For years, most academic and industry efforts have been aimed at detecting fake or fraudulent product and service evaluations. The primary hurdle to identifying fraudulent reviews is the lack of a reliable means of distinguishing them from real ones. This paper adopts a semi-supervised machine learning method to detect fake reviews on any website. Online reviews are classified using a semi-supervised approach (PU-learning) because labeled data are scarce and the reviews are dynamic. Classification is then performed using the machine learning techniques Support Vector Machine (SVM) and Naive Bayes. The performance of the suggested system has been compared with standard works, and the experimental findings are assessed using several evaluation metrics.
Funding: This work is partly supported by the Fundamental Research Funds for the Central Universities (CUC230A013), partly by the Natural Science Foundation of Beijing Municipality (No. 4222038), and also by the National Natural Science Foundation of China (Grant No. 62176240).
Abstract: In recent years, deep learning methods have developed rapidly and found application in many fields, including natural language processing. In the field of aspect-level sentiment analysis, deep learning methods can also greatly improve the performance of models. However, previous studies did not take into account the relationship between user feature extraction and contextual terms. To address this issue, we combine data feature extraction with deep learning to develop an aspect-level sentiment analysis method. Specifically, we design user comment feature extraction (UCFE) to distill salient features from users' historical comments and transform them into representative user feature vectors. Then, the aspect-sentence graph convolutional neural network (ASGCN) is used to incorporate innovative techniques for calculating adjacency matrices; meanwhile, ASGCN emphasizes capturing nuanced semantics within relationships among aspect words and syntactic dependency types. Afterward, three embedding methods are devised to embed the user feature vector into the ASGCN model. Empirical validation verifies the effectiveness of these models, which consistently surpass conventional benchmarks and reaffirm the indispensable role of deep learning in advancing sentiment analysis methodologies.
Funding: Supported by the National Natural Science Foundation of China (NSFC) under grant number 61873274.
Abstract: Contrastive self-supervised representation learning on attributed graph networks with Graph Neural Networks has attracted considerable research interest recently. However, two challenges remain. First, most real-world systems are multi-relational: entities are linked by different types of relations, and each relation is a view of the graph network. Second, the rich multi-scale information (structure-level and feature-level) of the graph network can be seen as self-supervised signals, which are not fully exploited. This study presents a novel contrastive self-supervised representation learning framework on attributed multiplex graph networks with multi-scale information (named CoLM²S). It mainly contains two components: intra-relation contrastive learning and inter-relation contrastive learning. Specifically, a contrastive self-supervised representation learning framework on attributed single-layer graph networks with multi-scale information (CoLMS) is introduced first; it uses a graph convolutional network as the encoder to capture intra-relation information with multi-scale structure-level and feature-level self-supervised signals. The structure-level information includes the edge structure and the sub-graph structure, and the feature-level information represents the output of different graph convolutional layers. Second, according to the consensus assumption among inter-relations, the CoLM²S framework is proposed to jointly learn the various graph relations in an attributed multiplex graph network to achieve a global consensus node embedding. The proposed method can fully distil the graph information. Extensive experiments on unsupervised node clustering and graph visualisation tasks demonstrate the effectiveness of our methods, which outperform existing competitive baselines.
Funding: Supported by an Institute of Information & Communications Technology Planning & Evaluation (IITP) grant funded by the Korea government (MSIT) (No. RS-2022-00155885).
Abstract: Cross-project software defect prediction (CPDP) aims to enhance defect prediction in target projects with limited or no historical data by leveraging information from related source projects. The existing CPDP approaches rely on static metrics or dynamic syntactic features, which have shown limited effectiveness in CPDP due to their inability to capture higher-level system properties, such as complex design patterns, relationships between multiple functions, and dependencies in different software projects, that are important for CPDP. This paper introduces a novel approach, a graph-based feature learning model for CPDP (GB-CPDP), that utilizes NetworkX to extract features and learn representations of program entities from control flow graphs (CFGs) and data dependency graphs (DDGs). These graphs capture the structural and data dependencies within the source code. The proposed approach employs Node2Vec to transform CFGs and DDGs into numerical vectors and leverages Long Short-Term Memory (LSTM) networks to learn predictive models. The process involves graph construction, feature learning through graph embedding and LSTM, and defect prediction. Experimental evaluation using nine open-source Java projects from the PROMISE dataset demonstrates that GB-CPDP outperforms state-of-the-art CPDP methods in terms of F1-measure and Area Under the Curve (AUC). The results showcase the effectiveness of GB-CPDP in improving the performance of cross-project defect prediction.
Funding: Supported by the project of the National Natural Science Foundation of China (No. 61772562), the Knowledge Innovation Program of Wuhan-Basic Research (No. 2022010801010225), and the Fundamental Research Funds for the Central Universities (No. 2662022YJ012).
Abstract: With the rapid development of 5G communications, edge intelligence enables the Internet of Vehicles (IoV) to provide traffic forecasting that alleviates traffic congestion and simultaneously improves users' quality of experience. To enhance forecasting performance, a novel edge-enabled probabilistic graph structure learning model (PGSLM) is proposed, which learns the graph structure and parameters from the edge sensing information and a discrete probability distribution on the edges of the traffic road network. To obtain the spatio-temporal dependencies of traffic data, the learned dynamic graphs are combined with a predefined static graph to generate the graph convolution part of the recurrent graph convolution module. During the training process, a new graph training loss is introduced, which is composed of the K nearest neighbor (KNN) graph constructed from the traffic feature tensors and the graph structure. Detailed experimental results show that, compared with existing models, the proposed PGSLM improves traffic prediction performance in terms of mean absolute error and root mean square error in IoV.
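The KNN graph that enters the training loss can be illustrated with a small sketch that links each node to its k nearest neighbors in feature space. The squared Euclidean distance, the directed 0/1 adjacency, and the name `knn_graph` are assumptions made for the example; the abstract does not specify how the authors build the graph.

```python
def knn_graph(features, k):
    """Directed 0/1 adjacency matrix linking each node to its k nearest neighbors."""
    n = len(features)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        # Squared Euclidean distance from node i to every other node.
        dists = sorted(
            (sum((a - b) ** 2 for a, b in zip(features[i], features[j])), j)
            for j in range(n) if j != i
        )
        for _, j in dists[:k]:
            adj[i][j] = 1
    return adj


# Hypothetical one-dimensional traffic features for three road sensors.
sensor_features = [[0.0], [1.0], [10.0]]
adjacency = knn_graph(sensor_features, k=1)
print(adjacency)
```

With these toy values, the first two sensors point to each other and the third points to the second, so the resulting matrix is not symmetric; a symmetric variant could be obtained by taking the element-wise maximum with the transpose.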
Funding: Supported by the Natural Science Foundation of China (62372277), the Natural Science Foundation of Shandong Province (ZR2022MF257, ZR2022MF295), and the Humanities and Social Sciences Fund of the Ministry of Education (21YJC630157).
Abstract: With the popularity of online learning in educational settings, knowledge tracing (KT) plays an increasingly significant role. The task of KT is to help students learn more effectively by predicting their next mastery of knowledge based on their historical exercise sequences. Many related works have emerged in this field, such as Bayesian knowledge tracing and deep knowledge tracing methods. Despite the progress that has been made in KT, existing techniques still have the following limitations: 1) Previous studies address KT by exploring only the sparse observational data distribution, and the counterfactual data distribution has been largely ignored. 2) Current works designed for KT consider only either the entity relationships between questions and concepts or the relations between two concepts, and none of them investigates the relations among students, questions, and concepts simultaneously, leading to inaccurate student modeling. To address the above limitations, we propose a graph counterfactual augmentation method for knowledge tracing. Concretely, to consider the multiple relationships among different entities, we first unify students, questions, and concepts in graphs and then leverage a heterogeneous graph convolutional network to conduct representation learning. To model the counterfactual world, we conduct counterfactual transformations on students' learning graphs by changing the corresponding treatments and then exploit the counterfactual outcomes in a contrastive learning framework. We conduct extensive experiments on three real-world datasets, and the experimental results demonstrate the superiority of our proposed Graph CA method compared with several state-of-the-art baselines.
Funding: This research is partially supported by the National Natural Science Foundation of China (Grant No. 61772098), the Chongqing Technology Innovation and Application Development Project (Grant No. cstc2020jscxmsxmX0150), the Chongqing Science and Technology Innovation Leading Talent Support Program (CSTCCXLJRC201908), the Basic and Advanced Research Projects of CSTC (No. cstc2019jcyj-zdxmX0008), and the Science and Technology Research Program of Chongqing Municipal Education Commission (Grant No. KJZD-K201900605).
Abstract: Traditional malware research focuses mainly on recognition and detection, without attending to propagation trends or predicting which nodes will be infected next. The complexity of network structure, the diversity of network nodes, and the sparsity of data all make propagation difficult to predict. This paper proposes a malware propagation prediction model based on representation learning and Graph Convolutional Networks (GCN) to address these problems. First, to solve the problem of inaccurate infection intensity calculation caused by the sparsity of node interaction behavior data in the malware propagation network, a tensor-based mechanism for mining the infection intensity among nodes is proposed, which retains the network structure information. The influence of the relationship between nodes on the infection intensity is also analyzed. Second, given the diversity and complexity of the content and structure of infected and normal nodes in the network, and considering the advantages of representation learning in data feature extraction, a corresponding representation learning method is adopted for the characteristics of infection intensity among nodes. This method can efficiently calculate the relationship between entities and relations in a low-dimensional space, achieving a low-dimensional, dense, and real-valued representation of the characteristics of propagation spatial data. We also design a new method, Tensor2vec, to learn the potential structural features of malware propagation. Finally, considering the convolution ability of GCN on non-Euclidean data, we propose a dynamic prediction model of malware propagation based on representation learning and GCN to solve the time effectiveness problem of the malware propagation carrier. The experimental results show that the proposed model can effectively predict the behaviors of the nodes in the network and discover the influence of different node characteristics on the malware propagation situation.
Abstract: Attacks in cyberspace have grown exponentially in recent times. Illegal penetrations and breaches are real threats to individuals and organizations. Conventional security systems are good enough to detect known threats, but they fail against Advanced Persistent Threats (APTs). APTs are targeted, more sophisticated, and very persistent, and they incorporate many evasive techniques to bypass existing defenses. Hence, there is a need for an effective defense system that can achieve complete, reliable security. To address these issues, this paper proposes a novel honeypot system that tracks the anonymous behavior of APTs. The key idea of the honeypot is to leverage concepts from graph theory to detect such targeted attacks. The proposed honeypot is self-realizing and strategy-assisted; it withholds the APTs' actionable techniques and observes their behavior for analysis and modelling. The proposed graph-theory-based self-learning honeypot, which uses the results γ(C(n,1)), γ_c(C(n,1)), and γ_sc(C(n,1)), outperforms traditional techniques by detecting APT behavior with a detection rate of 96%.
Funding: Supported by the National Key R&D Program of China under Grant No. 2022ZD0119802 and the National Natural Science Foundation of China under Grant No. 61836011.
Abstract: The increasing adoption of renewable energy has posed challenges for voltage regulation in power distribution networks. Grid-aware energy management, which includes the control of smart inverters and energy management systems, is a popular way to mitigate this problem. However, existing multi-agent reinforcement learning methods for grid-aware energy management have not sufficiently considered the importance of agent cooperation and the unique characteristics of the grid, which leads to limited performance. In this study, we propose a new approach named the multi-agent hierarchical graph attention reinforcement learning framework (MAHGA) to stabilize the voltage. Specifically, under the paradigm of centralized training and decentralized execution, we model the power distribution network as a novel hierarchical graph containing the agent-level topology and the bus-level topology. Then, a hierarchical graph attention model is devised to capture the complex correlation between agents. Moreover, we incorporate graph contrastive learning as an auxiliary task in the reinforcement learning process to improve representation learning from graphs. Experiments on several real-world scenarios reveal that our approach achieves the best performance and can remarkably reduce the number of voltage violations.