Visual object tracking plays a crucial role in computer vision.In recent years,researchers have proposed various methods to achieve high-performance object tracking.Among these,methods based on Transformers have becom...Visual object tracking plays a crucial role in computer vision.In recent years,researchers have proposed various methods to achieve high-performance object tracking.Among these,methods based on Transformers have become a research hotspot due to their ability to globally model and contextualize information.However,current Transformer-based object tracking methods still face challenges such as low tracking accuracy and the presence of redundant feature information.In this paper,we introduce self-calibration multi-head self-attention Transformer(SMSTracker)as a solution to these challenges.It employs a hybrid tensor decomposition self-organizing multihead self-attention transformermechanism,which not only compresses and accelerates Transformer operations but also significantly reduces redundant data,thereby enhancing the accuracy and efficiency of tracking.Additionally,we introduce a self-calibration attention fusion block to resolve common issues of attention ambiguities and inconsistencies found in traditional trackingmethods,ensuring the stability and reliability of tracking performance across various scenarios.By integrating a hybrid tensor decomposition approach with a self-organizingmulti-head self-attentive transformer mechanism,SMSTracker enhances the efficiency and accuracy of the tracking process.Experimental results show that SMSTracker achieves competitive performance in visual object tracking,promising more robust and efficient tracking systems,demonstrating its potential to providemore robust and efficient tracking solutions in real-world applications.展开更多
The satellite-terrestrial networks possess the ability to transcend geographical constraints inherent in traditional communication networks,enabling global coverage and offering users ubiquitous computing power suppor...The satellite-terrestrial networks possess the ability to transcend geographical constraints inherent in traditional communication networks,enabling global coverage and offering users ubiquitous computing power support,which is an important development direction of future communications.In this paper,we take into account a multi-scenario network model under the coverage of low earth orbit(LEO)satellite,which can provide computing resources to users in faraway areas to improve task processing efficiency.However,LEO satellites experience limitations in computing and communication resources and the channels are time-varying and complex,which makes the extraction of state information a daunting task.Therefore,we explore the dynamic resource management issue pertaining to joint computing,communication resource allocation and power control for multi-access edge computing(MEC).In order to tackle this formidable issue,we undertake the task of transforming the issue into a Markov decision process(MDP)problem and propose the self-attention based dynamic resource management(SABDRM)algorithm,which effectively extracts state information features to enhance the training process.Simulation results show that the proposed algorithm is capable of effectively reducing the long-term average delay and energy consumption of the tasks.展开更多
In the Industrial Internet of Things(IIoT),sensors generate time series data to reflect the working state.When the systems are attacked,timely identification of outliers in time series is critical to ensure security.A...In the Industrial Internet of Things(IIoT),sensors generate time series data to reflect the working state.When the systems are attacked,timely identification of outliers in time series is critical to ensure security.Although many anomaly detection methods have been proposed,the temporal correlation of the time series over the same sensor and the state(spatial)correlation between different sensors are rarely considered simultaneously in these methods.Owing to the superior capability of Transformer in learning time series features.This paper proposes a time series anomaly detection method based on a spatial-temporal network and an improved Transformer.Additionally,the methods based on graph neural networks typically include a graph structure learning module and an anomaly detection module,which are interdependent.However,in the initial phase of training,since neither of the modules has reached an optimal state,their performance may influence each other.This scenario makes the end-to-end training approach hard to effectively direct the learning trajectory of each module.This interdependence between the modules,coupled with the initial instability,may cause the model to find it hard to find the optimal solution during the training process,resulting in unsatisfactory results.We introduce an adaptive graph structure learning method to obtain the optimal model parameters and graph structure.Experiments on two publicly available datasets demonstrate that the proposed method attains higher anomaly detection results than other methods.展开更多
Nowadays,ensuring thequality of networkserviceshas become increasingly vital.Experts are turning toknowledge graph technology,with a significant emphasis on entity extraction in the identification of device configurat...Nowadays,ensuring thequality of networkserviceshas become increasingly vital.Experts are turning toknowledge graph technology,with a significant emphasis on entity extraction in the identification of device configurations.This research paper presents a novel entity extraction method that leverages a combination of active learning and attention mechanisms.Initially,an improved active learning approach is employed to select the most valuable unlabeled samples,which are subsequently submitted for expert labeling.This approach successfully addresses the problems of isolated points and sample redundancy within the network configuration sample set.Then the labeled samples are utilized to train the model for network configuration entity extraction.Furthermore,the multi-head self-attention of the transformer model is enhanced by introducing the Adaptive Weighting method based on the Laplace mixture distribution.This enhancement enables the transformer model to dynamically adapt its focus to words in various positions,displaying exceptional adaptability to abnormal data and further elevating the accuracy of the proposed model.Through comparisons with Random Sampling(RANDOM),Maximum Normalized Log-Probability(MNLP),Least Confidence(LC),Token Entrop(TE),and Entropy Query by Bagging(EQB),the proposed method,Entropy Query by Bagging and Maximum Influence Active Learning(EQBMIAL),achieves comparable performance with only 40% of the samples on both datasets,while other algorithms require 50% of the samples.Furthermore,the entity extraction algorithm with the Adaptive Weighted Multi-head Attention mechanism(AW-MHA)is compared with BILSTM-CRF,Mutil_Attention-Bilstm-Crf,Deep_Neural_Model_NER and BERT_Transformer,achieving precision rates of 75.98% and 98.32% on the two datasets,respectively.Statistical tests demonstrate the statistical significance and effectiveness of the proposed algorithms in this paper.展开更多
Wheat is a critical crop,extensively consumed worldwide,and its production enhancement is essential to meet escalating demand.The presence of diseases like stem rust,leaf rust,yellow rust,and tan spot significantly di...Wheat is a critical crop,extensively consumed worldwide,and its production enhancement is essential to meet escalating demand.The presence of diseases like stem rust,leaf rust,yellow rust,and tan spot significantly diminishes wheat yield,making the early and precise identification of these diseases vital for effective disease management.With advancements in deep learning algorithms,researchers have proposed many methods for the automated detection of disease pathogens;however,accurately detectingmultiple disease pathogens simultaneously remains a challenge.This challenge arises due to the scarcity of RGB images for multiple diseases,class imbalance in existing public datasets,and the difficulty in extracting features that discriminate between multiple classes of disease pathogens.In this research,a novel method is proposed based on Transfer Generative Adversarial Networks for augmenting existing data,thereby overcoming the problems of class imbalance and data scarcity.This study proposes a customized architecture of Vision Transformers(ViT),where the feature vector is obtained by concatenating features extracted from the custom ViT and Graph Neural Networks.This paper also proposes a Model AgnosticMeta Learning(MAML)based ensemble classifier for accurate classification.The proposedmodel,validated on public datasets for wheat disease pathogen classification,achieved a test accuracy of 99.20%and an F1-score of 97.95%.Compared with existing state-of-the-art methods,this proposed model outperforms in terms of accuracy,F1-score,and the number of disease pathogens detection.In future,more diseases can be included for detection along with some other modalities like pests and weed.展开更多
针对主流Transformer网络仅对输入像素块做自注意力计算而忽略了不同像素块间的信息交互,以及输入尺度单一导致局部特征细节模糊的问题,本文提出一种基于Transformer并用于处理视觉任务的主干网络ConvFormer. ConvFormer通过所设计的多...针对主流Transformer网络仅对输入像素块做自注意力计算而忽略了不同像素块间的信息交互,以及输入尺度单一导致局部特征细节模糊的问题,本文提出一种基于Transformer并用于处理视觉任务的主干网络ConvFormer. ConvFormer通过所设计的多尺度混洗自注意力模块(Channel-Shuffle and Multi-Scale attention,CSMS)和动态相对位置编码模块(Dynamic Relative Position Coding,DRPC)来聚合多尺度像素块间的语义信息,并在前馈网络中引入深度卷积提高网络的局部建模能力.在公开数据集ImageNet-1K,COCO 2017和ADE20K上分别进行图像分类、目标检测和语义分割实验,ConvFormer-Tiny与不同视觉任务中同量级最优网络RetNetY-4G,Swin-Tiny和ResNet50对比,精度分别提高0.3%,1.4%和0.5%.展开更多
基金supported by the National Natural Science Foundation of China under Grant 62177029the Postgraduate Research&Practice Innovation Program of Jiangsu Province(KYCX21_0740),China.
文摘Visual object tracking plays a crucial role in computer vision.In recent years,researchers have proposed various methods to achieve high-performance object tracking.Among these,methods based on Transformers have become a research hotspot due to their ability to globally model and contextualize information.However,current Transformer-based object tracking methods still face challenges such as low tracking accuracy and the presence of redundant feature information.In this paper,we introduce self-calibration multi-head self-attention Transformer(SMSTracker)as a solution to these challenges.It employs a hybrid tensor decomposition self-organizing multihead self-attention transformermechanism,which not only compresses and accelerates Transformer operations but also significantly reduces redundant data,thereby enhancing the accuracy and efficiency of tracking.Additionally,we introduce a self-calibration attention fusion block to resolve common issues of attention ambiguities and inconsistencies found in traditional trackingmethods,ensuring the stability and reliability of tracking performance across various scenarios.By integrating a hybrid tensor decomposition approach with a self-organizingmulti-head self-attentive transformer mechanism,SMSTracker enhances the efficiency and accuracy of the tracking process.Experimental results show that SMSTracker achieves competitive performance in visual object tracking,promising more robust and efficient tracking systems,demonstrating its potential to providemore robust and efficient tracking solutions in real-world applications.
基金supported by the National Key Research and Development Plan(No.2022YFB2902701)the key Natural Science Foundation of Shenzhen(No.JCYJ20220818102209020).
文摘The satellite-terrestrial networks possess the ability to transcend geographical constraints inherent in traditional communication networks,enabling global coverage and offering users ubiquitous computing power support,which is an important development direction of future communications.In this paper,we take into account a multi-scenario network model under the coverage of low earth orbit(LEO)satellite,which can provide computing resources to users in faraway areas to improve task processing efficiency.However,LEO satellites experience limitations in computing and communication resources and the channels are time-varying and complex,which makes the extraction of state information a daunting task.Therefore,we explore the dynamic resource management issue pertaining to joint computing,communication resource allocation and power control for multi-access edge computing(MEC).In order to tackle this formidable issue,we undertake the task of transforming the issue into a Markov decision process(MDP)problem and propose the self-attention based dynamic resource management(SABDRM)algorithm,which effectively extracts state information features to enhance the training process.Simulation results show that the proposed algorithm is capable of effectively reducing the long-term average delay and energy consumption of the tasks.
基金This work is partly supported by the National Key Research and Development Program of China(Grant No.2020YFB1805403)the National Natural Science Foundation of China(Grant No.62032002)the 111 Project(Grant No.B21049).
文摘In the Industrial Internet of Things(IIoT),sensors generate time series data to reflect the working state.When the systems are attacked,timely identification of outliers in time series is critical to ensure security.Although many anomaly detection methods have been proposed,the temporal correlation of the time series over the same sensor and the state(spatial)correlation between different sensors are rarely considered simultaneously in these methods.Owing to the superior capability of Transformer in learning time series features.This paper proposes a time series anomaly detection method based on a spatial-temporal network and an improved Transformer.Additionally,the methods based on graph neural networks typically include a graph structure learning module and an anomaly detection module,which are interdependent.However,in the initial phase of training,since neither of the modules has reached an optimal state,their performance may influence each other.This scenario makes the end-to-end training approach hard to effectively direct the learning trajectory of each module.This interdependence between the modules,coupled with the initial instability,may cause the model to find it hard to find the optimal solution during the training process,resulting in unsatisfactory results.We introduce an adaptive graph structure learning method to obtain the optimal model parameters and graph structure.Experiments on two publicly available datasets demonstrate that the proposed method attains higher anomaly detection results than other methods.
基金supported by the National Key R&D Program of China(2019YFB2103202).
文摘Nowadays,ensuring thequality of networkserviceshas become increasingly vital.Experts are turning toknowledge graph technology,with a significant emphasis on entity extraction in the identification of device configurations.This research paper presents a novel entity extraction method that leverages a combination of active learning and attention mechanisms.Initially,an improved active learning approach is employed to select the most valuable unlabeled samples,which are subsequently submitted for expert labeling.This approach successfully addresses the problems of isolated points and sample redundancy within the network configuration sample set.Then the labeled samples are utilized to train the model for network configuration entity extraction.Furthermore,the multi-head self-attention of the transformer model is enhanced by introducing the Adaptive Weighting method based on the Laplace mixture distribution.This enhancement enables the transformer model to dynamically adapt its focus to words in various positions,displaying exceptional adaptability to abnormal data and further elevating the accuracy of the proposed model.Through comparisons with Random Sampling(RANDOM),Maximum Normalized Log-Probability(MNLP),Least Confidence(LC),Token Entrop(TE),and Entropy Query by Bagging(EQB),the proposed method,Entropy Query by Bagging and Maximum Influence Active Learning(EQBMIAL),achieves comparable performance with only 40% of the samples on both datasets,while other algorithms require 50% of the samples.Furthermore,the entity extraction algorithm with the Adaptive Weighted Multi-head Attention mechanism(AW-MHA)is compared with BILSTM-CRF,Mutil_Attention-Bilstm-Crf,Deep_Neural_Model_NER and BERT_Transformer,achieving precision rates of 75.98% and 98.32% on the two datasets,respectively.Statistical tests demonstrate the statistical significance and effectiveness of the proposed algorithms in this paper.
基金Researchers Supporting Project Number(RSPD2024R 553),King Saud University,Riyadh,Saudi Arabia.
文摘Wheat is a critical crop,extensively consumed worldwide,and its production enhancement is essential to meet escalating demand.The presence of diseases like stem rust,leaf rust,yellow rust,and tan spot significantly diminishes wheat yield,making the early and precise identification of these diseases vital for effective disease management.With advancements in deep learning algorithms,researchers have proposed many methods for the automated detection of disease pathogens;however,accurately detectingmultiple disease pathogens simultaneously remains a challenge.This challenge arises due to the scarcity of RGB images for multiple diseases,class imbalance in existing public datasets,and the difficulty in extracting features that discriminate between multiple classes of disease pathogens.In this research,a novel method is proposed based on Transfer Generative Adversarial Networks for augmenting existing data,thereby overcoming the problems of class imbalance and data scarcity.This study proposes a customized architecture of Vision Transformers(ViT),where the feature vector is obtained by concatenating features extracted from the custom ViT and Graph Neural Networks.This paper also proposes a Model AgnosticMeta Learning(MAML)based ensemble classifier for accurate classification.The proposedmodel,validated on public datasets for wheat disease pathogen classification,achieved a test accuracy of 99.20%and an F1-score of 97.95%.Compared with existing state-of-the-art methods,this proposed model outperforms in terms of accuracy,F1-score,and the number of disease pathogens detection.In future,more diseases can be included for detection along with some other modalities like pests and weed.
文摘针对主流Transformer网络仅对输入像素块做自注意力计算而忽略了不同像素块间的信息交互,以及输入尺度单一导致局部特征细节模糊的问题,本文提出一种基于Transformer并用于处理视觉任务的主干网络ConvFormer. ConvFormer通过所设计的多尺度混洗自注意力模块(Channel-Shuffle and Multi-Scale attention,CSMS)和动态相对位置编码模块(Dynamic Relative Position Coding,DRPC)来聚合多尺度像素块间的语义信息,并在前馈网络中引入深度卷积提高网络的局部建模能力.在公开数据集ImageNet-1K,COCO 2017和ADE20K上分别进行图像分类、目标检测和语义分割实验,ConvFormer-Tiny与不同视觉任务中同量级最优网络RetNetY-4G,Swin-Tiny和ResNet50对比,精度分别提高0.3%,1.4%和0.5%.