Abstract: Ensuring the quality of network services has become increasingly vital. Experts are turning to knowledge graph technology, with a significant emphasis on entity extraction in the identification of device configurations. This paper presents a novel entity extraction method that combines active learning and attention mechanisms. Initially, an improved active learning approach is employed to select the most valuable unlabeled samples, which are subsequently submitted for expert labeling. This approach addresses the problems of isolated points and sample redundancy within the network configuration sample set. The labeled samples are then used to train the model for network configuration entity extraction. Furthermore, the multi-head self-attention of the transformer model is enhanced by introducing an Adaptive Weighting method based on the Laplace mixture distribution. This enhancement enables the transformer model to dynamically adapt its focus to words in various positions, displaying exceptional adaptability to abnormal data and further improving the accuracy of the proposed model. In comparisons with Random Sampling (RANDOM), Maximum Normalized Log-Probability (MNLP), Least Confidence (LC), Token Entropy (TE), and Entropy Query by Bagging (EQB), the proposed method, Entropy Query by Bagging and Maximum Influence Active Learning (EQBMIAL), achieves comparable performance with only 40% of the samples on both datasets, while the other algorithms require 50% of the samples. Furthermore, the entity extraction algorithm with the Adaptive Weighted Multi-head Attention mechanism (AW-MHA) is compared with BILSTM-CRF, Mutil_Attention-Bilstm-Crf, Deep_Neural_Model_NER and BERT_Transformer, achieving precision rates of 75.98% and 98.32% on the two datasets, respectively. Statistical tests demonstrate the statistical significance and effectiveness of the proposed algorithms.
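The Least Confidence (LC) baseline named in this abstract can be illustrated with a short sketch. It shows the generic LC criterion (score each unlabeled sample by one minus its highest predicted class probability, then send the most uncertain samples for labeling), not the EQBMIAL method itself, whose details the abstract does not give. The function names, the `budget` parameter, and the probability values are hypothetical.

```python
def least_confidence_scores(probabilities):
    """Return 1 - max class probability per sample (higher means more uncertain)."""
    return [1.0 - max(p) for p in probabilities]


def select_for_labeling(probabilities, budget):
    """Return indices of the `budget` most uncertain samples, most uncertain first."""
    scores = least_confidence_scores(probabilities)
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:budget]


# Hypothetical predicted class distributions for four unlabeled samples.
pool = [
    [0.90, 0.05, 0.05],
    [0.40, 0.35, 0.25],
    [0.60, 0.30, 0.10],
    [0.34, 0.33, 0.33],
]
chosen = select_for_labeling(pool, budget=2)
print(chosen)
```

Here the near-uniform distributions are selected first, since the model is least sure about them.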
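Abstract: To address the limited ability of image recognition methods to capture global features and the resulting difficulty in improving recognition accuracy, an image recognition method based on a randomly augmented, lightweight Swin-Tiny Transformer model is proposed. In the preprocessing stage, the method applies a random data augmentation based enhancement (RDABE) algorithm to enhance image features, and it uses the Transformer self-attention mechanism to obtain more comprehensive high-level visual semantic information. The Swin-Tiny Transformer model was optimized and fine-tuned on a maize disease dataset, which verified the applicability of the algorithm to maize diseases in agriculture and achieved more accurate disease detection. Experimental results show that the lightweight Swin-Tiny+RDABE model with random augmentation achieves a recognition accuracy of 93.5867% on maize disease images. In comparison experiments with the same parameter weights against well-performing lightweight Transformer and convolutional neural network (CNN) models, the accuracy of the improved model is 1.1877% to 4.9881% higher than that of the Swin-Tiny Transformer, Deit3_Small, Vit_Small, Mobilenet_V3_Small, ShufflenetV2, and Efficientnet_B1_Pruned models, and the model converges quickly.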
Funding: Supported by the Shanghai Artificial Intelligence Laboratory and the National Natural Science Foundation of China (Grant Nos. 42088101 and 42030605).
Abstract: Recent studies have shown that deep learning (DL) models can skillfully forecast El Niño–Southern Oscillation (ENSO) events more than 1.5 years in advance. However, concerns regarding the reliability of predictions made by DL methods persist, including potential overfitting issues and lack of interpretability. Here, we propose ResoNet, a DL model that combines CNN (convolutional neural network) and transformer architectures. This hybrid architecture enables our model to adequately capture local sea surface temperature anomalies as well as long-range inter-basin interactions across oceans. We show that ResoNet can robustly predict ENSO at lead times of 19 months, thus outperforming existing approaches in terms of the forecast horizon. According to an explainability method applied to ResoNet predictions of El Niño and La Niña from 1- to 18-month leads, we find that it predicts the Niño-3.4 index based on multiple physically reasonable mechanisms, such as the recharge oscillator concept, seasonal footprint mechanism, and Indian Ocean capacitor effect. Moreover, we demonstrate for the first time that the asymmetry between El Niño and La Niña development can be captured by ResoNet. Our results could help to alleviate skepticism about applying DL models for ENSO prediction and encourage more attempts to discover and predict climate phenomena using AI methods.
Funding: This research was funded by the Natural Science Foundation of Hebei Province (F2021506004).
Abstract: Transformer-based models have facilitated significant advances in object detection. However, their extensive computational consumption and suboptimal detection of dense small objects curtail their applicability in unmanned aerial vehicle (UAV) imagery. Addressing these limitations, we propose a hybrid transformer-based detector, H-DETR, and enhance it for dense small objects, leading to an accurate and efficient model. Firstly, we introduce a hybrid transformer encoder, which integrates a convolutional neural network-based cross-scale fusion module with the original encoder to handle multi-scale feature sequences more efficiently. Furthermore, we propose two novel strategies to enhance detection performance without incurring additional inference computation. Query filter is designed to cope with the dense clustering inherent in drone-captured images by counteracting similar queries with a training-aware non-maximum suppression. Adversarial denoising learning is a novel enhancement method inspired by adversarial learning, which improves the detection of numerous small targets by counteracting the effects of artificial spatial and semantic noise. Extensive experiments on the VisDrone and UAVDT datasets substantiate the effectiveness of our approach, achieving a significant improvement in accuracy with a reduction in computational complexity. Our method achieves 31.9% and 21.1% AP on the VisDrone and UAVDT datasets, respectively, and has a faster inference speed, making it a competitive model in UAV image object detection.
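For readers unfamiliar with non-maximum suppression, the following is a minimal sketch of the standard greedy, IoU-based variant. It is not the training-aware query filter of H-DETR, which the abstract does not specify; the box format `(x1, y1, x2, y2)`, the 0.5 threshold, and the sample boxes are assumptions chosen for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: visit boxes by descending score and keep a box only if
    it does not overlap an already kept box by more than the threshold."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


# Hypothetical detections: two heavily overlapping boxes and one separate box.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)
print(kept)
```

In densely clustered scenes, near-duplicate predictions like the first two boxes are common, which is the situation the abstract's query filter is said to target.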
Abstract: BACKGROUND: Malignant transformation (MT) of mature cystic teratoma (MCT) has a poor prognosis, especially in advanced cases. Concurrent chemoradiotherapy (CCRT) has an inhibitory effect on MT. CASE SUMMARY: Herein, we present a case in which CCRT had a reduction effect preoperatively. A 73-year-old woman with pyelonephritis was referred to our hospital. Computed tomography revealed right hydronephrosis and a 6-cm pelvic mass. Endoscopic ultrasound-guided fine-needle biopsy (EUS-FNB) revealed squamous cell carcinoma. The patient was diagnosed with MT of MCT. Due to her poor general condition and renal malfunction, we selected CCRT, expecting fewer adverse effects. After CCRT, her performance status improved, and the tumor size was reduced; surgery was performed. Five months postoperatively, the patient developed dissemination and lymph node metastases. Palliative chemotherapy was ineffective. She died 18 months after treatment initiation. CONCLUSION: EUS-FNB was useful in the diagnosis of MT of MCT; CCRT suppressed the disease and improved quality of life.
Funding: Supported by the Natural Science Foundation of Guangdong Province, Nos. 2019A1515010649 (to WC) and 2022A1515012044 (to JS), and the China Postdoctoral Science Foundation, No. 2018M633091 (to JS).
Abstract: Transforming growth factor-beta 1 (TGF-β1) has been extensively studied for its pleiotropic effects on central nervous system diseases. The neuroprotective or neurotoxic effects of TGF-β1 in specific brain areas may depend on the pathological process and cell types involved. Voltage-gated sodium channels (VGSCs) are essential ion channels for the generation of action potentials in neurons, and are involved in various neuroexcitation-related diseases. However, the effects of TGF-β1 on the functional properties of VGSCs and firing properties in cortical neurons remain unclear. In this study, we investigated the effects of TGF-β1 on VGSC function and firing properties in primary cortical neurons from mice. We found that TGF-β1 increased VGSC current density in a dose- and time-dependent manner, which was attributable to the upregulation of Nav1.3 expression. Increased VGSC current density and Nav1.3 expression were significantly abolished by preincubation with inhibitors of mitogen-activated protein kinase kinase (PD98059), p38 mitogen-activated protein kinase (SB203580), and Jun NH2-terminal kinase 1/2 (SP600125). Interestingly, TGF-β1 significantly increased the firing threshold of action potentials but did not change their firing rate in cortical neurons. These findings suggest that TGF-β1 can increase Nav1.3 expression through activation of the ERK1/2-JNK-MAPK pathway, which leads to a decrease in the firing threshold of action potentials in cortical neurons under pathological conditions. Thus, this contributes to the occurrence and progression of neuroexcitatory-related diseases of the central nervous system.
Funding: Supported in part by the National Natural Science Foundation of China under Grants 61502162, 61702175, and 61772184; in part by the Fund of the State Key Laboratory of Geo-information Engineering under Grant SKLGIE2016-M-4-2; in part by the Hunan Natural Science Foundation of China under Grant 2018JJ2059; in part by the Key R&D Project of Hunan Province of China under Grant 2018GK2014; in part by the Open Fund of the State Key Laboratory of Integrated Services Networks under Grant ISN17-14; and by the Chinese Scholarship Council (CSC) through the College of Computer Science and Electronic Engineering, Hunan University, Changsha 410082, with Grant CSC No. 2018GXZ020784.
Abstract: Transformer models have emerged as dominant networks for various tasks in computer vision compared to Convolutional Neural Networks (CNNs). The transformers demonstrate the ability to model long-range dependencies by utilizing a self-attention mechanism. This study aims to provide a comprehensive survey of recent transformer-based approaches in image and video applications, as well as diffusion models. We begin by discussing existing surveys of vision transformers and comparing them to this work. Then, we review the main components of a vanilla transformer network, including the self-attention mechanism, feed-forward network, position encoding, etc. In the main part of this survey, we review recent transformer-based models in three categories: Transformer for downstream tasks, Vision Transformer for Generation, and Vision Transformer for Segmentation. We also provide a comprehensive overview of recent transformer models for video tasks and diffusion models. We compare the performance of various hierarchical transformer networks for multiple tasks on popular benchmark datasets. Finally, we explore some future research directions to further improve the field.
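The self-attention mechanism that this survey lists as a core component of the vanilla transformer can be sketched in a few lines. The example below is a single-head, pure-Python illustration of the standard scaled dot-product form, softmax(QKᵀ/√d_k)V, and is not taken from the survey; the helper names, identity projection matrices, and toy token vectors are assumptions for illustration only.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention over token rows in x."""
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    d_k = len(k[0])
    scores = matmul(q, transpose(k))
    # Every token attends to every other token, which is how long-range
    # dependencies are modeled in a single layer.
    weights = [softmax([s / math.sqrt(d_k) for s in row]) for row in scores]
    return matmul(weights, v), weights


# Toy input: three 2-dimensional tokens and identity projections (hypothetical).
identity = [[1.0, 0.0], [0.0, 1.0]]
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
output, weights = self_attention(tokens, identity, identity, identity)
```

Each row of `weights` is a probability distribution over all tokens, so each output row is a weighted average of the value vectors.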
Funding: Funded by the Chongqing Normal University Startup Foundation for PhD (22XLB021) and supported by the Open Research Project of the State Key Laboratory of Industrial Control Technology, Zhejiang University, China (No. ICT2023B40).
Abstract: Cloud detection from satellite and drone imagery is crucial for applications such as weather forecasting and environmental monitoring. Addressing the limitations of conventional convolutional neural networks, we propose an innovative transformer-based method. This method leverages transformers, which are adept at processing data sequences, to enhance cloud detection accuracy. Additionally, we introduce a Cyclic Refinement Architecture that improves the resolution and quality of feature extraction, thereby aiding in the retention of critical details often lost during cloud detection. Our extensive experimental validation shows that our approach significantly outperforms established models, excelling in high-resolution feature extraction and precise cloud segmentation. By integrating Positional Visual Transformers (PVT) with this architecture, our method advances high-resolution feature delineation and segmentation accuracy. Ultimately, our research offers a novel perspective for surmounting traditional challenges in cloud detection and contributes to the advancement of precise and dependable image analysis across various domains.
Funding: Supported by the Strategic Priority Research Program of the Chinese Academy of Sciences (Grant No. XDC02040300).
Abstract: RFID-based human activity recognition (HAR) attracts attention due to its convenience, noninvasiveness, and privacy protection. Existing RFID-based HAR methods use modeling, CNN, or LSTM to extract features effectively. Still, they have shortcomings: 1) requiring complex hand-crafted data cleaning processes and 2) only addressing single-person activity recognition based on specific RF signals. To solve these problems, this paper proposes a novel device-free method based on a Time-streaming Multiscale Transformer called TransTM. This model leverages the Transformer's powerful data fitting capabilities to take raw RFID RSSI data as input without pre-processing. Concretely, we propose a multiscale convolutional hybrid Transformer to capture behavioral features that recognizes single-human activities and human-to-human interactions. Compared with existing CNN- and LSTM-based methods, the Transformer-based method has more data fitting power, generalization, and scalability. Furthermore, using RF signals, our method achieves an excellent classification effect on human behavior-based classification tasks. Experimental results on the actual RFID datasets show that this model achieves a high average recognition accuracy (99.1%). The dataset we collected for detecting RFID-based indoor human activities will be published.
Abstract: To tackle the problem of inaccurate short-term bus load prediction, especially during holidays, a Transformer-based scheme with tailored architectural enhancements is proposed. First, the input data are clustered to reduce complexity and capture inherent characteristics more effectively. Gated residual connections are then employed to selectively propagate salient features across layers, while an attention mechanism focuses on identifying prominent patterns in multivariate time-series data. Ultimately, a pre-trained structure is incorporated to reduce computational complexity. Experimental results based on extensive data show that the proposed scheme achieves improved prediction accuracy over comparative algorithms by at least 32.00% consistently across all buses evaluated, and the fitting effect of holiday load curves is outstanding. Meanwhile, the pre-trained structure drastically reduces the training time of the proposed algorithm by more than 65.75%. The proposed scheme can efficiently predict bus load results while enhancing robustness for holiday predictions, making it better adapted to real-world prediction scenarios.
Funding: National Natural Science Foundation of China (No. 62201457) and Natural Science Foundation of Shaanxi Province (Nos. 2022JQ-668, 2022JQ-588).
Abstract: Convolutional neural networks (CNNs) have an excellent ability to model local contextual information. However, CNNs face challenges in describing long-range semantic features, which leads to relatively low classification accuracy for hyperspectral images. To address this problem, this article proposes an algorithm based on multiscale fusion and a transformer network for hyperspectral image classification. Firstly, the low-level spatial-spectral features are extracted by a multi-scale residual structure. Secondly, an attention module is introduced to focus on the more important spatial-spectral information. Finally, high-level semantic features are represented and learned by a token learner and an improved transformer encoder. The proposed algorithm is compared with six classical hyperspectral classification algorithms on real hyperspectral images. The experimental results show that the proposed algorithm effectively improves the land cover classification accuracy of hyperspectral images.
Funding: Guangdong Basic and Applied Basic Research Foundation under Grant No. 2024A1515012485; in part by the Shenzhen Fundamental Research Program under Grant JCYJ20220810112354002.
Abstract: This paper addresses the problem of predicting population density from cellular station data. Because wireless communication devices are in widespread use, cellular station data has become integral to estimating population figures and studying population movement, and it can therefore contribute significantly to urban planning. However, existing research has difficulty with both the preprocessing of base station data and the modeling of population prediction. To address this, we propose methods for preprocessing cellular station data that eliminate irregular and redundant records. The preprocessing reveals distinct cyclical characteristics and high-frequency variation in population shifts. Further, we devise a multi-view enhancement model based on the Transformer (MVformer), aimed at improving the accuracy of long-horizon time-series population prediction. Comparative experiments, conducted on the above-mentioned population dataset against four other Transformer-based models, indicate that the proposed MVformer model improves prediction accuracy by approximately 30% for both univariate and multivariate time-series prediction tasks.
Funding: This work is partly supported by the National Key Research and Development Program of China (Grant No. 2020YFB1805403), the National Natural Science Foundation of China (Grant No. 62032002), and the 111 Project (Grant No. B21049).
Abstract: In the Industrial Internet of Things (IIoT), sensors generate time series data that reflect the working state of the system. When a system is attacked, timely identification of outliers in the time series is critical to ensuring security. Although many anomaly detection methods have been proposed, they rarely consider both the temporal correlation of the time series from a single sensor and the state (spatial) correlation between different sensors. Owing to the superior capability of the Transformer in learning time series features, this paper proposes a time series anomaly detection method based on a spatial-temporal network and an improved Transformer. Additionally, methods based on graph neural networks typically include a graph structure learning module and an anomaly detection module, which are interdependent. In the initial phase of training, neither module has reached an optimal state, so the performance of each may affect the other. This makes it hard for end-to-end training to direct the learning trajectory of each module effectively. The interdependence between the modules, coupled with the initial instability, may prevent the model from finding the optimal solution during training, leading to unsatisfactory results. We introduce an adaptive graph structure learning method to obtain the optimal model parameters and graph structure. Experiments on two publicly available datasets demonstrate that the proposed method achieves better anomaly detection results than other methods.
Funding: Supported by the National Key R&D Program of China (2019YFB2103202).
Abstract: Ensuring the quality of network services has become increasingly vital. Experts are turning to knowledge graph technology, with a significant emphasis on entity extraction for identifying device configurations. This paper presents a novel entity extraction method that combines active learning and attention mechanisms. First, an improved active learning approach is employed to select the most valuable unlabeled samples, which are then submitted for expert labeling. This approach addresses the problems of isolated points and sample redundancy within the network configuration sample set. The labeled samples are then used to train the model for network configuration entity extraction. Furthermore, the multi-head self-attention of the transformer model is enhanced by introducing an Adaptive Weighting method based on the Laplace mixture distribution. This enhancement enables the transformer model to dynamically adapt its focus to words in various positions, making it more robust to abnormal data and further improving the accuracy of the proposed model. In comparisons with Random Sampling (RANDOM), Maximum Normalized Log-Probability (MNLP), Least Confidence (LC), Token Entropy (TE), and Entropy Query by Bagging (EQB), the proposed method, Entropy Query by Bagging and Maximum Influence Active Learning (EQBMIAL), achieves comparable performance with only 40% of the samples on both datasets, while the other algorithms require 50% of the samples. Furthermore, the entity extraction algorithm with the Adaptive Weighted Multi-head Attention mechanism (AW-MHA) is compared with BILSTM-CRF, Mutil_Attention-Bilstm-Crf, Deep_Neural_Model_NER and BERT_Transformer, achieving precision rates of 75.98% and 98.32% on the two datasets, respectively. Statistical tests demonstrate the statistical significance and effectiveness of the proposed algorithms.
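Two of the baseline sampling strategies named in this abstract, Least Confidence and entropy-based selection, can be sketched in a few lines. This is a simplified illustration and not the authors' EQBMIAL method: it assumes each unlabeled sample comes with a single predicted class distribution, whereas entity extraction would score token sequences. The names `least_confidence`, `entropy`, and `select_for_labeling` are hypothetical.

```python
import math


def least_confidence(probs):
    """Uncertainty as one minus the probability of the most likely class."""
    return 1.0 - max(probs)


def entropy(probs):
    """Shannon entropy (natural log) of a predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_for_labeling(pool, score_fn, budget):
    """Return the ids of the `budget` most uncertain samples.

    pool maps a sample id to the model's predicted class probabilities
    (an assumed interface for this sketch). Ids are ranked by score_fn,
    highest uncertainty first.
    """
    ranked = sorted(pool, key=lambda sid: score_fn(pool[sid]), reverse=True)
    return ranked[:budget]
```

A call such as `select_for_labeling(pool, least_confidence, budget=2)` returns the two samples the model is least sure about, which would then be sent for expert labeling. Swapping in `entropy` changes only the scoring rule, not the selection loop.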