Automatic crack detection for cement pavement has benefited chiefly from the rapid development of deep learning, with convolutional neural networks (CNNs) playing an important role in this field. However, as crack detection performance has improved, the depth and width of the network structures have increased significantly, which demands more computing power and storage space. This limitation hampers the practical deployment of crack detection models on various platforms, particularly portable devices such as small mobile devices. To solve these problems, we propose a dual-encoder network architecture that focuses on extracting more comprehensive crack feature information and combines cross-fusion modules with coordinate attention mechanisms for more efficient feature fusion. First, we use small-channel convolutions to construct a shallow feature extraction module (SFEM) that extracts low-level crack features from cement pavement images, so as to obtain more crack information from the shallow features of the images. In addition, we construct a large kernel atrous convolution (LKAC) module to enhance crack information; it incorporates a coordinate attention mechanism to filter out non-crack information and uses large-kernel atrous convolutions with different kernels, and hence different receptive fields, to extract more detailed edge and context information. Finally, the three-stage feature map outputs of the SFEM are cross-fused with the two-stage feature map outputs of the LKAC module, and the shallow features and detailed edge features are fully fused to obtain the final crack prediction map. We evaluate our method on three public crack datasets: DeepCrack, CFD, and Crack500. Experimental results on the DeepCrack dataset demonstrate the effectiveness of the proposed method compared with state-of-the-art crack detection methods: it achieves a Precision (P) of 87.2%, a Recall (R) of 87.7%, and an F-score (F1) of 87.4%. Because the crack detection model is lightweight, its parameter count in real-world detection scenarios is reduced to less than 2M. This also provides technical support for detection in portable scenarios.
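As a rough illustration of why atrous (dilated) convolutions with different kernels yield different receptive fields, the spatial extent covered by a single k×k kernel with dilation rate d is k + (k - 1)(d - 1). The sketch below only evaluates that standard formula; the kernel sizes and dilation rates are illustrative assumptions, since the abstract does not state the values used in the LKAC module.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Spatial extent covered by one dilated (atrous) convolution kernel."""
    if kernel_size < 1 or dilation < 1:
        raise ValueError("kernel_size and dilation must be positive integers")
    return kernel_size + (kernel_size - 1) * (dilation - 1)


# Hypothetical (kernel, dilation) pairs, chosen only for illustration.
for k, d in [(3, 1), (3, 2), (7, 2), (7, 3)]:
    print(f"kernel={k}, dilation={d} -> effective size {effective_kernel_size(k, d)}")
```

A larger effective size gathers more context for the same number of weights, which fits the lightweight design goal described in the abstract.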
To prevent possible casualties and economic loss, accurate prediction of the Remaining Useful Life (RUL) is critical in rail prognostics and health management. However, when modeling long time series of rail damage, traditional neural networks have difficulty capturing long-term dependencies because of the coupling among multi-channel data from multiple sensors. In this paper, a novel RUL prediction model with an enhanced pulse separable convolution is proposed to solve this issue. First, a coding module based on an improved pulse separable convolutional network is established to effectively model the relationships within the data. To enhance the network, an alternate gradient back-propagation method is implemented, and an efficient channel attention (ECA) mechanism is introduced to better emphasize the useful pulse characteristics. Second, an optimized Transformer encoder is designed to serve as the backbone of the model. It can efficiently capture the relationships within and between the data at each time step of a long, full-life-cycle time series. More importantly, the Transformer encoder is improved by integrating pulse maximum pooling to retain more pulse timing characteristics. Finally, based on the features of the preceding layer, the final predicted RUL value is output, providing an end-to-end solution. The empirical findings validate the efficacy of the proposed approach in forecasting rail RUL, surpassing various existing data-driven prognostic techniques. The proposed method also shows good generalization performance on the PHM2012 bearing dataset.
The goal of street-to-aerial cross-view image geo-localization is to determine the location of a query street-view image by retrieving the aerial-view image of the same place. The drastic viewpoint and appearance gap between aerial-view and street-view images poses a huge challenge for this task. In this paper, we propose a novel multiscale attention encoder to capture the multiscale contextual information of aerial-view and street-view images. To bridge the domain gap between the two views, we first use an inverse polar transform to approximately align the street-view images with the aerial-view images. Then, the proposed multiscale attention encoder converts each image into a feature representation under the guidance of the learnt multiscale information. Finally, we propose a novel global mining strategy that enables the network to pay more attention to hard negative exemplars. Experiments on standard benchmark datasets show that our approach obtains a top-1 recall rate of 81.39% on the CVUSA dataset and 71.52% on the CVACT dataset, achieving state-of-the-art performance and significantly outperforming most existing methods.
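The top-1 recall rate reported above can be sketched as follows: for each street-view query, the highest-scoring aerial image is retrieved and counted as a hit if it is the true match. The similarity values are hypothetical, and the convention that query i matches aerial image i is an assumption of this sketch, not something stated in the abstract.

```python
def top1_recall(similarity):
    """Fraction of queries whose best-scoring reference is the true match.

    similarity[i][j] is the score between query i and reference j;
    the true match of query i is assumed to be reference i.
    """
    if not similarity:
        raise ValueError("similarity matrix must not be empty")
    hits = 0
    for i, row in enumerate(similarity):
        best = max(range(len(row)), key=row.__getitem__)
        hits += int(best == i)
    return hits / len(similarity)


# Hypothetical scores for three street-view queries against three aerial images.
scores = [
    [0.9, 0.1, 0.3],  # query 0: correct (best is reference 0)
    [0.2, 0.4, 0.8],  # query 1: wrong (best is reference 2)
    [0.1, 0.2, 0.7],  # query 2: correct (best is reference 2)
]
print(f"top-1 recall: {top1_recall(scores):.2%}")
```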
The topological connectivity information derived from the brain functional network can bring new insights for diagnosing and analyzing dementia disorders. The brain functional network is well suited to linking abnormal connectivities with dementia disorders. However, it is difficult to obtain large amounts of brain functional network data, which hinders the widespread application of data-driven models in dementia diagnosis. In this study, a novel distribution-regularized adversarial graph auto-encoder (DAGAE) with a transformer is proposed to generate synthetic brain functional networks that augment the brain functional network dataset, improving the dementia diagnosis accuracy of data-driven models. Specifically, the label distribution is estimated to regularize the latent space learned by the graph encoder, which makes the learning process stable and the learned representation robust. In addition, a transformer generator is devised to map the node representations into node-to-node connections by exploring the long-term dependence of highly correlated distant brain regions. The typical topological properties and discriminative features can be preserved entirely. Furthermore, the generated brain functional networks improve prediction performance with different classifiers, and the approach can be applied to the analysis of other cognitive diseases. Experiments on the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset demonstrate that the proposed model can generate good brain functional networks. The classification results show that adding generated data achieves the best accuracy of 85.33%, sensitivity of 84.00%, and specificity of 86.67%. The proposed model also achieves superior performance compared with other related augmentation models. Overall, the proposed model effectively improves cognitive disease diagnosis by generating diverse brain functional networks.
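The three reported metrics are related through the confusion matrix, as the short sketch below shows. The counts are hypothetical: they were chosen only because they reproduce the reported percentages for a balanced test set, and the abstract does not give the actual test-set size or class balance.

```python
def diagnostic_metrics(tp: int, fn: int, tn: int, fp: int):
    """Return (accuracy, sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return accuracy, sensitivity, specificity


# Hypothetical counts: 75 patients and 75 controls.
acc, sen, spe = diagnostic_metrics(tp=63, fn=12, tn=65, fp=10)
print(f"accuracy={acc:.2%}, sensitivity={sen:.2%}, specificity={spe:.2%}")
```

With equal class sizes, accuracy is simply the mean of sensitivity and specificity, which is consistent with the figures quoted in the abstract.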
The detection of brain disease is an essential issue in medicine and research. Deep learning techniques have shown promising results in detecting and diagnosing brain diseases from magnetic resonance imaging (MRI) images. These techniques train neural networks on large datasets of MRI images, allowing the networks to learn patterns and features indicative of different brain diseases. However, several challenges and limitations still need to be addressed to further improve the accuracy and effectiveness of these techniques. This paper implements a Feature Enhanced Stacked Auto Encoder (FESAE) model to detect brain diseases. The results of the standard Stacked Auto Encoder (SAE) are trivial and not robust enough to boost the system's accuracy. Therefore, the standard SAE is replaced with a stacked feature-enhanced auto encoder whose feature enhancement function efficiently and effectively obtains non-trivial features from an image with less activation energy. The proposed model consists of four stages. First, pre-processing is performed to remove noise, and the greyscale image is converted to Red, Green, and Blue (RGB) to enhance feature details for discriminative feature extraction. Second, feature extraction is performed using the Discrete Wavelet Transform (DWT) and channelization to extract significant features for classification. Third, classification is performed to assign MRI images to four major classes: Normal, Tumor, Brain Stroke, and Alzheimer's. Finally, the FESAE model outperforms state-of-the-art machine learning and deep learning methods such as Artificial Neural Network (ANN), SAE, Random Forest (RF), and Logistic Regression (LR), achieving a high accuracy of 98.61% on a dataset of 2000 MRI images. The proposed model has significant potential for assisting radiologists in diagnosing brain diseases more accurately and improving patient outcomes.
Latent information is difficult to obtain from text in speech synthesis. Studies show that features extracted from speech carry additional information that can help text encoding. In the field of speech encoding, much work has been conducted along two lines: encoding speech frame by frame, and encoding the whole speech into a single vector. In both cases the scale is fixed, so encoding speech at an adjustable scale to capture more latent information is worth investigating. However, current alignment approaches support only frame-by-frame encoding and speech-to-vector encoding, and designing an alignment approach that supports adjustable-scale speech encoding remains a challenge. This paper presents a dynamic speech encoder with a new alignment approach that works in conjunction with frame-by-frame encoding and speech-to-vector encoding. The speech feature from our model serves three functions. First, it can reconstruct the original speech while its length equals the text length. Second, our model can obtain a text embedding from speech, and the encoded speech feature is similar to the text embedding result. Finally, it can transfer the style of synthesized speech and make it more similar to a given reference speech.
Existing web-based security applications have failed in many situations because of the sophistication of attackers. Among attacks on web applications, Cross-Site Scripting (XSS) is one of the dangerous attacks encountered, in which an organization's or user's information is modified. To address these security challenges, this article proposes a novel, comprehensive combination of machine learning (NB, SVM, k-NN) and deep learning (RNN, CNN, LSTM) frameworks for detecting and defending against XSS attacks with high accuracy and efficiency. Based on the representation, a novel idea for merging a stacking ensemble with web applications, termed “hybrid stacking”, is proposed. To implement these methods, four distinct datasets, each containing both safe and unsafe content, are considered. The hybrid detection method can adaptively identify attacks from the URL, and the defense mechanism inherits the advantages of URL encoding with dictionary-based mapping to improve prediction accuracy, accelerate the training process, and effectively remove unsafe JScript/JavaScript keywords from the URL. The simulation results show that the proposed hybrid model is more efficient than existing detection methods. It classifies XSS attacks with scores above 99.5% (accuracy, precision, recall, f1_score, and Receiver Operating Characteristic (ROC)) and is highly resistant to XSS attacks. To ensure the security of the server's information, the proposed hybrid approach is demonstrated in a real-time environment.
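The idea of URL decoding followed by dictionary-based keyword mapping can be sketched roughly as below. This is an illustration under stated assumptions, not the article's implementation: the keyword dictionary, its integer IDs, and the function names are all hypothetical. Only `urllib.parse.unquote` from the Python standard library is used.

```python
from typing import List
from urllib.parse import unquote

# Hypothetical keyword dictionary; the article's actual mapping is not given.
KEYWORD_IDS = {
    "<script": 1,
    "javascript:": 2,
    "onerror": 3,
    "alert": 4,
    "document.cookie": 5,
}


def decode_url(url: str, max_rounds: int = 3) -> str:
    """Percent-decode repeatedly (to expose double encoding), then lowercase."""
    for _ in range(max_rounds):
        decoded = unquote(url)
        if decoded == url:
            break
        url = decoded
    return url.lower()


def keyword_features(url: str) -> List[int]:
    """Return the sorted IDs of dictionary keywords found in the decoded URL."""
    text = decode_url(url)
    return sorted(kid for kw, kid in KEYWORD_IDS.items() if kw in text)


print(keyword_features("http://example.com/?q=%3Cscript%3Ealert(1)%3C%2Fscript%3E"))
print(keyword_features("http://example.com/index.html?page=2"))
```

Features of this kind could then be passed to the base classifiers of a stacking ensemble; how the article actually encodes and combines them is not specified in the abstract.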
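First, the strong contextual understanding of the bidirectional encoder representations from transformers (BERT) model is used to extract deep semantic features from data-law texts. A fine-grained feature extraction layer is then introduced which, following an attention mechanism, focuses on the key parts of the text that are relevant to data-law question answering. Finally, the model is trained and evaluated on the collected legal question-answering dataset. The results show that, compared with several traditional single models, the proposed model improves on key performance metrics including accuracy, precision, recall, and F1 score. This indicates that the system can understand and respond to complex data-law questions more effectively and provide higher-quality question-answering services for data-law professionals and public users.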
To enhance the accuracy of Air Traffic Control (ATC) cybersecurity attack detection, this paper designs a new clustering detection method for ATC network security attacks. The feature set for ATC cybersecurity attacks is constructed by setting the feature states, adding recursive features, and determining feature criticality. The expected information and entropy of the feature data are computed to determine the information gain of the feature data and reduce the interference of similar feature data. An autoencoder is introduced into the artificial intelligence (AI) algorithm to encode and decode the characteristics of ATC network security attack behavior, reducing the dimensionality of the attack behavior data. Based on this processing, an unsupervised learning algorithm for clustering detection of ATC network security attacks is designed. First, the distances between clusters of ATC network security attack behavior features are determined, the clustering threshold is calculated, and the initial cluster centers are constructed. Then, the mean of all feature objects in each cluster is recalculated as the new cluster center. Next, all objects in each cluster of attack behavior feature data are traversed. Finally, cluster detection of ATC network security attack behavior is completed by computing the objective functions. The experiment used three groups of experimental attack behavior datasets as test objects, took the detection rate, false detection rate, and recall rate as test indicators, and selected three similar methods for comparison. The experimental results show that the detection rate of this method is about 98%, the false positive rate is below 1%, and the recall rate is above 97%. The results show that this method can improve the detection of security attacks in air traffic control networks.
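The entropy and information-gain step mentioned above can be illustrated with the standard Shannon definitions. This is a minimal sketch assuming discrete feature values and class labels; the toy feature names and labels are hypothetical, and the paper's exact formulation may differ.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of discrete labels."""
    total = len(labels)
    return sum(-(c / total) * math.log2(c / total) for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """Entropy of the labels minus their expected entropy after splitting on the feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    expected = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - expected


# Hypothetical toy data: four traffic records with two candidate features.
labels = ["attack", "attack", "normal", "normal"]
flag_set = ["yes", "yes", "no", "no"]      # separates the classes perfectly
port_parity = ["odd", "even", "odd", "even"]  # carries no class information

print(information_gain(flag_set, labels))     # high gain: keep this feature
print(information_gain(port_parity, labels))  # zero gain: candidate for removal
```

Ranking features by gain in this way is one plausible reading of how interference from uninformative or redundant features is reduced before clustering.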
The challenging task of handwriting style synthesis requires capturing the individuality and diversity of human handwriting. Most currently available methods use either a generative adversarial network (GAN) or a recurrent neural network (RNN) to generate new handwriting styles. These techniques frequently fall short of producing diverse and realistic text images, particularly for words that are not commonly used. To resolve this, this research proposes a novel deep learning model consisting of a style encoder and a text generator to synthesize different handwriting styles. The network excels at conditional text generation by extracting style vectors from a series of style images. The model performs well on a range of handwriting synthesis tasks, including the generation of out-of-vocabulary text. It outperforms previous approaches, showing lower values on key GAN evaluation metrics, such as Geometric Score (GS) (3.21×10^(-5)) and Fréchet Inception Distance (FID) (8.75), as well as on text recognition metrics such as Character Error Rate (CER) and Word Error Rate (WER). A thorough component analysis revealed a steady improvement in image generation quality, highlighting the importance of specific handwriting styles. Applicable fields include digital forensics, creative writing, and document security.
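The two text recognition metrics named above are both normalized edit distances. The sketch below uses the standard Levenshtein definition, at character level for CER and word level for WER; the example strings are hypothetical and the abstract does not report the paper's CER and WER values.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists of words)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character Error Rate: character edits divided by reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


def wer(reference: str, hypothesis: str) -> float:
    """Word Error Rate: word edits divided by the number of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


# Hypothetical recognizer output for a generated handwriting image.
print(cer("handwriting", "handwritting"))
print(wer("the quick brown fox", "the quick brown box"))
```

Lower values mean a recognizer reads the synthesized images more accurately, which is why lower CER and WER indicate more legible generated handwriting.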
Cyberbullying, a critical concern for digital safety, necessitates effective linguistic analysis tools that can navigate the complexities of language use in online spaces. To tackle this challenge, our study introduces a new approach employing the Bidirectional Encoder Representations from Transformers (BERT) base model (cased), originally pretrained on English. The model is adapted to recognize the intricate nuances of Arabic online communication, a key aspect often overlooked in conventional cyberbullying detection methods. Our model, E-BERT, is an end-to-end solution fine-tuned on a diverse dataset of Arabic social media (SM) tweets. Experimental results on a diverse Arabic dataset collected from the 'X platform' demonstrate a notable increase in detection accuracy and sensitivity compared with existing methods. E-BERT shows a substantial improvement in performance, evidenced by an accuracy of 98.45%, precision of 99.17%, recall of 99.10%, and an F1 score of 99.14%. The proposed E-BERT not only addresses a critical gap in cyberbullying detection in Arabic online forums but also sets a precedent for applying cross-lingual pretrained models in regional language applications, offering a scalable and effective framework for enhancing online safety across Arabic-speaking communities.
The rapid expansion of online content and big data has created an urgent need for efficient summarization techniques that allow vast textual documents to be comprehended swiftly without compromising their original integrity. Current approaches in Extractive Text Summarization (ETS) model inter-sentence relationships, a task of paramount importance in producing coherent summaries. This study introduces a model that integrates Graph Attention Networks (GATs) with Bidirectional Encoder Representations from Transformers (BERT) and Latent Dirichlet Allocation (LDA), further enhanced by Term Frequency-Inverse Document Frequency (TF-IDF) values, to improve sentence selection by capturing comprehensive topical information. Our approach constructs a graph whose nodes represent sentences, words, and topics, thereby increasing interconnectivity and enabling a more refined understanding of text structure. The model is extended from Single-Document Summarization to Multi-Document Summarization (MDS), offering significant improvements over existing models such as THGS-GMM and Topic-GraphSum, as demonstrated by empirical evaluations on benchmark news datasets such as Cable News Network (CNN)/Daily Mail (DM) and Multi-News. The results consistently demonstrate superior performance, showcasing the model's robustness in handling complex summarization tasks in both single-document and multi-document contexts. This research not only advances the integration of BERT and LDA within GATs but also emphasizes the model's capacity to manage global information effectively and adapt to diverse summarization challenges.
While encryption technology safeguards the security of network communications, malicious traffic also uses encryption protocols to obscure its behavior. Traditional machine learning methods rely on expert experience, and existing deep learning methods have insufficient representation capabilities for encrypted malicious traffic. To address these issues, we propose an encrypted malicious traffic classification method that integrates global semantic features with local spatiotemporal features, called the BERT-based Spatio-Temporal Features Network (BSTFNet). At the packet level, the model captures the global semantic features of packets through the attention mechanism of the Bidirectional Encoder Representations from Transformers (BERT) model. At the byte level, we first employ a Bidirectional Gated Recurrent Unit (BiGRU) model to extract temporal features from bytes, and then use a Text Convolutional Neural Network (TextCNN) model with multi-sized convolution kernels to extract local multi-receptive-field spatial features. The fusion of features from both granularities serves as the final multidimensional representation of malicious traffic. Our approach achieves an accuracy of 99.39% and an F1-score of 99.40% on the publicly available USTC-TFC2016 dataset, and effectively reduces sample confusion within the Neris and Virut categories. The experimental results demonstrate that our method has outstanding representation and classification capabilities for encrypted malicious traffic.
This study proposes a novel particle encoding mechanism that seamlessly incorporates the quantum properties of particles, with a specific emphasis on constituent quarks. The primary objective of this mechanism is to facilitate the digital registration and identification of a wide range of particle information. Its design ensures easy integration with different event generators and digital simulations commonly used in high-energy experiments. Moreover, this innovative framework can be easily expanded to encode complex multi-quark states comprising up to nine valence quarks and accommodating an angular momentum of up to 99/2. This versatility and scalability make it a valuable tool.
Sentence classification is the process of categorizing a sentence based on its context. Sentence categorization requires more semantic features than other tasks, such as dependency parsing, which requires more syntactic elements. Most existing strategies focus on the general semantics of a conversation without involving the context of the sentence, recognizing the progress, or comparing impacts. Here, an ensemble of pre-trained language models is used to classify the sentences of a conversation corpus. The conversational sentences are classified into four categories: information, question, directive, and commission. The resulting label sequences are used to analyze the progress of the conversation and predict its pecking order. An ensemble of Bidirectional Encoder Representations from Transformers (BERT), Robustly Optimized BERT Pretraining Approach (RoBERTa), Generative Pre-Trained Transformer (GPT), DistilBERT, and Generalized Autoregressive Pretraining for Language Understanding (XLNet) models is trained on the conversation corpus, with hyperparameter tuning carried out to improve sentence classification performance. This Ensemble of Pre-trained Language Models with Hyperparameter Tuning (EPLM-HT) system is trained on an annotated conversation dataset. The proposed approach outperformed the base BERT, GPT, DistilBERT, and XLNet transformer models. The proposed ensemble model with the fine-tuned parameters achieved an F1_score of 0.88.
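One common way to combine several classifiers is soft voting, in which the per-class probabilities of the models are averaged and the highest-scoring class wins. The abstract does not say how EPLM-HT combines its member models, so the sketch below is an assumption for illustration only; the probability values are hypothetical.

```python
LABELS = ["information", "question", "directive", "commission"]


def soft_vote(model_probs):
    """Average per-model class probabilities; return (winning label, averages)."""
    if not model_probs:
        raise ValueError("at least one model output is required")
    n_models = len(model_probs)
    averages = [
        sum(probs[i] for probs in model_probs) / n_models
        for i in range(len(LABELS))
    ]
    best = max(range(len(LABELS)), key=averages.__getitem__)
    return LABELS[best], averages


# Hypothetical outputs of three fine-tuned models for one sentence.
outputs = [
    [0.6, 0.2, 0.1, 0.1],  # model A favors "information"
    [0.3, 0.5, 0.1, 0.1],  # model B favors "question"
    [0.2, 0.6, 0.1, 0.1],  # model C favors "question"
]
label, averages = soft_vote(outputs)
print(label, [round(a, 3) for a in averages])
```

Averaging lets a confident minority model be outvoted by consistent agreement among the others, which is one reason ensembles often outperform their individual members.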
Distracted driving remains a primary factor in traffic accidents and poses a significant obstacle to advancing driver assistance technologies. Improving the accuracy of distracted driving detection can greatly reduce the occurrence of traffic accidents, thereby helping to safeguard drivers. However, detecting distracted driving behaviors remains challenging in real-world scenarios with complex backgrounds, varying target scales, and different resolutions. To address the low detection accuracy of existing distracted driving detection algorithms, and with practical application scenarios in mind, this paper proposes an improved distracted driving detection algorithm based on YOLOv5. The algorithm integrates Attention-based Intra-scale Feature Interaction (AIFI) into the backbone network, enabling it to enhance feature interactions within the same scale through the attention mechanism. By emphasizing important features, this approach improves detection accuracy and thus performance in complex backgrounds. Additionally, a Triple Feature Encoding (TFE) module is added to the neck network. This module encodes and fuses multi-scale features to create a more detailed and comprehensive feature representation, which enhances object detection and localization and gives the algorithm a fuller understanding of the image. Finally, the Shape-IoU (Intersection over Union) loss function replaces the original IoU loss for more precise bounding box regression. Comparative evaluation of the improved algorithm against the original YOLOv5 algorithm shows an average accuracy improvement of 1.8%, indicating significant advantages in detecting distracted driving.
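For readers unfamiliar with the quantity underlying these bounding-box losses, the sketch below computes plain IoU for two axis-aligned boxes. It deliberately does not implement Shape-IoU, which adds shape and scale terms not described in the abstract; the `(x1, y1, x2, y2)` box format and the sample boxes are assumptions of this example.

```python
def iou(box_a, box_b) -> float:
    """Intersection over Union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap region, clamped at zero when disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - intersection
    return intersection / union if union > 0 else 0.0


# Hypothetical predicted and ground-truth boxes.
predicted = (0, 0, 2, 2)
ground_truth = (1, 1, 3, 3)
print(f"IoU = {iou(predicted, ground_truth):.4f}")
print(f"IoU loss = {1 - iou(predicted, ground_truth):.4f}")
```

A basic IoU loss is `1 - IoU`; variants such as Shape-IoU refine this with extra penalty terms so that the gradient remains informative when boxes differ in shape or scale.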
Images and videos provide a wealth of information for people in production and daily life. Although most digital information is transmitted via optical fiber, image acquisition and transmission still rely heavily on electronic circuits. The development of all-optical transmission networks and optical computing frameworks points the way toward the next generation of data transmission and information processing. Here, we propose a high-speed, low-cost, multiplexed, parallel, one-piece all-fiber architecture for image acquisition, encoding, and transmission, called the Multicore Fiber Acquisition and Transmission Image System (MFAT). Based on the different spatial and modal channels of the multicore fiber, fiber-coupled self-encoding, and digital aperture decoding technology, scenes can be observed directly from up to 1 km away. The expanded capacity makes parallel coded transmission of multimodal high-quality data possible. MFAT requires no additional signal transmitting or receiving equipment. The all-fiber processing saves the time traditionally spent on signal conversion and image pre-processing (compression, encoding, and modulation). Additionally, it provides an effective solution for 2D information acquisition and transmission tasks in extreme environments such as high temperatures and electromagnetic interference.
Funding: Supported by the National Natural Science Foundation of China (No. 62176034), the Science and Technology Research Program of Chongqing Municipal Education Commission (No. KJZD-M202300604), and the Natural Science Foundation of Chongqing (Nos. cstc2021jcyj-msxmX0518, 2023NSCQ-MSX1781).
Funding: National Natural Science Foundation of China (Grant/Award Number: 62106177); Central University Basic Research Fund of China (No. 2042020KF0016); the supercomputing system in the Supercomputing Center of Wuhan University.
Abstract: The goal of street-to-aerial cross-view image geo-localization is to determine the location of a query street-view image by retrieving the aerial-view image of the same place. The drastic viewpoint and appearance gap between aerial-view and street-view images makes this task highly challenging. In this paper, we propose a novel multiscale attention encoder to capture the multiscale contextual information of aerial- and street-view images. To bridge the domain gap between the two views, we first use an inverse polar transform to make the street-view images approximately aligned with the aerial-view images. Then, the proposed multiscale attention encoder converts each image into a feature representation under the guidance of the learnt multiscale information. Finally, we propose a novel global mining strategy that enables the network to pay more attention to hard negative exemplars. Experiments on standard benchmark datasets show that our approach obtains a top-1 recall rate of 81.39% on the CVUSA dataset and 71.52% on the CVACT dataset, achieving state-of-the-art performance and significantly outperforming most existing methods.
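The top-1 recall quoted above is a retrieval metric: the fraction of street-view queries whose true aerial match is ranked first. A minimal sketch of how such a metric is typically computed follows. It assumes a hypothetical similarity matrix in which row i holds the scores of query i against every aerial image, and that the correct match for query i is aerial image i; neither detail comes from the paper.

```python
def top_k_recall(similarity, k=1):
    """Fraction of queries whose true match (index i for query i) is in the top k."""
    hits = 0
    for i, row in enumerate(similarity):
        # Rank gallery indices from most to least similar for this query.
        ranked = sorted(range(len(row)), key=lambda j: row[j], reverse=True)
        if i in ranked[:k]:
            hits += 1
    return hits / len(similarity)


# Toy scores for three queries against three aerial images (illustrative only).
scores = [
    [0.9, 0.1, 0.2],  # query 0: true match ranked first
    [0.8, 0.3, 0.1],  # query 1: true match ranked second
    [0.2, 0.1, 0.7],  # query 2: true match ranked first
]
print(top_k_recall(scores, k=1))
print(top_k_recall(scores, k=2))
```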
Funding: This paper is partially supported by the British Heart Foundation Accelerator Award, UK (AA\18\3\34220); Royal Society International Exchanges Cost Share Award, UK (RP202G0230); Hope Foundation for Cancer Research, UK (RM60G0680); Medical Research Council Confidence in Concept Award, UK (MC_PC_17171); Sino-UK Industrial Fund, UK (RP202G0289); Global Challenges Research Fund (GCRF), UK (P202PF11); LIAS Pioneering Partnerships Award, UK (P202ED10); Data Science Enhancement Fund, UK (P202RE237); Fight for Sight, UK (24NN201); Sino-UK Education Fund, UK (OP202006); Biotechnology and Biological Sciences Research Council, UK (RM32G0178B8); LIAS Seed Corn, UK (P202RE969).
Abstract: The topological connectivity information derived from the brain functional network can bring new insights for diagnosing and analyzing dementia disorders. The brain functional network is well suited to capturing the correlation between abnormal connectivities and dementia disorders. However, it is challenging to obtain large amounts of brain functional network data, which hinders the widespread application of data-driven models in dementia diagnosis. In this study, a novel distribution-regularized adversarial graph auto-encoder (DAGAE) with a transformer is proposed to generate synthetic brain functional networks that augment the brain functional network dataset, improving the dementia diagnosis accuracy of data-driven models. Specifically, the label distribution is estimated to regularize the latent space learned by the graph encoder, which makes the learning process stable and the learned representation robust. In addition, the transformer generator is devised to map the node representations into node-to-node connections by exploring the long-term dependence of highly correlated distant brain regions, so that the typical topological properties and discriminative features are preserved. Furthermore, the generated brain functional networks improve prediction performance across different classifiers, and the approach can be applied to analyze other cognitive diseases. Experiments on the Alzheimer’s Disease Neuroimaging Initiative (ADNI) dataset demonstrate that the proposed model can generate good brain functional networks. The classification results show that adding generated data achieves the best accuracy of 85.33%, sensitivity of 84.00%, and specificity of 86.67%. The proposed model also achieves superior performance compared with other related augmentation models. Overall, the proposed model effectively improves cognitive disease diagnosis by generating diverse brain functional networks.
Funding: Supported by Universiti Sains Malaysia (USM) under FRGS Grant Number FRGS/1/2020/TK03/USM/02/1; the authors also thank the School of Computer Sciences, USM, for its support.
Abstract: The detection of brain disease is an essential issue in medicine and research. Deep learning techniques have shown promising results in detecting and diagnosing brain diseases using magnetic resonance imaging (MRI) images. These techniques involve training neural networks on large datasets of MRI images, allowing the networks to learn patterns and features indicative of different brain diseases. However, several challenges and limitations still need to be addressed to further improve the accuracy and effectiveness of these techniques. This paper implements a Feature Enhanced Stacked Auto Encoder (FESAE) model to detect brain diseases. The standard Stacked Auto Encoder (SAE) yields trivial features and is not robust enough to boost the system’s accuracy. Therefore, the standard SAE is replaced with a Stacked Feature Enhanced Auto Encoder with a feature enhancement function that efficiently and effectively obtains non-trivial features with less activation energy from an image. The proposed model consists of four stages. First, pre-processing is performed to remove noise, and the greyscale image is converted to Red, Green, and Blue (RGB) to enhance feature details for discriminative feature extraction. Second, feature extraction is performed using the Discrete Wavelet Transform (DWT) and channelization to extract significant features for classification. Third, classification is performed to assign MRI images to four major classes: Normal, Tumor, Brain Stroke, and Alzheimer’s. Finally, the FESAE model outperforms state-of-the-art machine learning and deep learning methods such as Artificial Neural Network (ANN), SAE, Random Forest (RF), and Logistic Regression (LR), achieving a high accuracy of 98.61% on a dataset of 2000 MRI images. The proposed model has significant potential for assisting radiologists in diagnosing brain diseases more accurately and improving patient outcomes.
Funding: Supported by the National Key R&D Program of China (2020AAA0107901).
Abstract: In speech synthesis, latent information is difficult to obtain from text alone. Studies show that features extracted from speech carry additional information that can help text encoding. In the field of speech encoding, a large body of work has addressed two approaches: encoding speech frame by frame, and encoding the whole utterance into a single vector. In both approaches, however, the scale is fixed, so encoding speech at an adjustable scale to capture more latent information is worth investigating. Current alignment approaches support only frame-by-frame encoding and speech-to-vector encoding, and it remains a challenge to devise a new alignment approach that supports adjustable-scale speech encoding. This paper presents a dynamic speech encoder with a new alignment approach that works in conjunction with frame-by-frame encoding and speech-to-vector encoding. The speech feature from our model serves three functions. First, the speech feature can reconstruct the original speech while its length equals the text length. Second, our model can obtain a text embedding from speech, and the encoded speech feature is similar to the text embedding result. Finally, it can transfer the style of synthesized speech and make it more similar to a given reference speech.
Funding: Supported by a National Research Foundation of Korea (NRF) grant funded by the Korea government (MEST), Nos. 2015R1A3A2031159 and 2016R1A5A1008055.
Abstract: Existing web-based security applications have failed in many situations because of the sophistication of attackers. Among attacks on web applications, Cross-Site Scripting (XSS) is one of the most dangerous, as it allows an organization's or user's information to be modified. To address these security challenges, this article proposes a novel, comprehensive combination of machine learning (NB, SVM, k-NN) and deep learning (RNN, CNN, LSTM) frameworks for detecting and defending against XSS attacks with high accuracy and efficiency. Based on this representation, a novel idea for merging a stacking ensemble with web applications, termed “hybrid stacking”, is proposed. To implement these methods, four distinct datasets are considered, each containing both safe and unsafe content. The hybrid detection method can adaptively identify attacks from the URL, and the defense mechanism inherits the advantages of URL encoding with dictionary-based mapping to improve prediction accuracy, accelerate the training process, and effectively remove unsafe JScript/JavaScript keywords from the URL. The simulation results show that the proposed hybrid model is more efficient than existing detection methods. It achieves more than 99.5% on XSS attack classification metrics (accuracy, precision, recall, f1_score, and Receiver Operating Characteristic (ROC)) and is highly resistant to XSS attacks. To ensure the security of the server's information, the proposed hybrid approach is demonstrated in a real-time environment.
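Abstract: First, the strong contextual understanding of the bidirectional encoder representations from transformers (BERT) model is used to extract deep semantic features from data-law texts. A fine-grained feature extraction layer is then introduced which, guided by an attention mechanism, focuses on the key parts of the text that are relevant to data-law question answering. Finally, the model is trained and evaluated on the collected legal question-answering dataset. The results show that, compared with several traditional single models, the proposed model improves on key performance metrics including accuracy, precision, recall, and F1 score, indicating that the system can understand and respond to complex data-law questions more effectively and can provide higher-quality question-answering services to data-law professionals and the general public.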
Funding: National Natural Science Foundation of China (U2133208, U20A20161); National Natural Science Foundation of China (No. 62273244); Sichuan Science and Technology Program (No. 2022YFG0180).
Abstract: To enhance the accuracy of Air Traffic Control (ATC) cybersecurity attack detection, this paper designs a new clustering detection method for ATC network security attacks. The feature set for ATC cybersecurity attacks is constructed by setting the feature states, adding recursive features, and determining the feature criticality. The expected information and entropy of the feature data are computed to determine the information gain of the feature data and to reduce the interference of similar feature data. An autoencoder is introduced into the artificial intelligence (AI) algorithm to encode and decode the characteristics of ATC network security attack behavior, reducing the dimensionality of the attack behavior data. Based on this processing, an unsupervised learning algorithm for clustering detection of ATC network security attacks is designed. First, the distance between clusters of ATC network security attack behavior characteristics is determined, the clustering threshold is calculated, and the initial cluster centers are constructed. Then, the mean of all feature objects in each cluster is recalculated as the new cluster center. Next, all objects in each cluster of ATC network security attack behavior feature data are traversed. Finally, the cluster detection of ATC network security attack behavior is completed by computing the objective function. The experiment used three groups of attack behavior data sets as test objects, took the detection rate, false detection rate, and recall rate as test indicators, and selected three similar methods for comparison. The experimental results show that the detection rate of this method is about 98%, the false positive rate is below 1%, and the recall rate is above 97%. The results indicate that this method can improve the detection performance for security attacks in air traffic control networks.
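The feature-selection step described above relies on entropy and information gain. The sketch below shows the textbook Shannon-entropy form of information gain for a discrete feature; it is an illustration under that assumption, not the paper's exact formulation, and the toy feature and label values are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """Entropy of the labels minus the expected entropy after splitting on the feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    expected = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - expected


# Hypothetical toy data: one feature separates the classes, the other does not.
labels = ["attack", "attack", "normal", "normal"]
informative = ["syn", "syn", "ack", "ack"]
uninformative = ["a", "b", "a", "b"]
print(information_gain(informative, labels))    # gain equals the full label entropy
print(information_gain(uninformative, labels))  # no gain from this feature
```

Features with near-zero gain add little to the separation of attack and normal behavior, which is one common reason to drop them before clustering.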
Funding: Supported by a National Research Foundation of Korea (NRF) grant funded by the Korean government (MSIT) (NRF-2023R1A2C1005950).
Abstract: The challenging task of handwriting style synthesis requires capturing the individuality and diversity of human handwriting. Most currently available methods use either a generative adversarial network (GAN) or a recurrent neural network (RNN) to generate new handwriting styles. As a result, these techniques frequently fall short of producing diverse and realistic text images, particularly for words that are not commonly used. To resolve this, this research proposes a novel deep learning model consisting of a style encoder and a text generator to synthesize different handwriting styles. The network excels at generating conditional text by extracting style vectors from a series of style images. The model performs well on a range of handwriting synthesis tasks, including the production of out-of-vocabulary text. It outperforms previous approaches, showing lower values on key GAN evaluation metrics, such as Geometric Score (GS) (3.21×10^(-5)) and Fréchet Inception Distance (FID) (8.75), as well as on text recognition metrics such as Character Error Rate (CER) and Word Error Rate (WER). A thorough component analysis revealed a steady improvement in image generation quality, highlighting the importance of specific handwriting styles. Applicable fields include digital forensics, creative writing, and document security.
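CER and WER, mentioned above, are both edit-distance metrics, so lower values are better. A minimal sketch using the usual Levenshtein-based definitions follows; the abstract does not specify the exact evaluation protocol, so the normalization by reference length is an assumption, and the function names are illustrative.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists of tokens)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(reference: str, hypothesis: str) -> float:
    """Character Error Rate: character edits divided by reference length."""
    if not reference:
        raise ValueError("reference must be non-empty")
    return edit_distance(reference, hypothesis) / len(reference)


def wer(reference: str, hypothesis: str) -> float:
    """Word Error Rate: word edits divided by number of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


print(cer("hello", "hallo"))              # one substitution over five characters
print(wer("the cat sat", "the cat sit"))  # one wrong word out of three
```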
Funding: Funded by the Scientific Research Deanship at the University of Ha’il, Saudi Arabia, through Project Number RG-23092.
Abstract: Cyberbullying, a critical concern for digital safety, necessitates effective linguistic analysis tools that can navigate the complexities of language use in online spaces. To tackle this challenge, our study introduces a new approach employing the Bidirectional Encoder Representations from Transformers (BERT) base model (cased), originally pretrained on English. This model is adapted to recognize the intricate nuances of Arabic online communication, a key aspect often overlooked in conventional cyberbullying detection methods. Our model is an end-to-end solution that has been fine-tuned on a diverse dataset of Arabic social media (SM) tweets. Experimental results on a diverse Arabic dataset collected from the ‘X platform’ demonstrate a notable increase in detection accuracy and sensitivity compared to existing methods. E-BERT shows a substantial improvement in performance, evidenced by an accuracy of 98.45%, precision of 99.17%, recall of 99.10%, and an F1 score of 99.14%. The proposed E-BERT not only addresses a critical gap in cyberbullying detection in Arabic online forums but also sets a precedent for applying cross-lingual pretrained models in regional language applications, offering a scalable and effective framework for enhancing online safety across Arabic-speaking communities.
Abstract: The rapid expansion of online content and big data has created an urgent need for efficient summarization techniques that allow vast textual documents to be understood quickly without compromising their original integrity. Current approaches in Extractive Text Summarization (ETS) model inter-sentence relationships, a task of paramount importance in producing coherent summaries. This study introduces a model that integrates Graph Attention Networks (GATs) with Bidirectional Encoder Representations from Transformers (BERT) and Latent Dirichlet Allocation (LDA), further enhanced by Term Frequency-Inverse Document Frequency (TF-IDF) values, to improve sentence selection by capturing comprehensive topical information. Our approach constructs a graph with nodes representing sentences, words, and topics, thereby increasing interconnectivity and enabling a more refined understanding of text structures. The model is extended from Single-Document Summarization to Multi-Document Summarization (MDS), offering significant improvements over existing models such as THGS-GMM and Topic-GraphSum, as demonstrated by empirical evaluations on benchmark news datasets such as Cable News Network (CNN)/Daily Mail (DM) and Multi-News. The results consistently demonstrate superior performance, showing the model’s robustness in handling complex summarization tasks in both single- and multi-document contexts. This research not only advances the integration of BERT and LDA within a GAT but also emphasizes the model’s capacity to manage global information effectively and adapt to diverse summarization challenges.
Funding: This research was funded by the National Natural Science Foundation of China under Grant No. 61806171; Sichuan University of Science & Engineering Talent Project under Grant No. 2021RC15; Open Fund Project of Key Laboratory for Non-Destructive Testing and Engineering Computer of Sichuan Province Universities on Bridge Inspection and Engineering under Grant No. 2022QYJ06; Sichuan University of Science & Engineering Graduate Student Innovation Fund under Grant No. Y2023115; the Scientific Research and Innovation Team Program of Sichuan University of Science and Technology under Grant No. SUSE652A006.
Abstract: While encryption technology safeguards the security of network communications, malicious traffic also uses encryption protocols to obscure its malicious behavior. To address the reliance of traditional machine learning methods on expert experience and the insufficient representation capabilities of existing deep learning methods for encrypted malicious traffic, we propose an encrypted malicious traffic classification method that integrates global semantic features with local spatiotemporal features, called the BERT-based Spatio-Temporal Features Network (BSTFNet). At the packet-level granularity, the model captures the global semantic features of packets through the attention mechanism of the Bidirectional Encoder Representations from Transformers (BERT) model. At the byte-level granularity, we first employ the Bidirectional Gated Recurrent Unit (BiGRU) model to extract temporal features from bytes, and then use the Text Convolutional Neural Network (TextCNN) model with multi-sized convolution kernels to extract local multi-receptive-field spatial features. The fusion of features from both granularities serves as the final multidimensional representation of malicious traffic. Our approach achieves an accuracy of 99.39% and an F1-score of 99.40% on the publicly available USTC-TFC2016 dataset, and effectively reduces sample confusion within the Neris and Virut categories. The experimental results demonstrate that our method has outstanding representation and classification capabilities for encrypted malicious traffic.
Funding: Department of Education of Hunan Province, China (No. 21A0541); U.S. Department of Energy (No. DE-FG03-93ER40773). H.Z. acknowledges financial support from the Key Laboratory of Quark and Lepton Physics in Central China Normal University (No. QLPL2024P01).
Abstract: This study proposes a novel particle encoding mechanism that seamlessly incorporates the quantum properties of particles, with a specific emphasis on constituent quarks. The primary objective of this mechanism is to facilitate the digital registration and identification of a wide range of particle information. Its design ensures easy integration with different event generators and digital simulations commonly used in high-energy experiments. Moreover, the framework can be easily expanded to encode complex multi-quark states comprising up to nine valence quarks and accommodating an angular momentum of up to 99/2. This versatility and scalability make it a valuable tool.
Abstract: Sentence classification is the process of categorizing a sentence based on its context. Sentence categorization requires more semantic features than other tasks, such as dependency parsing, which requires more syntactic elements. Most existing strategies focus on the general semantics of a conversation without involving the context of the sentence, recognizing the progress, or comparing impacts. Here, an ensemble of pre-trained language models is used to classify the sentences of a conversation corpus. The conversational sentences are classified into four categories: information, question, directive, and commission. The resulting label sequences are used to analyze the progress of the conversation and to predict its pecking order. An ensemble of Bidirectional Encoder Representations from Transformers (BERT), Robustly Optimized BERT Pretraining Approach (RoBERTa), Generative Pre-Trained Transformer (GPT), DistilBERT, and Generalized Autoregressive Pretraining for Language Understanding (XLNet) models is trained on the conversation corpus, and hyperparameter tuning is carried out to improve sentence classification performance. This Ensemble of Pre-trained Language Models with Hyperparameter Tuning (EPLM-HT) system is trained on an annotated conversation dataset. The proposed approach outperformed the base BERT, GPT, DistilBERT, and XLNet transformer models. The proposed ensemble model with the fine-tuned parameters achieved an F1_score of 0.88.
Funding: Supported by the National Natural Science Foundation of China (62072158, U2004163); the Key Research and Development Special Projects of Henan Province (231111221500); Science and Technology Project of Henan Province (232102210158, 242102210197).
Abstract: Distracted driving remains a primary factor in traffic accidents and poses a significant obstacle to advancing driver assistance technologies. Improving the accuracy of distracted driving detection can greatly reduce the occurrence of traffic accidents and thereby improve driver safety. However, detecting distracted driving behaviors remains challenging in real-world scenarios with complex backgrounds, varying target scales, and different resolutions. To address the low detection accuracy of existing vehicle distraction detection algorithms while considering practical application scenarios, this paper proposes an improved vehicle distraction detection algorithm based on YOLOv5. The algorithm integrates Attention-based Intra-scale Feature Interaction (AIFI) into the backbone network, enabling it to enhance feature interactions within the same scale through the attention mechanism. By emphasizing important features, this approach improves detection accuracy and performance in complex backgrounds. Additionally, a Triple Feature Encoding (TFE) module is added to the neck network. This module encodes and fuses multi-scale features to create a more detailed and comprehensive feature representation, which enhances object detection and localization and gives the algorithm a fuller understanding of the image. Finally, the shape-IoU (Intersection over Union) loss function is adopted in place of the original IoU loss for more precise bounding box regression. Comparative evaluation of the improved YOLOv5 distraction detection algorithm against the original YOLOv5 algorithm shows an average accuracy improvement of 1.8%, indicating clear advantages for distracted driving detection.
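The bounding-box loss mentioned above builds on Intersection over Union. The sketch below computes plain IoU for two axis-aligned boxes given as (x1, y1, x2, y2); it deliberately does not implement the shape-IoU variant named in the abstract, whose additional terms are not described here, and the box format is an assumption.

```python
def iou(box_a, box_b):
    """Plain IoU of two axis-aligned boxes in (x1, y1, x2, y2) format."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap region, clamped at zero when boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


predicted = (0, 0, 2, 2)
ground_truth = (1, 1, 3, 3)
print(iou(predicted, ground_truth))  # overlap area 1 over union area 7
```

A regression loss is commonly formed as 1 - IoU, so that perfectly overlapping boxes give zero loss.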
Funding: Financial support from the National Key R&D Program of China (2021YFA1401103) and the National Natural Science Foundation of China (61925502 and 51772145).
Abstract: Images and videos provide a wealth of information for people in production and daily life. Although most digital information is transmitted via optical fiber, image acquisition and transmission still rely heavily on electronic circuits. The development of all-optical transmission networks and optical computing frameworks points the way for the next generation of data transmission and information processing. Here, we propose a high-speed, low-cost, multiplexed, parallel, one-piece all-fiber architecture for image acquisition, encoding, and transmission, called the Multicore Fiber Acquisition and Transmission Image System (MFAT). Based on the different spatial and modal channels of the multicore fiber, fiber-coupled self-encoding, and digital aperture decoding technology, scenes can be observed directly from up to 1 km away. The expanded capacity makes parallel coded transmission of multimodal high-quality data possible. MFAT requires no additional signal transmitting or receiving equipment. The all-fiber processing saves the time traditionally spent on signal conversion and image pre-processing (compression, encoding, and modulation). Additionally, it provides an effective solution for 2D information acquisition and transmission tasks in extreme environments such as high temperatures and electromagnetic interference.