Because existing deep learning-based methods lack long-range association and spatial location information, they cannot always recover fine details and accurate boundaries in complex clothing images. This paper presents a convolutional structure with multi-scale fusion to optimize clothing feature extraction, together with a self-attention module to capture long-range association information. Through the down-scaling projection operation of the multi-scale framework, the structure lets the self-attention mechanism participate directly in the exchange of information. In addition, the improved self-attention module extracts 2-dimensional relative position information to compensate for its limited ability to extract spatial position features from clothing images. Experimental results on the colorful fashion parsing dataset (CFPD) show that the proposed network structure achieves 53.68% mean intersection over union (mIoU) and has better performance on the clothing parsing task.
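The mIoU figure reported above can be made concrete with a small sketch. This is a generic illustration of mean intersection over union, not the evaluation code of the paper: the function name `mean_iou`, the flattened label lists, and the choice to skip classes absent from both prediction and ground truth are all assumptions made here.

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection over union over the classes that occur in pred or target.

    pred and target are flat sequences of integer class labels, one per pixel
    (an assumed layout for this sketch).
    """
    ious = []
    for c in range(num_classes):
        # Pixels labelled c in both prediction and ground truth.
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        # Pixels labelled c in either prediction or ground truth.
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:  # skip classes absent from both (assumed convention)
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
print(mean_iou(pred, target, num_classes=3))
```

In this toy case the per-class IoU values are 1/3, 2/3, and 1/2. Benchmark protocols differ in how they treat absent classes and background, so reported numbers depend on the dataset's own convention.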
Fault diagnosis is important for maintaining the safety and effectiveness of chemical processes. Considering the multivariate, nonlinear, and dynamic characteristics of chemical processes, many time-series-based, data-driven fault diagnosis methods have been developed in recent years. However, existing methods suffer from the long-term dependency problem and are difficult to train because they are trained sequentially. To overcome these problems, a novel fault diagnosis method based on time series and a hierarchical multihead self-attention network (HMSAN) is proposed for chemical processes. First, a sliding window strategy is adopted to construct the normalized time-series dataset. Second, the HMSAN is developed to extract time-relevant features from the time-series process data. It improves the basic self-attention model in both width and depth. With the multihead structure, the HMSAN can attend to different aspects of the complicated chemical process and obtain global dynamic features. However, multiple parallel heads produce redundant information, which does not improve diagnosis performance. The hierarchical structure reduces this redundant information and further extracts deep, local, time-related features. In addition, a novel many-to-one training strategy is introduced for the HMSAN to simplify the training procedure and capture long-term dependencies. Finally, the effectiveness of the proposed method is demonstrated on two chemical cases. The experimental results show that the proposed method performs well on time-series industrial data and outperforms state-of-the-art approaches.
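The sliding window step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: z-score normalization per variable, the helper names `zscore_columns` and `sliding_windows`, and the use of the label at the last time step of each window as the many-to-one target are all choices made for this example.

```python
from statistics import mean, pstdev


def zscore_columns(rows):
    """Normalize each process variable (column) to zero mean and unit variance."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    # Constant columns have zero spread; divide by 1.0 instead (assumed handling).
    sigmas = [pstdev(c) or 1.0 for c in cols]
    return [[(x - m) / s for x, m, s in zip(row, mus, sigmas)] for row in rows]


def sliding_windows(rows, labels, width, stride=1):
    """Cut a multivariate series into windows of `width` consecutive time steps.

    Each window is paired with the label of its final time step, which is one
    way to read "many-to-one" (an assumption of this sketch).
    """
    samples = []
    for start in range(0, len(rows) - width + 1, stride):
        window = rows[start:start + width]
        samples.append((window, labels[start + width - 1]))
    return samples


rows = [[1.0, 10.0], [2.0, 10.0], [3.0, 10.0], [4.0, 10.0], [5.0, 10.0]]
labels = [0, 0, 0, 1, 1]
dataset = sliding_windows(zscore_columns(rows), labels, width=3)
print(len(dataset), [label for _, label in dataset])
```

In practice the normalization statistics would be computed on the training portion only and reused for the test data.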
Keyphrases provide summarized and valuable information that helps us not only understand text semantics but also organize and retrieve text content effectively. The task of generating them automatically has received considerable attention in recent decades, and previous studies offer many workable solutions. One method divides the content to be summarized into multiple blocks of text and then ranks and selects the most important content. Its disadvantage is that it cannot identify keyphrases that do not appear in the text, let alone capture the semantic meaning hidden in the text. Another approach uses recurrent neural networks to generate keyphrases from the semantic aspects of the text, but their inherently sequential nature precludes parallelization within training examples, and long distances limit the context dependencies they can capture. Previous work has demonstrated the benefits of the self-attention mechanism, which can learn global text dependency features and can be parallelized. Inspired by these observations, we propose a keyphrase generation model based entirely on the self-attention mechanism. It is an encoder-decoder model that effectively addresses the above disadvantages. In addition, we consider the semantic similarity between keyphrases and add a semantic similarity processing module to the model. Empirical analysis on five datasets shows that the proposed model achieves competitive performance compared with baseline methods.
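The claim that self-attention captures global dependencies and parallelizes can be illustrated with a bare scaled dot-product self-attention. This sketch is not the model from the abstract: it omits the learned query, key, and value projections, multiple heads, and the encoder-decoder structure, and the function names are hypothetical. Each output row depends only on the input, not on other output rows, which is why the rows can be computed in parallel.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x):
    """Scaled dot-product self-attention with queries = keys = values = x.

    x is a list of token vectors. Learned projections are left out on purpose
    (a simplification of this sketch).
    """
    d = len(x[0])
    outputs, weights = [], []
    for q in x:  # every token attends to every token, regardless of distance
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in x]
        w = softmax(scores)
        weights.append(w)
        outputs.append([sum(wj * v[i] for wj, v in zip(w, x)) for i in range(d)])
    return outputs, weights


tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
out, attn = self_attention(tokens)
print(attn[0])
```

Each row of `attn` sums to 1, and the first token gives more weight to itself than to the orthogonal second token.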
Due to the rapid evolution of Advanced Persistent Threat (APT) attacks, the emergence of new and rare attack samples, including ones never seen before, makes it challenging for traditional rule-based detection methods to extract universal rules for effective detection. With progress in techniques such as transfer learning and meta-learning, few-shot network attack detection has advanced. However, challenges remain: time-sequence flow features cannot adapt to the fixed-length input requirement of deep learning, rich information is difficult to capture from the original flow when samples are insufficient, and high-level abstract representation is hard to achieve. To address these challenges, a few-shot network attack detection method based on NFHP (Network Flow Holographic Picture)-RN (ResNet) is proposed. Specifically, leveraging inherent properties of images such as translation invariance, rotation invariance, scale invariance, and illumination invariance, network attack traffic features and contextual relationships are intuitively represented in the NFHP. In addition, an improved RN network model is employed for high-level abstract feature extraction, ensuring that the extracted features retain the detailed characteristics of the original traffic behavior regardless of changes in background traffic. Finally, a meta-learning model based on the self-attention mechanism is constructed, which detects novel APT few-shot network attacks through the empirical generalization of high-level abstract feature representations of known-class network attack behaviors. Experimental results demonstrate that the proposed method can learn high-level abstract features of network attacks across different traffic detail granularities. Compared with state-of-the-art methods, it achieves favorable accuracy, precision, recall, and F1 scores for the identification of unknown-class network attacks in cross-validation on multiple datasets.
With the rapid development of industrial Internet technology, advanced industrial control systems (ICS) have improved industrial production efficiency. At the same time, cyber-attacks targeting industrial control systems are increasingly common. To ensure the security of industrial networks, intrusion detection systems have been widely used in industrial control systems, and deep neural networks have been an effective method for identifying cyber-attacks. Current intrusion detection methods still suffer from low accuracy and a high false alarm rate, so it is important to build a more efficient intrusion detection model. This paper proposes a hybrid deep learning intrusion detection method based on convolutional neural networks and bidirectional long short-term memory neural networks (CNN-BiLSTM). To address class imbalance in the dataset and improve the model's detection capability, the Synthetic Minority Over-sampling Technique-Edited Nearest Neighbors (SMOTE-ENN) algorithm is applied in the preprocessing phase. The algorithm generates synthetic instances for the minority class while mitigating the impact of noise in the majority class. This yields a more balanced class distribution and thereby improves the model's ability to identify patterns in both minority and majority classes. In the experimental phase, the detection performance of the method is verified on two datasets. The accuracy on the CICIDS-2017 dataset reaches 97.7%. On the natural gas pipeline dataset collected by Lan Turnipseed from Mississippi State University in the United States, the accuracy reaches 85.5%.
Due to its robust learning and expression ability for complex features, the deep learning (DL) model plays a vital role in bearing fault diagnosis. However, because fault diagnosis has fewer labeled samples, DL models in fault diagnosis are generally shallower than DL models in other fields, which limits diagnostic performance. To solve this problem, a novel transfer residual Swin Transformer (RST) is proposed for rolling bearings in this paper. RST has 24 residual self-attention layers, which use a hierarchical design and shifted window-based residual self-attention. Combined with transfer learning techniques, the transfer RST model uses pre-trained parameters from ImageNet. A new end-to-end method for fault diagnosis based on the deep transfer RST is proposed. First, the wavelet transform converts the vibration signal into a wavelet time-frequency diagram, so that the signal's time-domain and frequency-domain information is represented simultaneously. Second, the wavelet time-frequency diagram is fed to the RST model to obtain the fault type. Finally, the method is verified on public and self-built datasets. Experimental results show the superior performance of the method in comparison with a shallow neural network.
Emotional electroencephalography (EEG) signals are a primary means of recording emotional brain activity. Currently, the most effective methods for analyzing emotional EEG signals involve feature engineering and neural networks. However, neural networks possess a strong ability for automatic feature extraction. Is it possible to discard feature engineering and directly employ neural networks for end-to-end recognition? Based on the characteristics of EEG signals, this paper proposes an end-to-end feature extraction and classification method for a dynamic self-attention network (DySAT). The study reveals significant differences in brain activity patterns associated with different emotions across various experimenters and time periods. The results of this experiment can provide insights into the reasons behind these differences.
Hazy weather degrades the visual quality of images and makes advanced vision tasks difficult to carry out. Dehazing is therefore an important step before such tasks are executed. Traditional dehazing algorithms improve image brightness and contrast or construct artificial priors such as color attenuation priors and dark channel priors. However, their effect is unstable in complex scenes. Among methods based on convolutional neural networks, dehazing networks with an encoder-decoder structure do not consider the difference between the image before and after dehazing, and spatial information is lost in the encoding stage. To overcome these problems, this paper proposes a novel end-to-end two-stream convolutional neural network for single-image dehazing. The network model is composed of a spatial information feature stream and a high-level semantic feature stream. The spatial information feature stream retains the detailed information of the dehazing image, and the high-level semantic feature stream extracts the multi-scale structural features of the dehazing image. A spatial information auxiliary module is designed and placed between the feature streams. This module uses the attention mechanism to construct a unified expression of different types of information and gradually restores the clear image, with semantic information assisting spatial information in the dehazing network. A parallel residual twicing module is proposed, which performs dehazing on the difference information of features at different stages to improve the model's ability to discriminate haze images. The peak signal-to-noise ratio (PSNR) and structural similarity are used to quantitatively evaluate the similarity between the dehazing results of each algorithm and the original image. The structural similarity and PSNR of the proposed method reach 0.852 and 17.557 dB on the HazeRD dataset, higher than those of the compared algorithms. On the SOTS dataset, the indicators are 0.955 and 27.348 dB, which are sub-optimal results. In experiments with real haze images, the method also achieves excellent visual restoration. The experimental results show that the proposed model can restore fog-free images with the desired visual effect and generalizes well to real haze scenes.
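For readers unfamiliar with the PSNR metric quoted above, the standard definition is 10·log10(MAX² / MSE), in decibels. The sketch below is a generic illustration, not the evaluation script of the paper: the flat pixel lists, the function name `psnr`, and the default peak value of 255 for 8-bit images are assumptions.

```python
import math


def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images.

    Images are given as flat sequences of pixel intensities (assumed layout);
    max_value is the largest possible intensity, 255 for 8-bit data.
    """
    if len(reference) != len(restored) or not reference:
        raise ValueError("images must be non-empty and of equal size")
    mse = sum((r - x) ** 2 for r, x in zip(reference, restored)) / len(reference)
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)


clean = [52, 55, 61, 59]
dehazed = [54, 55, 60, 57]
print(round(psnr(clean, dehazed), 2))
```

Higher values mean the restored image is closer to the reference. Structural similarity, the other metric in the abstract, compares local luminance, contrast, and structure and is not covered by this sketch.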
Circular RNAs (circRNAs) are RNAs with a closed circular structure that are involved in many biological processes through key interactions with RNA binding proteins (RBPs). Existing methods for predicting these interactions have limitations in feature learning. In view of this, we propose a method named circ2CBA, which uses only the sequence information of circRNAs to predict circRNA-RBP binding sites. We have constructed a dataset that includes eight sub-datasets. First, circ2CBA encodes circRNA sequences using the one-hot method. Next, a two-layer convolutional neural network (CNN) is used to initially extract features. After the CNN, circ2CBA uses a bidirectional long short-term memory (BiLSTM) layer and the self-attention mechanism to learn the features. The AUC value of circ2CBA reaches 0.8987. A comparison of circ2CBA with three other methods on our dataset and an ablation experiment confirm that circ2CBA is an effective method for predicting the binding sites between circRNAs and RBPs.
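The one-hot encoding step mentioned above can be sketched in a few lines. This is an illustrative version under assumptions not stated in the abstract: the nucleotide alphabet `ACGU`, the column order, and the all-zero vector for unknown symbols such as `N` are choices made here, and the name `one_hot` is hypothetical.

```python
ALPHABET = "ACGU"  # assumed RNA alphabet and column order


def one_hot(seq):
    """Encode a nucleotide sequence as a list of 4-dimensional indicator vectors.

    Symbols outside ALPHABET (for example 'N') map to an all-zero vector,
    which is an assumed convention for this sketch.
    """
    index = {base: i for i, base in enumerate(ALPHABET)}
    encoded = []
    for base in seq.upper():
        row = [0] * len(ALPHABET)
        if base in index:
            row[index[base]] = 1
        encoded.append(row)
    return encoded


print(one_hot("ACGU"))
```

The resulting length-by-4 matrix is the kind of input a 1-D convolutional layer can consume directly. Datasets stored as cDNA would use `T` in place of `U`.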
LIDAR point cloud-based 3D object detection aims to sense the surrounding environment by anchoring objects with bounding boxes (BBoxes). However, in the three-dimensional space of autonomous driving scenes, previous object detection methods pre-process the original LIDAR point cloud into voxels or pillars and therefore lose the coordinate information of the original point cloud, detect slowly, and position bounding boxes inaccurately. To address these issues, this study proposes a new two-stage network structure that extracts point cloud features directly with PointNet++, which effectively preserves the original point cloud coordinate information. To improve detection accuracy, a shell-based modeling method is proposed. It first roughly determines which spherical shell the coordinates belong to and then refines the results toward the ground truth, thereby narrowing the localization range. To improve the recall of 3D object detection with bounding boxes, this paper designs a self-attention module with a skip connection structure. The module highlights some features by weighting them along the feature dimensions. After training, the weights of features that favor object detection become larger, so the extracted features are better adapted to the object detection task. Extensive comparison and ablation experiments on the KITTI dataset verify the effectiveness of the proposed method in improving recall and precision.
Production data in the industrial field are characterized by multimodality, high dimensionality, and large differences in correlation between attributes. Existing data prediction methods cannot effectively capture time-series and modal features, which leads to prediction hysteresis and poor prediction stability. To address these problems, this paper proposes a time-series and modal feature enhancement method based on a dual-stage self-attention mechanism (DATT), and a time-series prediction method based on a gated feedforward recurrent unit (GFRU). On this basis, the DATT-GFRU neural network, which combines a gated feedforward recurrent neural network with a dual-stage self-attention mechanism, is designed and implemented. Experiments show that the prediction effect of the neural network prediction model based on DATT is significantly improved. Compared with the traditional prediction model, the DATT-GFRU neural network has a smaller average prediction error, stable prediction performance, and strong generalization ability on three datasets with different numbers of attributes and different training sample sizes.
Pedestrian attribute recognition is often considered a multi-label image classification task. To make full use of attribute-related location information, a saliency-guided self-attention network (SGSA-Net) was proposed to weakly supervise attribute localization without annotations of attribute-related regions. Saliency priors were integrated into the spatial attention module (SAM). Meanwhile, channel-wise attention and spatial attention were introduced into the network. Moreover, a weighted binary cross-entropy loss (WCEL) function was employed to handle the imbalance of training data. Extensive experiments on the richly annotated pedestrian (RAP) and pedestrian attribute (PETA) datasets demonstrated that SGSA-Net outperformed other state-of-the-art methods.
Purpose: Clothing patterns play a dominant role in costume design and have become an important link in the perception of costume art. Conventional clothing pattern design relies on experienced designers. Although conventional design yields very high-quality patterns, its ratio of output to input time is relatively low. To break through this bottleneck, this paper proposes a novel approach based on a generative adversarial network (GAN) model for automatic clothing pattern generation, which not only reduces the dependence on experienced designers but also improves the input-output ratio. Design/methodology/approach: Because clothing patterns place high demands on global artistic perception and local texture details, this paper improves the conventional GAN model in two respects: a multi-scale discriminator strategy is introduced to handle local texture details, and the self-attention mechanism is introduced to improve global artistic perception. The improved GAN is called the multi-scale self-attention improved generative adversarial network (MS-SA-GAN) model and is used for high-resolution clothing pattern generation. Findings: To verify the feasibility and effectiveness of the proposed MS-SA-GAN model, a crawler is designed to acquire a standard clothing pattern dataset from Baidu pictures, and a comparative experiment is conducted on this dataset. In the experiments, different parameters of the MS-SA-GAN model are adjusted, and the global artistic perception and local texture details of the generated clothing patterns are compared. Originality/value: Experimental results show that the clothing patterns generated by the proposed MS-SA-GAN model are superior to those of conventional algorithms on some local texture detail indexes. In addition, a group of clothing design professionals was invited to evaluate global artistic perception using a valence-arousal scale. The scale results show that the proposed MS-SA-GAN model achieves better global artistic perception.
With rapid economic development, per capita automobile ownership in our country has risen year by year, and more researchers have turned to scientific methods to solve traffic flow problems. Traffic flow prediction is affected not only by the number of vehicles but also by various complex factors, such as time, road conditions, and the flow of people. However, existing methods ignore the complexity of road conditions and the correlation between individual nodes, which leads to poor performance. In this study, a deep learning model, SAMGCN, is proposed to effectively capture the correlation between individual nodes and improve the performance of traffic flow prediction. First, the theory of spatiotemporal decoupling is used to divide each time step of each node into finer particles. Second, multimodule fusion is used to mine the potential periodic relationships in the data. Finally, a GRU is used to obtain the potential temporal relationship of the three modules. Extensive experiments were conducted on two traffic flow datasets, PeMS04 and PeMS08, from the Caltrans Performance Measurement System to demonstrate the validity of the proposed model.
Methanol-to-olefins, a promising non-oil pathway for the synthesis of light olefins, has been successfully industrialized. Accurate prediction of process variables can yield significant benefits for advanced process control and optimization. The task is challenging because traditional methods fail to capture the complex characteristics of industrial processes, such as high nonlinearity, dynamics, and data distribution shift caused by diverse operating conditions. In this paper, we propose a novel hybrid spatial-temporal deep learning prediction model to address these issues. First, a data normalization technique called reversible instance normalization is employed to handle differing data distributions. Subsequently, a convolutional neural network integrated with the self-attention mechanism is used to extract the temporal patterns. Meanwhile, a multi-graph convolutional network is used to model the spatial interactions. Afterward, the extracted temporal and spatial features are fused and fed into a fully connected neural network to complete the prediction. Finally, the outputs are denormalized to obtain the final results. Monitoring results for the dynamic trends of process variables in an actual industrial methanol-to-olefins process demonstrate that the model not only achieves superior prediction performance but also reveals complex spatial-temporal relationships through the learned attention matrices and adjacency matrices, making the model more interpretable. Lastly, the model is deployed on an end-to-end Industrial Internet Platform, where it achieves effective practical results.
Shield tunneling machines are paramount underground engineering equipment and play a key role in tunnel construction. During shield construction, the “mud cake” formed by difficult-to-remove clay attached to the cutterhead severely reduces construction efficiency and harms the healthy operation of the shield tunneling machine. In this study, we propose an enhanced transformer-based model for detecting the cutterhead clogging status of shield tunneling machines. First, the working state data of shield machines are selected from historical excavation data, and a long short-term memory-autoencoder neural network module is constructed to remove outliers. Next, variational mode decomposition and wavelet transform are employed to denoise the data. After the preprocessing, nonoverlapping rectangular windows are used to intercept the working state data to obtain the time slices used for analysis, and several time-domain features of these periods are extracted. Owing to the data imbalance in the original dataset, the k-means-synthetic minority oversampling technique algorithm is adopted to oversample the extracted time-domain features of the clogging data in the training set, which balances the dataset and improves model performance. Finally, an enhanced transformer-based neural network is constructed to extract essential implicit features and detect the cutterhead clogging status. Data collected from actual tunnel construction projects are used to verify the proposed model. The results show that the proposed model accurately detects the cutterhead clogging status of shield machines, with 98.85% accuracy and a 0.9786 F1 score. Moreover, the proposed model significantly outperforms the comparison models.
A deep neural network model generally consists of different modules that play essential roles in performing a task. The optimal design of a module for modeling a physical problem is directly related to the success of the model. In this work, the effectiveness of several special modules is studied numerically: the self-attention mechanism, for recognizing the importance of molecular sequence information in a polymer, and the big-stride representation and conditional random field, for enhancing the network's ability to produce desired local configurations. Network models containing these modules are trained on the well-documented data of the native structures of the HP model and assessed according to their ability to make structural predictions on unseen data. The specific network design of the self-attention mechanism adopted here is modified from a similar idea in natural language recognition. The big-stride representation module introduced in this work is shown to drastically improve the network's ability to model polymer segments with strong lattice position correlations.
Accurate long-term power forecasting is important for grid operation decisions and for customers' power consumption management, because it helps ensure a reliable power supply and the economical, reliable operation of the grid. However, most time-series forecasting models do not perform well on long-time-series prediction tasks with a large amount of data. To address this challenge, we propose a parallel time-series prediction model called LDformer. First, we combine Informer with long short-term memory (LSTM) to obtain deep representation abilities in the time series. Then, we propose a parallel encoder module to improve the robustness of the model and combine convolutional layers with an attention mechanism to avoid value redundancy in the attention mechanism. Finally, we propose a probabilistic sparse (ProbSparse) self-attention mechanism combined with UniDrop to reduce the computational overhead and mitigate the risk of losing some key connections in the sequence. Experimental results on five datasets show that LDformer outperforms state-of-the-art methods in most cases across the different long-time-series prediction tasks.
An intelligent single radar image de-raining method based on unsupervised self-attention generative adversarial networks is proposed to improve the accuracy of wave height parameter inversion results. The method builds a trainable end-to-end de-raining model with an unsupervised cycle-consistent adversarial network as the AI framework, which does not require pairs of rain-contaminated and corresponding ground-truth rain-free images for training. The model is trained by feeding rain-contaminated and clean radar images in an unpaired manner, and the atmospheric scattering model parameters are not required as a prior condition. Additionally, a self-attention mechanism is introduced into the model, allowing it to focus on rain clutter when processing radar images. This combines global and local rain clutter context information to output more accurate and clear de-rained radar images. The proposed method is validated on actual field test data, which show that, compared with the wave height derived from the original rain-contaminated data, the de-raining method reduces the root-mean-square error by 0.11 m and increases the correlation coefficient of the wave height by 14%. These results demonstrate that the method effectively reduces the impact of rain on the accuracy of wave height parameter estimation from marine X-band radar images.
Video summarization has established itself as a fundamental technique for generating compact and concise video, which eases the management and browsing of large-scale video data. Existing methods fail to fully consider the local and global relations among the frames of a video, leading to degraded summarization performance. To address this problem, we propose a graph convolutional attention network (GCAN) for video summarization. GCAN consists of two parts, embedding learning and context fusion, where embedding learning includes a temporal branch and a graph branch. In particular, GCAN uses dilated temporal convolution to model local cues and temporal self-attention to exploit global cues for video frames. It learns graph embeddings via a multi-layer graph convolutional network to reveal the intrinsic structure of frame samples. The context fusion part combines the output streams of the temporal branch and graph branch to create a context-aware representation of frames, on which importance scores are evaluated to select representative frames for the video summary. Experiments on two benchmark databases, SumMe and TVSum, show that the proposed GCAN approach achieves superior performance compared with several state-of-the-art alternatives in three evaluation settings.
Abstract: Because they lack long-range association and spatial location information, existing deep learning-based methods cannot always recover fine details and accurate boundaries in complex clothing images. This paper presents a convolutional structure with multi-scale fusion to optimize clothing feature extraction, together with a self-attention module to capture long-range association information. Through the down-scaling projection operation of the multi-scale framework, the structure lets the self-attention mechanism participate directly in information exchange. In addition, the improved self-attention module extracts 2-dimensional relative position information to compensate for its limited ability to extract spatial position features from clothing images. Experimental results on the colorful fashion parsing dataset (CFPD) show that the proposed network achieves 53.68% mean intersection over union (mIoU) and performs better on the clothing parsing task.
Funding: Supported by the National Natural Science Foundation of China (62073140, 62073141) and the Shanghai Rising-Star Program (21QA1401800).
Abstract: Fault diagnosis is important for maintaining the safety and effectiveness of chemical processes. Considering the multivariate, nonlinear, and dynamic characteristics of chemical processes, many time-series-based data-driven fault diagnosis methods have been developed in recent years. However, existing methods struggle with long-term dependencies and are difficult to train because of their sequential training procedure. To overcome these problems, a novel fault diagnosis method based on time series and hierarchical multihead self-attention (HMSAN) is proposed for chemical processes. First, a sliding window strategy is adopted to construct the normalized time-series dataset. Second, the HMSAN is developed to extract time-relevant features from the time-series process data. It improves the basic self-attention model in both width and depth. With the multihead structure, the HMSAN can attend to different aspects of the complicated chemical process and obtain global dynamic features. However, multiple heads in parallel produce redundant information, which does not improve diagnosis performance. With the hierarchical structure, the redundant information is reduced and deep local time-related features are further extracted. In addition, a novel many-to-one training strategy is introduced for HMSAN to simplify the training procedure and capture long-term dependencies. Finally, the effectiveness of the proposed method is demonstrated on two chemical cases. The experimental results show that the proposed method performs well on time-series industrial data and outperforms state-of-the-art approaches.
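The sliding-window step named in this abstract can be illustrated with a minimal sketch. The abstract does not give the normalization scheme, window width, or labeling rule, so the per-variable z-score, the `width` and `step` parameters, and the choice to label each window with the class of its last time step (one plausible reading of "many-to-one") are all assumptions made here for illustration. The function names are hypothetical.

```python
from statistics import mean, pstdev


def zscore_columns(rows):
    """Z-score each process variable (column); assumed normalization scheme."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    # Guard against constant columns, whose standard deviation is zero.
    sigmas = [pstdev(c) or 1.0 for c in cols]
    return [[(x - m) / s for x, m, s in zip(row, mus, sigmas)] for row in rows]


def sliding_windows(rows, labels, width, step=1):
    """Cut overlapping windows; each window takes the label of its last step."""
    samples = []
    for start in range(0, len(rows) - width + 1, step):
        end = start + width
        samples.append((rows[start:end], labels[end - 1]))
    return samples


if __name__ == "__main__":
    data = [[1.0, 10.0], [2.0, 10.0], [3.0, 10.0], [4.0, 10.0], [5.0, 10.0]]
    fault = [0, 0, 1, 1, 1]  # hypothetical per-step fault labels
    for window, label in sliding_windows(zscore_columns(data), fault, width=3):
        print(len(window), label)
```

Each sample is a `width`-by-variables matrix paired with one label, which is the shape a sequence classifier would consume.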
Abstract: Keyphrases provide summarized and valuable information. This information helps us not only understand text semantics but also organize and retrieve text content effectively. The task of generating keyphrases automatically has received considerable attention in recent decades. Previous studies offer many workable solutions for obtaining keyphrases. One method divides the content to be summarized into multiple blocks of text, then ranks them and selects the most important content. The disadvantage of this method is that it cannot identify keyphrases that do not appear in the text, let alone capture the real semantic meaning hidden in the text. Another approach uses recurrent neural networks to generate keyphrases from the semantic aspects of the text, but their inherently sequential nature precludes parallelization within training examples, and distance limits context dependencies. Previous work has demonstrated the benefits of the self-attention mechanism, which can learn global text dependency features and can be parallelized. Inspired by these observations, we propose a keyphrase generation model based entirely on the self-attention mechanism. It is an encoder-decoder model that effectively overcomes the disadvantages above. In addition, we consider the semantic similarity between keyphrases and add a semantic similarity processing module to the model. Empirical analysis on five datasets shows that the proposed model achieves competitive performance compared with baseline methods.
Funding: Supported by the National Natural Science Foundation of China (Nos. U19A208162202320), the Fundamental Research Funds for the Central Universities (No. SCU2023D008), the Science and Engineering Connotation Development Project of Sichuan University (No. 2020SCUNG129), and the Key Laboratory of Data Protection and Intelligent Management (Sichuan University), Ministry of Education.
Abstract: Due to the rapid evolution of Advanced Persistent Threat (APT) attacks, the emergence of new and rare attack samples, including some never seen before, makes it challenging for traditional rule-based detection methods to extract universal rules for effective detection. With advances in techniques such as transfer learning and meta-learning, few-shot network attack detection has progressed. However, challenges remain: time-sequence flow features do not fit the fixed-length input required by deep learning, rich information is hard to capture from the original flow when samples are insufficient, and high-level abstract representation is difficult. To address these challenges, a few-shot network attack detection method based on NFHP (Network Flow Holographic Picture)-RN (ResNet) is proposed. Specifically, leveraging inherent properties of images such as translation invariance, rotation invariance, scale invariance, and illumination invariance, network attack traffic features and contextual relationships are intuitively represented in the NFHP. In addition, an improved RN network model is employed for high-level abstract feature extraction, ensuring that the extracted features maintain the detailed characteristics of the original traffic behavior regardless of changes in background traffic. Finally, a meta-learning model based on the self-attention mechanism is constructed, which detects novel few-shot APT network attacks through the empirical generalization of high-level abstract feature representations of known-class network attack behaviors. Experimental results demonstrate that the proposed method can learn high-level abstract features of network attacks across different traffic detail granularities. Compared with state-of-the-art methods, it achieves favorable accuracy, precision, recall, and F1 scores for the identification of unknown-class network attacks through cross-validation on multiple datasets.
Funding: Supported by the Liaoning Province Nature Fund Project (No. 2022-MS-291) and the Scientific Research Project of Liaoning Province Education Department (LJKMZ20220781, LJKMZ20220783, LJKQZ20222457, JYTMS20231488).
Abstract: With the rapid development of industrial Internet technology, advanced industrial control systems (ICS) have improved industrial production efficiency. However, cyber-attacks targeting industrial control systems are becoming more and more frequent. To ensure the security of industrial networks, intrusion detection systems have been widely used in industrial control systems, and deep neural networks have long been an effective method for identifying cyber-attacks. Current intrusion detection methods still suffer from low accuracy and a high false alarm rate. Therefore, it is important to build a more efficient intrusion detection model. This paper proposes a hybrid deep learning intrusion detection method based on convolutional neural networks and bidirectional long short-term memory neural networks (CNN-BiLSTM). To address the imbalanced data within the dataset and improve the model's detection capabilities, the Synthetic Minority Over-sampling Technique-Edited Nearest Neighbors (SMOTE-ENN) algorithm is applied in the preprocessing phase. This algorithm generates synthetic instances for the minority class while mitigating the impact of noise in the majority class. The approach aims to create a more equitable distribution of classes, thereby enhancing the model's ability to identify patterns in both minority and majority classes. In the experimental phase, the detection performance of the method is verified on two datasets. Experimental results show that the accuracy on the CICIDS-2017 dataset reaches 97.7%. On the natural gas pipeline dataset collected by Lan Turnipseed from Mississippi State University in the United States, the accuracy reaches 85.5%.
Funding: Supported in part by the National Natural Science Foundation of China (General Program) under Grants 62073193 and 61873333, in part by the National Key Research and Development Project (General Program) under Grant 2020YFE0204900, and in part by the Key Research and Development Plan of Shandong Province (General Program) under Grant 2021CXGC010204.
Abstract: Owing to its robust learning and expression ability for complex features, the deep learning (DL) model plays a vital role in bearing fault diagnosis. However, since fault diagnosis has fewer labeled samples, DL models in fault diagnosis are generally shallower than DL models in other fields, which limits diagnostic performance. To solve this problem, a novel transfer residual Swin Transformer (RST) is proposed for rolling bearings in this paper. RST has 24 residual self-attention layers, which use a hierarchical design and shifted window-based residual self-attention. Combined with transfer learning techniques, the transfer RST model uses parameters pre-trained on ImageNet. A new end-to-end method for fault diagnosis based on the deep transfer RST is proposed. First, a wavelet transform converts the vibration signal into a wavelet time-frequency diagram, so that the signal's time-domain and frequency-domain information is represented simultaneously. Second, the wavelet time-frequency diagram is fed to the RST model to obtain the fault type. Finally, the method is verified on public and self-built datasets. Experimental results show the superior performance of the method in comparison with a shallow neural network.
Abstract: Emotional electroencephalography (EEG) signals are a primary means of recording emotional brain activity. Currently, the most effective methods for analyzing emotional EEG signals involve feature engineering and neural networks. However, neural networks possess a strong ability for automatic feature extraction. Is it possible to discard feature engineering and directly employ neural networks for end-to-end recognition? Based on the characteristics of EEG signals, this paper proposes an end-to-end feature extraction and classification method for a dynamic self-attention network (DySAT). The study reveals significant differences in brain activity patterns associated with different emotions across various experimenters and time periods. The results of this experiment can provide insights into the reasons behind these differences.
Funding: Supported by the National Natural Science Foundation of China under Grant Nos. 61803061 and 61906026; the Innovation Research Group of Universities in Chongqing; the Chongqing Natural Science Foundation under Grants cstc2020jcyj-msxmX0577 and cstc2020jcyj-msxmX0634; the "Chengdu-Chongqing Economic Circle" innovation funding of the Chongqing Municipal Education Commission (KJCXZD2020028); the Science and Technology Research Program of the Chongqing Municipal Education Commission (Grant KJQN202000602); the Ministry of Education China Mobile Research Fund (MCM 20180404); and the Special Key Project of Chongqing Technology Innovation and Application Development (cstc2019jscxzdztzx0068).
Abstract: Hazy weather degrades the visual quality of images and makes advanced vision tasks difficult to carry out. Dehazing is therefore an important step before an advanced vision task is executed. Traditional dehazing algorithms work by improving image brightness and contrast or by constructing artificial priors such as color attenuation priors and dark channel priors. However, their results are unstable in complex scenes. Among methods based on convolutional neural networks, dehazing networks with an encoder-decoder structure do not consider the difference between the image before and after dehazing, and spatial information is lost in the encoding stage. To overcome these problems, this paper proposes a novel end-to-end two-stream convolutional neural network for single-image dehazing. The network model is composed of a spatial information feature stream and a high-level semantic feature stream. The spatial information feature stream retains the detailed information of the dehazing image, and the high-level semantic feature stream extracts the multi-scale structural features of the dehazing image. A spatial information auxiliary module is designed and placed between the feature streams. This module uses the attention mechanism to construct a unified expression of different types of information and gradually restores the clear image, with semantic information assisting spatial information in the dehazing network. A parallel residual twicing module is proposed, which performs dehazing on the difference information of features at different stages to improve the model's ability to discriminate haze images. The peak signal-to-noise ratio (PSNR) and structural similarity are used to quantitatively evaluate the similarity between the dehazing results of each algorithm and the original image. The structural similarity and PSNR of the proposed method reach 0.852 and 17.557 dB on the HazeRD dataset, which are higher than those of the existing comparison algorithms. On the SOTS dataset, the indicators are 0.955 and 27.348 dB, which are sub-optimal results. In experiments with real haze images, the method also achieves excellent visual restoration. The experimental results show that the proposed model can restore the desired fog-free visual effect and generalizes well to real haze scenes.
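For readers unfamiliar with the PSNR figures quoted above, the standard definition is PSNR = 10 log10(MAX^2 / MSE), where MSE is the mean squared error between the reference and restored images and MAX is the largest possible pixel value. The sketch below works on flat lists of pixel intensities; the function name and the default `max_value=255.0` (8-bit images) are assumptions for illustration, not details taken from the paper.

```python
import math


def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists."""
    if len(reference) != len(restored) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((a - b) ** 2 for a, b in zip(reference, restored)) / len(reference)
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    clear = [100, 100, 100, 100]     # hypothetical ground-truth pixels
    dehazed = [110, 90, 110, 90]     # hypothetical restored pixels
    print(f"{psnr(clear, dehazed):.2f} dB")
```

A higher value means the restored image is closer to the reference, which is why the abstract reports it alongside structural similarity.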
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 61972451, 61902230) and the Fundamental Research Funds for the Central Universities, Shaanxi Normal University (GK202103091).
Abstract: Circular RNAs (circRNAs) are RNAs with a closed circular structure that take part in many biological processes through key interactions with RNA binding proteins (RBPs). Existing methods for predicting these interactions have limitations in feature learning. In view of this, we propose a method named circ2CBA, which uses only the sequence information of circRNAs to predict circRNA-RBP binding sites. We have constructed a dataset that includes eight sub-datasets. First, circ2CBA encodes circRNA sequences using the one-hot method. Next, a two-layer convolutional neural network (CNN) is used to extract initial features. After the CNN, circ2CBA uses a bidirectional long short-term memory (BiLSTM) layer and the self-attention mechanism to learn the features. The AUC value of circ2CBA reaches 0.8987. A comparison of circ2CBA with three other methods on our dataset, together with an ablation experiment, confirms that circ2CBA is an effective method for predicting the binding sites between circRNAs and RBPs.
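The one-hot encoding step can be sketched as follows. The abstract does not state the alphabet or how unknown bases are handled, so the four-letter alphabet `ACGU`, the mapping of `T` to `U`, and the all-zero row for any other symbol are assumptions made for this illustration.

```python
ALPHABET = "ACGU"  # assumed RNA alphabet; not specified in the abstract


def one_hot(sequence):
    """Encode an RNA sequence as a list of 4-element 0/1 rows, one per base."""
    index = {base: i for i, base in enumerate(ALPHABET)}
    rows = []
    # Accept lowercase input and DNA-style 'T' by mapping it to 'U'.
    for base in sequence.upper().replace("T", "U"):
        row = [0] * len(ALPHABET)
        if base in index:
            row[index[base]] = 1
        # Unknown symbols (e.g. 'N') stay as an all-zero row.
        rows.append(row)
    return rows


if __name__ == "__main__":
    for row in one_hot("ACGUN"):
        print(row)
```

The resulting length-by-4 matrix is the kind of input a 1-D convolutional layer can take directly.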
Funding: This work was supported in part by the National Nature Science Foundation of China under grant number 62272236, in part by the Natural Science Foundation of Jiangsu Province under grant numbers BK20201136 and BK20191401, and in part by the Priority Academic Program Development of Jiangsu Higher Education Institutions (PAPD) fund.
Abstract: LIDAR point cloud-based 3D object detection aims to sense the surrounding environment by anchoring objects with a bounding box (BBox). However, in the three-dimensional space of autonomous driving scenes, previous object detection methods pre-process the original LIDAR point cloud into voxels or pillars and therefore lose the coordinate information of the original point cloud, detect slowly, and position bounding boxes inaccurately. To address these issues, this study proposes a new two-stage network structure that extracts point cloud features directly with PointNet++, which effectively preserves the original point cloud coordinate information. To improve detection accuracy, a shell-based modeling method is proposed. It first roughly determines which spherical shell the coordinates belong to. The results are then refined toward the ground truth, thereby narrowing the localization range and improving detection accuracy. To improve the recall of 3D object detection with bounding boxes, this paper designs a self-attention module for 3D object detection with a skip connection structure. Some features are highlighted by weighting them along the feature dimensions. After training, the weights of features that are favorable for object detection become larger, so the extracted features are better adapted to the object detection task. Extensive comparison and ablation experiments on the KITTI dataset verify the effectiveness of the proposed method in improving recall and precision.
Funding: This work is financially supported by the National Key R&D Program of China (No. 2020YFB1712600), the Fundamental Research Funds for Central University (No. 3072022QBZ0601), and the National Natural Science Foundation of China (Nos. 62272126 and 61872104).
Abstract: Production data in the industrial field are multimodal and high-dimensional, with large differences in correlation between attributes. Existing data prediction methods cannot effectively capture time-series and modal features, which leads to prediction hysteresis and poor prediction stability. To address these problems, this paper proposes a time-series and modal feature enhancement method based on a dual-stage self-attention mechanism (DATT), and a time-series prediction method based on a gated feedforward recurrent unit (GFRU). On this basis, the DATT-GFRU neural network, which combines a gated feedforward recurrent neural network with the dual-stage self-attention mechanism, is designed and implemented. Experiments show that the neural network prediction model based on DATT predicts significantly better. Compared with the traditional prediction model, the DATT-GFRU neural network has a smaller average prediction error, stable prediction performance, and strong generalization ability on three datasets with different numbers of attributes and different training sample sizes.
Funding: Supported by the National Natural Science Foundation of China (41874173).
Abstract: Pedestrian attribute recognition is often considered a multi-label image classification task. To make full use of attribute-related location information, a saliency guided self-attention network (SGSA-Net) was proposed to weakly supervise attribute localization without annotations of attribute-related regions. Saliency priors were integrated into the spatial attention module (SAM). Meanwhile, channel-wise attention and spatial attention were introduced into the network. Moreover, a weighted binary cross-entropy loss (WCEL) function was employed to handle the imbalance of training data. Extensive experiments on the richly annotated pedestrian (RAP) and pedestrian attribute (PETA) datasets demonstrated that SGSA-Net outperformed other state-of-the-art methods.
Funding: This paper is supported by the university fund project of the Hubei Institute of Fine Arts, "The Construction of a Blended Teaching Mode Based on the Flipped Classroom: Taking the Course 'Fashion Painting Illustration' as an Example" (No. 202028).
Abstract: Purpose: Clothing patterns play a dominant role in costume design and have become an important link in the perception of costume art. Conventional clothing pattern design relies on experienced designers. Although conventionally designed clothing patterns are of very high quality, the ratio of output to input time is relatively low. To break through this bottleneck, this paper proposes a novel approach based on a generative adversarial network (GAN) model for automatic clothing pattern generation, which not only reduces the dependence on experienced designers but also improves the input-output ratio. Design/methodology/approach: Because clothing patterns place high demands on global artistic perception and local texture details, this paper improves the conventional GAN model in two respects: a multi-scale discriminator strategy is introduced to handle local texture details, and the self-attention mechanism is introduced to improve global artistic perception. The improved GAN, called the multi-scales self-attention improved generative adversarial network (MS-SA-GAN) model, is used for high-resolution clothing pattern generation. Findings: To verify the feasibility and effectiveness of the proposed MS-SA-GAN model, a crawler is designed to acquire a standard clothing pattern dataset from Baidu pictures, and a comparative experiment is conducted on this dataset. In the experiments, different parameters of the proposed MS-SA-GAN model are adjusted, and the global artistic perception and local texture details of the generated clothing patterns are compared. Originality/value: Experimental results show that the clothing patterns generated by the proposed MS-SA-GAN model are superior to those of conventional algorithms on some local texture detail indexes. In addition, a group of clothing design professionals is invited to evaluate global artistic perception through a valence-arousal scale. The scale results show that the proposed MS-SA-GAN model achieves better global artistic perception.
Funding: Supported by the National Key R&D Program of China under Grant No. 2020YFB1710200 and the National Natural Science Foundation of China under Grant Nos. 61872105 and 62072136.
Abstract: With rapid economic development, per capita automobile ownership in our country has risen year by year, and more researchers have turned to scientific methods for solving traffic flow problems. Traffic flow prediction is affected not only by the number of vehicles but also by various complex factors, such as time, road conditions, and pedestrian flow. However, existing methods ignore the complexity of road conditions and the correlation between individual nodes, which leads to poor performance. In this study, a deep learning model, SAMGCN, is proposed to effectively capture the correlation between individual nodes and improve the performance of traffic flow prediction. First, the theory of spatiotemporal decoupling is used to divide each time of each node into finer particles. Second, multimodule fusion is used to mine the potential periodic relationships in the data. Finally, a GRU is used to obtain the potential time relationship of the three modules. Extensive experiments were conducted on two traffic flow datasets, PeMS04 and PeMS08, from the Caltrans Performance Measurement System to prove the validity of the proposed model.
Funding: Supported by the National Natural Science Foundation of China (Grant No. 21991093), the Strategic Priority Research Program of Chinese Academy of Sciences (Grant No. XDA29050200), the Dalian Institute of Chemical Physics (DICP I202135), and the Energy Science and Technology Revolution Project (Grant No. E2010412).
Abstract: Methanol-to-olefins, a promising non-oil pathway for the synthesis of light olefins, has been successfully industrialized. Accurate prediction of process variables can yield significant benefits for advanced process control and optimization. The challenge of this task is underscored by the failure of traditional methods to capture the complex characteristics of industrial processes, such as high nonlinearity, dynamics, and data distribution shift caused by diverse operating conditions. In this paper, we propose a novel hybrid spatial-temporal deep learning prediction model to address these issues. First, a data normalization technique called reversible instance normalization is employed to handle the differing data distributions. Subsequently, a convolutional neural network integrated with the self-attention mechanism is used to extract the temporal patterns. Meanwhile, a multi-graph convolutional network is leveraged to model the spatial interactions. Afterward, the extracted temporal and spatial features are fused and fed into a fully connected neural network to complete the prediction. Finally, the outputs are denormalized to obtain the final results. Monitoring results for the dynamic trends of process variables in an actual industrial methanol-to-olefins process demonstrate that our model not only achieves superior prediction performance but also reveals complex spatial-temporal relationships through the learned attention matrices and adjacency matrices, making the model more interpretable. Lastly, the model is deployed on an end-to-end Industrial Internet Platform, where it achieves effective practical results.
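The normalize-then-denormalize pattern described in this abstract can be sketched for a single variable. This is a simplified illustration, not the paper's implementation: each input window is standardized with its own mean and standard deviation, and the same statistics are reused to map model outputs back to the original scale. The published reversible instance normalization technique also learns an affine scale and shift, which is omitted here; the function names and the `eps` value are assumptions.

```python
from statistics import mean, pstdev


def revin_normalize(window, eps=1e-5):
    """Standardize one input window with its own statistics and return them."""
    mu = mean(window)
    sigma = (pstdev(window) ** 2 + eps) ** 0.5  # eps avoids division by zero
    normalized = [(x - mu) / sigma for x in window]
    return normalized, (mu, sigma)


def revin_denormalize(values, stats):
    """Map model outputs back to the original scale of that same window."""
    mu, sigma = stats
    return [v * sigma + mu for v in values]


if __name__ == "__main__":
    window = [10.0, 12.0, 14.0]  # hypothetical process-variable readings
    normalized, stats = revin_normalize(window)
    # A real model would predict from `normalized`; here we only round-trip.
    print(revin_denormalize(normalized, stats))
```

Because the statistics are computed per window rather than over the whole training set, windows recorded under different operating conditions are brought to a comparable scale before they reach the network.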
Funding: Supported by the National Key R&D Program of China (Grant No. 2018YFB1702503), the Shanghai Municipal Science and Technology Major Project (Grant No. 2021SHZDZX0102), and the State Key Laboratory of Mechanical System and Vibration (Grant No. MSVZD202103).
Abstract: Shield tunneling machines are paramount underground engineering equipment and play a key role in tunnel construction. During shield construction, the "mud cake" formed by difficult-to-remove clay attached to the cutterhead severely reduces construction efficiency and harms the healthy operation of the shield tunneling machine. In this study, we propose an enhanced transformer-based model for detecting the cutterhead clogging status of shield tunneling machines. First, the working-state data of shield machines are selected from historical excavation data, and a long short-term memory-autoencoder neural network module is constructed to remove outliers. Next, variational mode decomposition and wavelet transform are employed to denoise the data. After preprocessing, nonoverlapping rectangular windows are used to cut the working-state data into the time slices used for analysis, and several time-domain features of these periods are extracted. Owing to the data imbalance in the original dataset, the k-means-synthetic minority oversampling technique algorithm is adopted to oversample the extracted time-domain features of the clogging data in the training set, balancing the dataset and improving model performance. Finally, an enhanced transformer-based neural network is constructed to extract essential implicit features and detect cutterhead clogging status. Data collected from actual tunnel construction projects are used to verify the proposed model. The results show that the proposed model detects shield machine cutterhead clogging status accurately, with 98.85% accuracy and a 0.9786 F1 score. Moreover, the proposed model significantly outperforms the comparison models.
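The windowing and feature-extraction step can be sketched as below. The abstract says only that "several time-domain features" are extracted from nonoverlapping rectangular windows, so the three features chosen here (mean, root mean square, and peak absolute value), the window width, and the decision to drop a trailing partial window are illustrative assumptions rather than the paper's actual feature set.

```python
import math


def nonoverlapping_windows(signal, width):
    """Split a signal into consecutive windows; a trailing partial window is dropped."""
    usable = len(signal) - len(signal) % width
    return [signal[i:i + width] for i in range(0, usable, width)]


def time_domain_features(window):
    """Compute a few common time-domain statistics for one window."""
    n = len(window)
    return {
        "mean": sum(window) / n,
        "rms": math.sqrt(sum(x * x for x in window) / n),
        "peak": max(abs(x) for x in window),
    }


if __name__ == "__main__":
    torque = [1.0, -1.0, 1.0, -1.0, 3.0, 4.0, 0.5]  # hypothetical sensor readings
    for window in nonoverlapping_windows(torque, width=2):
        print(time_domain_features(window))
```

Each window yields one feature vector, and it is these vectors, not the raw samples, that an oversampling step would then balance across classes.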
Funding: Financially supported by the National Natural Science Foundation of China (Nos. 21973018 and 21534002) and the Natural Sciences and Engineering Research Council (NSERC) of Canada.
Abstract: A deep neural network model generally consists of different modules that play essential roles in performing a task. The optimal design of a module used in modeling a physical problem is directly related to the success of the model. In this work, the effectiveness of several special modules is studied numerically: the self-attention mechanism for recognizing the importance of molecular sequence information in a polymer, and the big-stride representation and conditional random field for enhancing the network's ability to produce desired local configurations. Network models containing these modules are trained on the well-documented data of the native structures of the HP model and assessed by their capability to make structural predictions on unseen data. The specific network design of the self-attention mechanism adopted here is modified from a similar idea in natural language recognition. The big-stride representation module introduced in this work is shown to drastically improve the network's capability to model polymer segments with strong lattice position correlations.
Funding: Project supported by the National Natural Science Foundation of China (No. 71961028), the Key Research and Development Program of Gansu Province, China (No. 22YF7GA171), the University Industry Support Program of Gansu Province, China (No. 2023QB-115), the Innovation Fund for Science and Technology-based Small and Medium Enterprises of Gansu Province, China (No. 23CXGA0136), and the Scientific Research Project of the Lanzhou Science and Technology Program, China (No. 2018-01-58).
Abstract: Accurate long-term power forecasting is important for decision-making in power grid operation and for customers' power consumption management, ensuring a reliable power supply and the economical operation of the grid. However, most time-series forecasting models do not perform well on long-time-series prediction tasks with large amounts of data. To address this challenge, we propose a parallel time-series prediction model called LDformer. First, we combine Informer with long short-term memory (LSTM) to obtain deep representation abilities for the time series. Then, we propose a parallel encoder module to improve the robustness of the model and combine convolutional layers with an attention mechanism to avoid value redundancy in the attention mechanism. Finally, we propose a probabilistic sparse (ProbSparse) self-attention mechanism combined with UniDrop to reduce the computational overhead and mitigate the risk of losing some key connections in the sequence. Experimental results on five datasets show that LDformer outperforms state-of-the-art methods in most cases across the different long-time-series prediction tasks.
Funding: Supported by the National Key Research and Development Program of China [grant no. 2021YFF0602104-1].
Abstract: An intelligent single radar image de-raining method based on unsupervised self-attention generative adversarial networks is proposed to improve the accuracy of wave height parameter inversion results. The method builds a trainable end-to-end de-raining model with an unsupervised cycle-consistent adversarial network as the AI framework, which does not require pairs of rain-contaminated and corresponding ground-truth rain-free images for training. The model is trained by feeding it rain-contaminated and clean radar images in an unpaired manner, and the atmospheric scattering model parameters are not required as a prior condition. Additionally, a self-attention mechanism is introduced into the model, allowing it to focus on rain clutter when processing radar images. This combines global and local rain clutter context information to output more accurate and clear de-rained radar images. The proposed method is validated on actual field test data: compared with the wave height derived from the original rain-contaminated data, the de-raining method reduces the root-mean-square error by 0.11 m and increases the correlation coefficient of the wave height by 14%. These results demonstrate that the method effectively reduces the impact of rain on the accuracy of wave height parameter estimation from marine X-band radar images.
Funding: Project supported by the National Natural Science Foundation of China (Nos. 61872122 and 61502131), the Zhejiang Provincial Natural Science Foundation of China (No. LY18F020015), the Open Project Program of the State Key Lab of CAD&CG, China (No. 1802), and the Zhejiang Provincial Key Research and Development Program, China (No. 2020C01067).
Abstract: Video summarization has established itself as a fundamental technique for generating compact and concise video, which eases the management and browsing of large-scale video data. Existing methods fail to fully consider the local and global relations among video frames, which degrades summarization performance. To address this problem, we propose a graph convolutional attention network (GCAN) for video summarization. GCAN consists of two parts, embedding learning and context fusion, where embedding learning includes a temporal branch and a graph branch. In particular, GCAN uses dilated temporal convolution to model local cues and temporal self-attention to exploit global cues for video frames. It learns graph embeddings via a multi-layer graph convolutional network to reveal the intrinsic structure of frame samples. The context fusion part combines the output streams from the temporal branch and the graph branch to create a context-aware representation of frames, on which importance scores are evaluated to select representative frames for the video summary. Experiments are carried out on two benchmark databases, SumMe and TVSum, showing that the proposed GCAN approach achieves superior performance compared with several state-of-the-art alternatives in three evaluation settings.