The growing prevalence of knowledge reasoning using knowledge graphs (KGs) has substantially improved the accuracy and efficiency of intelligent medical diagnosis. However, current models primarily integrate electronic medical records (EMRs) and KGs into the knowledge reasoning process, ignoring the differing significance of various types of knowledge in EMRs and the diverse data types present in the text. To better integrate EMR text information, we propose a novel intelligent diagnostic model named the Graph ATtention network incorporating Text representation in knowledge reasoning (GATiT), which comprises text representation, subgraph construction, knowledge reasoning, and diagnostic classification. In the text representation process, GATiT uses a pre-trained model to obtain text representations of the EMRs and additionally enhances embeddings by including chief complaint information and numerical information in the input. In the subgraph construction process, GATiT constructs text subgraphs and disease subgraphs from the KG, utilizing EMR text and the disease to be diagnosed. To differentiate the varying importance of nodes within the subgraphs, features such as node categories, relevance scores, and other relevant factors are introduced into the text subgraph. The message-passing strategy and attention weight calculation of the graph attention network are adjusted to learn these features in the knowledge reasoning process. Finally, in the diagnostic classification process, the interactive attention-based fusion method integrates the results of knowledge reasoning with text representations to produce the final diagnosis results. Experimental results on multi-label and single-label EMR datasets demonstrate the model's superiority over several state-of-the-art methods.
Advanced carbon emission factors of a power grid can provide users with effective carbon reduction advice, which is of immense importance in mobilizing the entire society to reduce carbon emissions. The method of calculating node carbon emission factors based on the carbon emissions flow theory requires real-time parameters of a power grid. Therefore, it cannot provide carbon factor information beforehand. To address this issue, a prediction model based on the graph attention network is proposed. The model uses a graph structure that is suitable for the topology of the power grid and designs a supervised network using the loads of the grid nodes and the corresponding carbon factor data. The network extracts features and transmits information in a way more suitable for the power system and can flexibly adjust the equivalent topology, thereby increasing the diversity of the structure. Its input and output data are simple and do not include the power grid parameters. We demonstrated its effect by testing on the IEEE 39-bus and IEEE 118-bus systems, with average error rates of 2.46% and 2.51%, respectively.
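Several of the abstracts in this listing build on the graph attention network (GAT). As a rough illustration only, the following sketch shows the standard single-head GAT update (score each neighbour with a LeakyReLU of a learned vector applied to the concatenated projected features, normalise with a softmax, then take the weighted sum). It is not the implementation of any paper listed here; the function name `gat_layer`, the toy graph, and the weights are assumptions made for the example.

```python
import math


def leaky_relu(x, slope=0.2):
    """LeakyReLU as used for the raw attention scores in the standard GAT."""
    return x if x > 0 else slope * x


def matvec(W, h):
    """Multiply matrix W (list of rows) by vector h."""
    return [sum(w * x for w, x in zip(row, h)) for row in W]


def gat_layer(features, neighbors, W, a):
    """Single-head graph attention layer (illustrative sketch).

    features:  list of node feature vectors
    neighbors: dict mapping node index -> iterable of neighbour indices
    W:         projection matrix, shape (out_dim, in_dim)
    a:         attention vector of length 2 * out_dim
    Returns (new_features, attention), where attention[i] maps each
    neighbour j (including i itself) to the coefficient alpha_ij.
    """
    z = [matvec(W, h) for h in features]  # projected features W h_i
    out, attention = [], []
    for i in range(len(features)):
        nbrs = sorted(set(neighbors.get(i, ())) | {i})  # add a self-loop
        # Raw score e_ij = LeakyReLU(a^T [W h_i || W h_j])
        scores = [
            leaky_relu(sum(w * x for w, x in zip(a, z[i] + z[j])))
            for j in nbrs
        ]
        # Numerically stable softmax over the neighbourhood
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        alpha = [e / total for e in exps]
        attention.append(dict(zip(nbrs, alpha)))
        # Weighted sum of projected neighbour features
        out.append([
            sum(al * z[j][k] for al, j in zip(alpha, nbrs))
            for k in range(len(z[i]))
        ])
    return out, attention


if __name__ == "__main__":
    # Hypothetical 3-node path graph 0 - 1 - 2 with scalar features.
    feats = [[1.0], [3.0], [5.0]]
    nbrs = {0: [1], 1: [0, 2], 2: [1]}
    new_feats, attn = gat_layer(feats, nbrs, W=[[1.0]], a=[0.5, -0.5])
    print(new_feats)
    print(attn)
```

With an all-zero attention vector `a`, every neighbour receives the same weight and the layer reduces to plain neighbourhood averaging, which is a convenient sanity check.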
Considering the nonlinear structure and spatial-temporal correlation of the traffic network, and the influence of potential correlation between nodes of the traffic network on the spatial features, this paper proposes a traffic speed prediction model based on the combination of a graph attention network with a self-adaptive adjacency matrix (SAdpGAT) and a bidirectional gated recurrent unit (BiGRU). Firstly, the model introduces the graph attention network (GAT) to extract the spatial features of the real road network and the potential road network, respectively, in the spatial dimension. Secondly, the spatial features are input into the BiGRU to extract the time series features. Finally, the prediction results of the real road network and the potential road network are connected to generate the final prediction results of the model. The experimental results show that the prediction accuracy of the proposed model is clearly improved on the METR-LA and PEMS-BAY datasets, which demonstrates the advantages of the proposed spatial-temporal model in traffic speed prediction.
Fault detection and diagnosis (FDD) plays a significant role in ensuring the safety and stability of chemical processes. With the development of artificial intelligence (AI) and big data technologies, data-driven approaches with excellent performance are widely used for FDD in chemical processes. However, improved predictive accuracy has often been achieved through increased model complexity, which turns models into black-box methods and causes uncertainty regarding their decisions. In this study, a causal temporal graph attention network (CTGAN) is proposed for fault diagnosis of chemical processes. A chemical causal graph is built by causal inference to represent the propagation path of faults. The attention mechanism and chemical causal graph were combined to help us notice the key variables relating to fault fluctuations. Experiments in the Tennessee Eastman (TE) process and the green ammonia (GA) process showed that CTGAN achieved high performance and good explainability.
Object detection has made a significant leap forward in recent years. However, the detection of small objects continues to be a great difficulty for various reasons, such as their very small size and their susceptibility to missed detection due to background noise. Additionally, small object information is degraded by the downsampling operations. Deep learning-based detection methods have been utilized to address the challenge posed by small objects. In this work, we propose a novel method, the Multi-Convolutional Block Attention Network (MCBAN), to increase the detection accuracy of minute objects, aiming to overcome the challenge of information loss during the downsampling process. The multi-convolutional attention block (MCAB), and the channel attention and spatial attention module (SAM) that make it up, have been crafted to accomplish small object detection with higher precision. We have carried out the experiments on the Karlsruhe Institute of Technology and Toyota Technological Institute (KITTI) and Pattern Analysis, Statistical Modeling and Computational Learning (PASCAL) Visual Object Classes (VOC) datasets and have followed a step-wise process to analyze the results. These experimental results demonstrate that significant gains in performance are achieved, such as 97.75% for KITTI and 88.97% for PASCAL VOC. The findings of this study indicate that MCBAN is much more efficient in the small object detection domain than other existing approaches.
Accurate detection of pipeline leakage is essential to maintain the safety of pipeline transportation. Recently, deep learning (DL) has emerged as a promising tool for pipeline leakage detection (PLD). However, most existing DL methods have difficulty in achieving good performance in identifying leakage types due to the complex time dynamics of pipeline data. On the other hand, the initial parameter selection in the detection model is generally random, which may lead to unstable recognition performance. For this reason, a hybrid DL framework referred to as the parameter-optimized recurrent attention network (PRAN) is presented in this paper to improve the accuracy of PLD. First, a parameter-optimized long short-term memory (LSTM) network is introduced to extract effective and robust features, which exploits a particle swarm optimization (PSO) algorithm with a cross-entropy fitness function to search for globally optimal parameters. With this framework, the learning representation capability of the model is improved and the convergence rate is accelerated. Moreover, an anomaly-attention mechanism (AM) is proposed to discover class-discriminative information by weighting the hidden states, which contributes to amplifying the normal-abnormal distinguishable discrepancy, further improving the accuracy of PLD. After that, the proposed PRAN not only implements the adaptive optimization of network parameters, but also enlarges the contribution of the normal-abnormal discrepancy, thereby overcoming the drawbacks of instability and poor generalization. Finally, the experimental results demonstrate the effectiveness and superiority of the proposed PRAN for PLD.
Continuous sign language recognition (CSLR) is challenging due to the complexity of video background, hand gesture variability, and temporal modeling difficulties. This work proposes a CSLR method based on a spatial-temporal graph attention network to focus on essential features of video series. The method considers local details of sign language movements by taking the information on joints and bones as inputs and constructing a spatial-temporal graph to reflect inter-frame relevance and physical connections between nodes. The graph-based multi-head attention mechanism is utilized with adjacency matrix calculation for better local-feature exploration, and short-term motion correlation modeling is completed via a temporal convolutional network. We adopted BLSTM to learn the long-term dependence and connectionist temporal classification to align the word-level sequences. The proposed method achieves competitive results regarding word error rate (1.59%) on the Chinese Sign Language dataset and the mean Jaccard Index (65.78%) on the ChaLearn LAP Continuous Gesture Dataset.
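The word error rate reported above is conventionally defined as the word-level edit distance (substitutions, deletions, and insertions) divided by the number of reference words. A minimal sketch of that conventional definition follows; whether the paper applies exactly this tokenisation and normalisation is an assumption, and the function name `word_error_rate` is hypothetical.

```python
def word_error_rate(reference, hypothesis):
    """Word-level Levenshtein distance divided by the reference length.

    Both arguments are whitespace-separated strings. This follows the
    conventional WER definition and is only an illustrative sketch.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] = edit distance between the reference prefix processed so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution (b -> x) and one deletion (d) over four reference words.
    print(word_error_rate("a b c d", "a x c"))
```

Because insertions count as errors, the value can exceed 1.0 when the hypothesis is much longer than the reference.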
Numerous works show that existing neighbor-averaging graph neural networks (GNNs) cannot efficiently capture structure features, and many works show that injecting structure, distance, position, or spatial features can significantly improve the performance of GNNs. However, injecting high-level structure and distance into GNNs is an intuitive but untouched idea. This work sheds light on this issue and proposes a scheme to enhance graph attention networks (GATs) by encoding distance and hop-wise structure statistics. Firstly, the hop-wise structure and distributional distance information are extracted based on several hop-wise ego-nets of every target node. Secondly, the derived structure information, distance information, and intrinsic features are encoded into the same vector space and then added together to get initial embedding vectors. Thirdly, the derived embedding vectors are fed into GATs, such as GAT and the adaptive graph diffusion network (AGDN), to get the soft labels. Fourthly, the soft labels are fed into correct and smooth (C&S) to conduct label propagation and get final predictions. Experiments show that the distance and hop-wise structures encoding enhanced graph attention networks (DHSEGATs) achieve a competitive result.
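To make the idea of hop-wise ego-nets concrete, the sketch below uses breadth-first search to find the nodes within a given number of hops of a target node and counts how many nodes sit at each hop distance. The per-hop node count is only a stand-in statistic chosen for illustration; the abstract does not specify which structure statistics the authors actually encode, and the function names are hypothetical.

```python
from collections import deque


def hop_distances(adj, source, max_hops):
    """Breadth-first search limited to max_hops.

    adj maps each node to an iterable of its neighbours. Returns a dict
    mapping every node in the max_hops ego-net of source to its hop distance.
    """
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if dist[u] == max_hops:
            continue  # do not expand beyond the requested radius
        for v in adj.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def hop_wise_counts(adj, source, max_hops):
    """Number of nodes at each hop distance 0..max_hops from source."""
    counts = [0] * (max_hops + 1)
    for d in hop_distances(adj, source, max_hops).values():
        counts[d] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical path graph 0 - 1 - 2 - 3 - 4.
    graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
    print(hop_wise_counts(graph, source=2, max_hops=2))
```

A vector like this, computed for every target node, is the kind of hop-wise feature that could be embedded and added to the node's intrinsic features before the attention layers.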
The natural language to SQL (NL2SQL) task is an emerging research area that aims to transform a natural language question, together with a given database, into an SQL query. Earlier approaches processed the input into a heterogeneous graph. However, previous models failed to distinguish the types of multi-hop connections of the heterogeneous graph and therefore tended to ignore crucial semantic path information. To this end, a two-layer attention network is presented to focus on essential neighbor nodes and mine enlightening semantic paths for feature encoding. The weighted edge is introduced for schema linking to connect the nodes with semantic similarity. In the decoding phase, a rule-based pruning strategy is offered to refine the generated SQL queries. The experimental results show that the approach learns a good encoding representation and decodes the representation to generate results with practical meaning.
Accurate pancreas segmentation is critical for the diagnosis and management of diseases of the pancreas. It is challenging to precisely delineate the pancreas due to the high variation in volume, shape and location. In recent years, coarse-to-fine methods have been widely used to alleviate the class imbalance issue and improve pancreas segmentation accuracy. However, cascaded methods could be computationally intensive and the refined results are significantly dependent on the performance of the coarse segmentation results. To balance the segmentation accuracy and computational efficiency, we propose a Discriminative Feature Attention Network for pancreas segmentation, to effectively highlight pancreas features and improve segmentation accuracy without explicit pancreas location. The final segmentation is obtained by applying a simple yet effective post-processing step. Two experiments on the public NIH pancreas CT dataset and the abdominal BTCV multi-organ dataset are individually conducted to show the effectiveness of our method for 2D pancreas segmentation. We obtained an average Dice Similarity Coefficient (DSC) of 82.82 ± 6.09%, an average Jaccard Index (JI) of 71.13 ± 8.30% and an average Symmetric Average Surface Distance (ASD) of 1.69 ± 0.83 mm on the NIH dataset. Compared to the existing deep learning-based pancreas segmentation methods, our experimental results achieve the best average DSC and JI values.
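For readers unfamiliar with the two overlap metrics quoted above, here is a minimal sketch of the usual definitions for binary masks: DSC = 2|P ∩ T| / (|P| + |T|) and JI = |P ∩ T| / |P ∪ T|. The flat 0/1 list representation, the function name `dice_and_jaccard`, and the convention of returning 1.0 when both masks are empty are assumptions made for the example, not details taken from the paper.

```python
def dice_and_jaccard(pred, truth):
    """Dice Similarity Coefficient and Jaccard Index of two binary masks.

    pred and truth are equal-length flat sequences of 0/1 values
    (for example, a flattened 2D segmentation mask).
    """
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of elements")

    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    pred_size = sum(1 for p in pred if p)
    truth_size = sum(1 for t in truth if t)
    union = pred_size + truth_size - intersection

    if union == 0:
        # Assumed convention: two empty masks agree perfectly.
        return 1.0, 1.0

    dice = 2 * intersection / (pred_size + truth_size)
    jaccard = intersection / union
    return dice, jaccard


if __name__ == "__main__":
    predicted = [1, 1, 0, 0]
    reference = [1, 0, 1, 0]
    print(dice_and_jaccard(predicted, reference))
```

For a single pair of masks the two metrics are tied by JI = DSC / (2 − DSC). The reported averages (82.82% and 71.13%) need not satisfy this identity exactly, because each is averaged over cases separately.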
In computer vision, object recognition and image categorization have proven to be difficult challenges. They have, nevertheless, generated responses to a wide range of difficult issues from a variety of fields. Convolutional Neural Networks (CNNs) have recently been identified as the most widely proposed deep learning (DL) algorithms in the literature. CNNs have unquestionably delivered cutting-edge achievements, particularly in the areas of image classification, speech recognition, and video processing. However, it has been noticed that CNN training demands a large amount of data, which is in low supply, especially in the medical industry, and as a result, the training process takes longer. In this paper, we describe an attention-aware CNN architecture for classifying chest X-ray images to diagnose Pneumonia in order to address the aforementioned difficulties. Attention modules provide attention-aware properties to the Attention Network. The attention-aware features of various modules alter as the layers become deeper. Using a bottom-up top-down feedforward structure, the feedforward and feedback attention processes are integrated into a single feedforward process inside each attention module. In the present work, a deep neural network (DNN) is combined with an attention mechanism to test the prediction of Pneumonia disease using chest X-ray pictures. To produce attention-aware features, the suggested network was built by merging channel and spatial attention modules in the DNN architecture. With this network, we worked on a publicly available Kaggle chest X-ray dataset. Extensive testing was carried out to validate the suggested model. In the experimental results, we attained an accuracy of 95.47% and an F-score of 0.92, indicating that the suggested model outperformed the baseline models.
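Accuracy and F-score, as reported above, are normally derived from the counts of a binary confusion matrix. The sketch below assumes the common F1 variant (harmonic mean of precision and recall); the abstract does not say which F-score variant was used, and the counts in the usage example are made up for illustration rather than taken from the paper.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall, and F1 from binary confusion counts.

    tp/fp/fn/tn are the numbers of true positives, false positives,
    false negatives, and true negatives. Undefined ratios fall back to 0.0.
    """
    total = tp + fp + fn + tn
    if total == 0:
        raise ValueError("at least one sample is required")

    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    print(classification_metrics(tp=8, fp=2, fn=2, tn=8))
```

On imbalanced data, such as medical datasets where one class dominates, accuracy and F1 can diverge noticeably, which is one reason both are commonly reported.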
Tumour segmentation in medical images (especially 3D tumour segmentation) is highly challenging due to the possible similarity between tumours and adjacent tissues, the occurrence of multiple tumours, and variable tumour shapes and sizes. The popular deep learning-based segmentation algorithms generally rely on the convolutional neural network (CNN) and the Transformer. The former cannot extract global image features effectively, while the latter lacks inductive bias and involves complicated computation for 3D volume data. Existing hybrid CNN-Transformer networks provide only limited performance improvement, or even poorer segmentation performance than the pure CNN. To address these issues, a short-term and long-term memory self-attention network is proposed. Firstly, a distinctive self-attention block uses the Transformer to explore the correlation among the region features at different levels extracted by the CNN. Then, the memory structure filters and combines the above information to exclude similar regions and detect multiple tumours. Finally, the multi-layer reconstruction blocks predict the tumour boundaries. Experimental results demonstrate that our method outperforms other methods in terms of subjective visual and quantitative evaluation. Compared with the most competitive method, the proposed method achieves Dice (82.4% vs. 76.6%) and Hausdorff distance 95% (HD95) (10.66 vs. 11.54 mm) on KiTS19, as well as Dice (80.2% vs. 78.4%) and HD95 (9.632 vs. 12.17 mm) on LiTS.
Referring expression comprehension is the task of locating the image region described by a natural language expression, which refers to the properties of the region or its relationships with other regions. Most previous work handles this problem by selecting the most relevant regions from a set of candidate regions; when there are many candidate regions in the set, these methods are inefficient. Inspired by the recent success of image captioning using deep learning methods, in this paper we propose a framework to understand referring expressions by multiple steps of reasoning. We present a model for referring expression comprehension that selects the most relevant region directly from the image. The core of our model is a recurrent attention network, which can be seen as an extension of the Memory Network. The proposed model is capable of improving the results by multiple computational hops. We evaluate the proposed model on two referring expression datasets: Visual Genome and Flickr30k Entities. The experimental results demonstrate that the proposed model outperforms previous state-of-the-art methods in both accuracy and efficiency. We also conduct an ablation experiment to show that the performance of the model does not keep improving as the number of attention layers increases.
Automatic text summarization (ATS) plays a significant role in Natural Language Processing (NLP). Abstractive summarization produces summaries by identifying and compressing the most important information in a document. However, there are relatively few comprehensively evaluated abstractive summarization models that work well for specific types of reports, due to their unstructured and oral language text characteristics. In particular, Chinese complaint reports, generated by urban complainers and collected by government employees, describe existing resident problems in daily life. Meanwhile, the reported problems require a speedy response. Therefore, automatic summarization tasks for these reports have been developed. However, as with traditional summarization models, the generated summaries still suffer from problems of informativeness and conciseness. To address these issues and generate suitably informative and less redundant summaries, a topic-based abstractive summarization method is proposed to obtain global and local features. Additionally, a heterogeneous graph of the original document is constructed using word-level and topic-level features. Experiments and analyses on public review datasets (Yelp and Amazon) and our constructed dataset (Chinese complaint reports) show that the proposed framework effectively improves the performance of the abstractive summarization model for Chinese complaint reports.
PM2.5 concentration prediction is of great significance to environmental protection and human health. Achieving accurate prediction of PM2.5 concentration has become an important research task. However, PM2.5 pollutants can spread in the Earth's atmosphere, causing mutual influence between different cities. To effectively capture the air pollution relationship between cities, this paper proposes a novel spatiotemporal model combining the graph attention neural network (GAT) and the gated recurrent unit (GRU), named GAT-GRU, for PM2.5 concentration prediction. Specifically, GAT is used to learn the spatial dependence of PM2.5 concentration data in different cities, and GRU is used to extract the temporal dependence of the long-term data series. The proposed model integrates the learned spatio-temporal dependencies to capture long-term complex spatio-temporal features. Considering that air pollution is related to the meteorological conditions of the city, the knowledge acquired from meteorological data is used in the model to enhance PM2.5 prediction performance. The input of the GAT-GRU model consists of PM2.5 concentration data and meteorological data. In order to verify the effectiveness of the proposed GAT-GRU prediction model, this paper designs experiments on real-world datasets and compares it with other baselines. Experimental results show that our model achieves excellent performance in PM2.5 concentration prediction.
Aim: To diagnose COVID-19 more efficiently and more correctly, this study proposed a novel attention network for COVID-19 (ANC). Methods: Two datasets were used in this study. An 18-way data augmentation was proposed to avoid overfitting. Then, the convolutional block attention module (CBAM) was integrated into our model, the structure of which is fine-tuned. Finally, Grad-CAM was used to provide an explainable diagnosis. Results: The accuracies of our ANC method on the two datasets are 96.32% ± 1.06% and 96.00% ± 1.03%, respectively. Conclusions: The proposed ANC method is superior to 9 state-of-the-art approaches.
In the smart logistics industry, unmanned forklifts that intelligently identify logistics pallets can improve work efficiency in warehousing and transportation and are better than traditional manual forklifts driven by humans. Therefore, they play a critical role in smart warehousing, and semantic segmentation is an effective method to realize the intelligent identification of logistics pallets. However, most current recognition algorithms are ineffective due to the diverse types of pallets, their complex shapes, frequent occlusions in production environments, and changing lighting conditions. This paper proposes a novel multi-feature fusion-guided multiscale bidirectional attention (MFMBA) neural network for logistics pallet segmentation. To better predict the foreground category (the pallet) and the background category (the cargo) of a pallet image, our approach extracts three types of features (grayscale, texture, and Hue, Saturation, Value features) and fuses them. The size and shape of the pallet may appear different in the image in the actual, complex environment, which usually makes feature extraction difficult; to deal with this problem, our study proposes a multiscale architecture that can extract additional semantic features. Also, since a traditional attention mechanism only assigns attention weights from a single direction, we designed a bidirectional attention mechanism that assigns cross-attention weights to each feature from two directions, horizontally and vertically, significantly improving segmentation. Finally, comparative experimental results show that the precision of the proposed algorithm is 0.53%–8.77% better than that of the other methods we compared.
Document images often contain various page components and complex logical structures, which make the document layout analysis task challenging. For most deep learning-based document layout analysis methods, convolutional neural networks (CNNs) are adopted as the feature extraction networks. In this paper, a hybrid spatial-channel attention network (HSCA-Net) is proposed to improve feature extraction capability by introducing an attention mechanism to explore more salient properties within document pages. The HSCA-Net consists of a spatial attention module (SAM), a channel attention module (CAM), and a designed lateral attention connection. CAM adaptively adjusts channel feature responses by emphasizing selective information, which depends on the contribution of the features of each channel. SAM guides CNNs to focus on the informative contents and capture global context information among page objects. The lateral attention connection incorporates SAM and CAM into a multiscale feature pyramid network, and thus retains original feature information. The effectiveness and adaptability of HSCA-Net are evaluated through multiple experiments on publicly available datasets such as PubLayNet, ICDAR-POD, and Article Regions. Experimental results demonstrate that HSCA-Net achieves state-of-the-art performance on the document layout analysis task.
The identification and recording of drilling conditions are crucial for ensuring drilling safety and efficiency. However, the traditional approach of relying on the subjective determination of drilling masters based on experience formulas is slow and not suitable for rapid drilling. In this paper, we propose a drilling condition classification method based on a neural network model. The model uses an improved Bidirectional Gated Recurrent Unit (BiGRU) combined with an attention mechanism to accurately classify seven common drilling conditions simultaneously, achieving an average accuracy of 91.63%. The model also demonstrates excellent generalization ability, real-time performance, and accuracy, making it suitable for actual production. Additionally, the model has excellent expandability, which enhances its potential for further application.
Currently, numerical models based on idealized assumptions, complex algorithms and high computational costs are unsatisfactory for ocean surface current prediction. Moreover, the complex temporal and spatial variability of ocean currents also makes prediction methods based on time series data challenging. The deep network model can automatically learn and extract complex features hidden in large amounts of complex data, so it is a promising method for high-quality prediction of ocean currents. In this paper, we propose a spatiotemporal coupled attention deep network model, STCANet, that can extract abundant temporal and spatial coupling information on the behavior characteristics of ocean currents to improve prediction accuracy. Firstly, the Spatial Module is designed and implemented to extract the spatiotemporal coupling characteristics of ocean currents, while the spatial correlations and dependencies among adjacent sea areas are obtained through the Spatial Channel Attention Module (SCAM). Secondly, we use the Gated Recurrent Unit (GRU) to extract temporal relationships of ocean currents, and design and implement the nearest neighbor time attention module to extract the interdependences of ocean currents between adjacent times, which can further improve the accuracy of ocean current prediction. Finally, a series of comparative experiments on the MediSea_Dataset and EastSea_Dataset showed that the prediction quality of our model greatly outperforms that of other benchmark models such as History Average (HA), Autoregressive Integrated Moving Average Model (ARIMA), Long Short-term Memory (LSTM), Gated Recurrent Unit (GRU) and CNN_GRU.
Funding (GATiT medical diagnosis model): This work was supported in part by the Science and Technology Innovation 2030 "New Generation of Artificial Intelligence" Major Project (No. 2021ZD0111000) and the Henan Provincial Science and Technology Research Project (No. 232102211039).
Funding: This work is supported by the Science and Technology Projects of China Southern Power Grid (YNKJXM20222402).
Abstract: Advanced carbon emission factors of a power grid can provide users with effective carbon reduction advice, which is of immense importance in mobilizing the entire society to reduce carbon emissions. The method of calculating node carbon emission factors based on the carbon emissions flow theory requires real-time parameters of a power grid. Therefore, it cannot provide carbon factor information beforehand. To address this issue, a prediction model based on the graph attention network is proposed. The model uses a graph structure that is suitable for the topology of the power grid and designs a supervised network using the loads of the grid nodes and the corresponding carbon factor data. The network extracts features and transmits information in a way better suited to the power system and can flexibly adjust the equivalent topology, thereby increasing the diversity of the structure. Its input and output data are simple and do not include the power grid parameters. We demonstrated its effect by testing the IEEE-39 bus and IEEE-118 bus systems, with average error rates of 2.46% and 2.51%, respectively.
Funding: Supported by the National Natural Science Foundation of China (Nos. 61461027, 61762059) and the Key Project of the Natural Science Foundation of Gansu Province under the Provincial Science and Technology Program (No. 22JR5RA226).
Abstract: Considering the nonlinear structure and spatial-temporal correlation of the traffic network, and the influence of potential correlation between nodes of the traffic network on the spatial features, this paper proposes a traffic speed prediction model based on the combination of a graph attention network with a self-adaptive adjacency matrix (SAdpGAT) and a bidirectional gated recurrent unit (BiGRU). First, the model introduces a graph attention network (GAT) to extract the spatial features of the real road network and the potential road network, respectively, in the spatial dimension. Second, the spatial features are input into the BiGRU to extract the time series features. Finally, the prediction results of the real road network and the potential road network are concatenated to generate the final prediction results of the model. The experimental results show that the prediction accuracy of the proposed model is clearly improved on the METR-LA and PEMS-BAY datasets, which demonstrates the advantages of the proposed spatial-temporal model in traffic speed prediction.
Funding: Supported by the National Key Research and Development Program of China (2021YFB4000505).
Abstract: Fault detection and diagnosis (FDD) plays a significant role in ensuring the safety and stability of chemical processes. With the development of artificial intelligence (AI) and big data technologies, data-driven approaches with excellent performance are widely used for FDD in chemical processes. However, improved predictive accuracy has often been achieved through increased model complexity, which turns models into black-box methods and causes uncertainty regarding their decisions. In this study, a causal temporal graph attention network (CTGAN) is proposed for fault diagnosis of chemical processes. A chemical causal graph is built by causal inference to represent the propagation path of faults. The attention mechanism and the chemical causal graph are combined to highlight the key variables related to fault fluctuations. Experiments on the Tennessee Eastman (TE) process and the green ammonia (GA) process showed that CTGAN achieved high performance and good explainability.
Funding: Funded by Yayasan UTP FRG (YUTP-FRG), grant number 015LC0-280, and the Computer and Information Science Department of Universiti Teknologi PETRONAS.
Abstract: Object detection has made a significant leap forward in recent years. However, the detection of small objects remains difficult for various reasons: such objects are very small and are susceptible to missed detection due to background noise. Additionally, small-object information is degraded by downsampling operations. Deep learning-based detection methods have been used to address the challenge posed by small objects. In this work, we propose a novel method, the Multi-Convolutional Block Attention Network (MCBAN), to increase the detection accuracy of minute objects by overcoming the information loss that occurs during downsampling. The multi-convolutional attention block (MCAB), which is made up of channel attention and a spatial attention module (SAM), has been designed to detect small objects with higher precision. We carried out experiments on the Karlsruhe Institute of Technology and Toyota Technological Institute (KITTI) and Pattern Analysis, Statistical Modelling and Computational Learning (PASCAL) Visual Object Classes (VOC) datasets and followed a step-wise process to analyze the results. The experimental results demonstrate significant gains in performance, such as 97.75% for KITTI and 88.97% for PASCAL VOC. The findings indicate that MCBAN is considerably more effective for small object detection than other existing approaches.
Funding: This work was supported in part by the National Natural Science Foundation of China (U21A2019, 61873058), the Hainan Province Science and Technology Special Fund of China (ZDYF2022SHFZ105), and the Alexander von Humboldt Foundation of Germany.
Abstract: Accurate detection of pipeline leakage is essential to maintain the safety of pipeline transportation. Recently, deep learning (DL) has emerged as a promising tool for pipeline leakage detection (PLD). However, most existing DL methods have difficulty achieving good performance in identifying leakage types due to the complex time dynamics of pipeline data. In addition, the initial parameter selection in the detection model is generally random, which may lead to unstable recognition performance. For this reason, a hybrid DL framework referred to as the parameter-optimized recurrent attention network (PRAN) is presented in this paper to improve the accuracy of PLD. First, a parameter-optimized long short-term memory (LSTM) network is introduced to extract effective and robust features; it exploits a particle swarm optimization (PSO) algorithm with a cross-entropy fitness function to search for globally optimal parameters. With this framework, the representation learning capability of the model is improved and convergence is accelerated. Moreover, an anomaly-attention mechanism (AM) is proposed to discover class-discriminative information by weighting the hidden states, which amplifies the discrepancy between normal and abnormal samples and further improves the accuracy of PLD. As a result, the proposed PRAN not only implements adaptive optimization of network parameters but also enlarges the contribution of the normal-abnormal discrepancy, thereby overcoming the drawbacks of instability and poor generalization. Finally, the experimental results demonstrate the effectiveness and superiority of the proposed PRAN for PLD.
Funding: Supported by the Key Research & Development Plan Project of Shandong Province, China (No. 2017GGX10127).
Abstract: Continuous sign language recognition (CSLR) is challenging due to the complexity of video backgrounds, hand gesture variability, and temporal modeling difficulties. This work proposes a CSLR method based on a spatial-temporal graph attention network to focus on essential features of video series. The method considers local details of sign language movements by taking the information on joints and bones as inputs and constructing a spatial-temporal graph to reflect inter-frame relevance and physical connections between nodes. The graph-based multi-head attention mechanism is used with adjacency matrix calculation for better local-feature exploration, and short-term motion correlation modeling is completed via a temporal convolutional network. We adopted BLSTM to learn the long-term dependence and connectionist temporal classification to align the word-level sequences. The proposed method achieves competitive results regarding word error rate (1.59%) on the Chinese Sign Language dataset and mean Jaccard Index (65.78%) on the ChaLearn LAP Continuous Gesture Dataset.
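The word error rate quoted above is conventionally defined as the word-level edit distance between the recognized sequence and the reference, divided by the reference length. The following is a minimal sketch of that standard definition, not the authors' evaluation script; the function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: words are separated by whitespace. The abstract does not
    say how the sign-language glosses were tokenized.
    """
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # turning ref[:i] into an empty hypothesis costs i deletions
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)
```

With this definition, a reference of four words recognized with one substitution and one deletion gives a rate of 2/4.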
Abstract: Numerous works show that existing neighbor-averaging graph neural networks (GNNs) cannot efficiently capture structure features, and many works show that injecting structure, distance, position, or spatial features can significantly improve the performance of GNNs. However, injecting high-level structure and distance into GNNs is an intuitive but unexplored idea. This work sheds light on this issue and proposes a scheme to enhance graph attention networks (GATs) by encoding distance and hop-wise structure statistics. First, the hop-wise structure and distributional distance information are extracted based on several hop-wise ego-nets of every target node. Second, the derived structure information, distance information, and intrinsic features are encoded into the same vector space and then added together to get initial embedding vectors. Third, the derived embedding vectors are fed into GATs, such as GAT and the adaptive graph diffusion network (AGDN), to get the soft labels. Fourth, the soft labels are fed into correct and smooth (C&S) to conduct label propagation and get final predictions. Experiments show that the distance and hop-wise structure encoding enhanced graph attention networks (DHSEGATs) achieve competitive results.
Abstract: The natural language to SQL (NL2SQL) task is an emerging research area that aims to transform a natural language question over a given database into an SQL query. Earlier approaches processed the input into a heterogeneous graph. However, previous models failed to distinguish the types of multi-hop connections in the heterogeneous graph and therefore tended to ignore crucial semantic path information. To this end, a two-layer attention network is presented to focus on essential neighbor nodes and mine informative semantic paths for feature encoding. A weighted edge is introduced for schema linking to connect nodes with semantic similarity. In the decoding phase, a rule-based pruning strategy is used to refine the generated SQL queries. The experimental results show that the approach learns a good encoding representation and decodes it to generate results with practical meaning.
Funding: Supported by the Ph.D. Research Startup Project of Minnan Normal University (KJ2021020), the National Natural Science Foundation of China (12090020 and 12090025), and the Zhejiang Provincial Natural Science Foundation of China (LSD19H180005).
Abstract: Accurate pancreas segmentation is critical for the diagnosis and management of diseases of the pancreas. It is challenging to precisely delineate the pancreas due to the high variation in volume, shape and location. In recent years, coarse-to-fine methods have been widely used to alleviate the class imbalance issue and improve pancreas segmentation accuracy. However, cascaded methods can be computationally intensive, and the refined results depend significantly on the performance of the coarse segmentation. To balance segmentation accuracy and computational efficiency, we propose a Discriminative Feature Attention Network for pancreas segmentation that effectively highlights pancreas features and improves segmentation accuracy without explicit pancreas localization. The final segmentation is obtained by applying a simple yet effective post-processing step. Experiments on the public NIH pancreas CT dataset and the abdominal BTCV multi-organ dataset are conducted separately to show the effectiveness of our method for 2D pancreas segmentation. We obtained an average Dice Similarity Coefficient (DSC) of 82.82 ± 6.09%, an average Jaccard Index (JI) of 71.13 ± 8.30% and an average Symmetric Average Surface Distance (ASD) of 1.69 ± 0.83 mm on the NIH dataset. Compared to existing deep learning-based pancreas segmentation methods, our experimental results achieve the best average DSC and JI values.
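The two overlap metrics reported above have standard definitions: DSC = 2|A∩B| / (|A| + |B|) and JI = |A∩B| / |A∪B| for a predicted mask A and a ground-truth mask B. The sketch below illustrates those definitions on flattened binary masks. It is not the authors' evaluation code, and the function name `dice_and_jaccard` and the convention of returning 1.0 for two empty masks are assumptions.

```python
def dice_and_jaccard(pred, truth):
    """Return (DSC, JI) for two equal-length binary masks.

    Masks are flat sequences of 0/1 (or False/True) values, for example a
    flattened 2D segmentation slice.
    """
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of pixels")

    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    pred_size = sum(1 for p in pred if p)
    truth_size = sum(1 for t in truth if t)
    union = pred_size + truth_size - intersection

    if union == 0:
        # Assumed convention: two empty masks agree perfectly.
        return 1.0, 1.0

    dice = 2 * intersection / (pred_size + truth_size)
    jaccard = intersection / union
    return dice, jaccard
```

The two metrics are monotonically related by DSC = 2·JI / (1 + JI) for a single pair of masks. Averages over a dataset, such as those quoted in the abstract, do not have to satisfy that identity exactly.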
Abstract: In computer vision, object recognition and image categorization have proven to be difficult challenges. They have, nevertheless, generated responses to a wide range of difficult issues from a variety of fields. Convolutional Neural Networks (CNNs) have recently been identified as the most widely proposed deep learning (DL) algorithms in the literature. CNNs have delivered cutting-edge results, particularly in the areas of image classification, speech recognition, and video processing. However, training a CNN demands a large amount of data, which is in short supply, especially in the medical field, and as a result the training process takes longer. In this paper, we describe an attention-aware CNN architecture for classifying chest X-ray images to diagnose pneumonia in order to address the aforementioned difficulties. Attention Modules provide attention-aware properties to the Attention Network. The attention-aware features of the various modules change as the layers become deeper. Using a bottom-up top-down feedforward structure, the feedforward and feedback attention processes are integrated into a single feedforward process inside each attention module. In the present work, a deep neural network (DNN) is combined with an attention mechanism to test the prediction of pneumonia using chest X-ray images. To produce attention-aware features, the suggested network was built by merging channel and spatial attention modules in a DNN architecture. With this network, we worked on a publicly available Kaggle chest X-ray dataset. Extensive testing was carried out to validate the suggested model. In the experimental results, we attained an accuracy of 95.47% and an F-score of 0.92, indicating that the suggested model outperformed the baseline models.
Funding: Supported by the National Key Research and Development Program of China under Grant No. 2018YFE0206900, the National Natural Science Foundation of China under Grant No. 61871440, and the CAAI-Huawei MindSpore Open Fund.
Abstract: Tumour segmentation in medical images (especially 3D tumour segmentation) is highly challenging due to the possible similarity between tumours and adjacent tissues, the occurrence of multiple tumours, and variable tumour shapes and sizes. Popular deep learning-based segmentation algorithms generally rely on the convolutional neural network (CNN) and the Transformer. The former cannot extract global image features effectively, while the latter lacks inductive bias and involves complicated computation for 3D volume data. Existing hybrid CNN-Transformer networks provide only limited performance improvement or even poorer segmentation performance than a pure CNN. To address these issues, a short-term and long-term memory self-attention network is proposed. First, a distinctive self-attention block uses the Transformer to explore the correlation among the region features at different levels extracted by the CNN. Then, the memory structure filters and combines this information to exclude similar regions and detect multiple tumours. Finally, the multi-layer reconstruction blocks predict the tumour boundaries. Experimental results demonstrate that our method outperforms other methods in terms of subjective visual and quantitative evaluation. Compared with the most competitive method, the proposed method provides Dice (82.4% vs. 76.6%) and Hausdorff distance 95% (HD95) (10.66 vs. 11.54 mm) on KiTS19 as well as Dice (80.2% vs. 78.4%) and HD95 (9.632 vs. 12.17 mm) on LiTS.
Funding: This work was supported in part by the audio-visual new media laboratory operation and maintenance project of the Academy of Broadcasting Science (Grant No. 200304) and in part by the National Key Research and Development Program of China (Grant No. 2019YFB1406201).
Abstract: Referring expression comprehension is the task of locating the image region described by a natural language expression, which refers to the properties of the region or its relationships with other regions. Most previous work handles this problem by selecting the most relevant regions from a set of candidate regions; these methods are inefficient when the set contains many candidates. Inspired by the recent success of deep learning methods in image captioning, in this paper we propose a framework that understands referring expressions through multiple steps of reasoning. We present a model for referring expression comprehension that selects the most relevant region directly from the image. The core of our model is a recurrent attention network, which can be seen as an extension of the Memory Network. The proposed model is capable of improving its results through multiple computational hops. We evaluate the proposed model on two referring expression datasets: Visual Genome and Flickr30k Entities. The experimental results demonstrate that the proposed model outperforms previous state-of-the-art methods in both accuracy and efficiency. We also conduct an ablation experiment showing that the performance of the model does not improve as the number of attention layers increases.
Funding: Supported by the National Natural Science Foundation of China (52274205) and the Project of the Education Department of Liaoning Province (LJKZ0338).
Abstract: Automatic text summarization (ATS) plays a significant role in Natural Language Processing (NLP). Abstractive summarization produces summaries by identifying and compressing the most important information in a document. However, relatively few comprehensively evaluated abstractive summarization models work well for specific types of reports, owing to their unstructured and colloquial text characteristics. In particular, Chinese complaint reports, generated by urban complainants and collected by government employees, describe problems that residents face in daily life. The reported problems also require a speedy response. Therefore, automatic summarization tasks for these reports have been developed. However, as with traditional summarization models, the generated summaries still suffer from problems of informativeness and conciseness. To address these issues and generate suitably informative and less redundant summaries, a topic-based abstractive summarization method is proposed to obtain global and local features. Additionally, a heterogeneous graph of the original document is constructed using word-level and topic-level features. Experiments and analyses on public review datasets (Yelp and Amazon) and our constructed dataset (Chinese complaint reports) show that the proposed framework effectively improves the performance of the abstractive summarization model for Chinese complaint reports.
Funding: The research project is partially supported by the National Natural Science Foundation of China under Grant Nos. 62072015, U19B2039, and U1811463, and by the National Key R&D Program of China (2018YFB1600903).
Abstract: PM2.5 concentration prediction is of great significance to environmental protection and human health. Achieving accurate prediction of PM2.5 concentration has become an important research task. However, PM2.5 pollutants can spread in the earth's atmosphere, causing mutual influence between different cities. To effectively capture the air pollution relationship between cities, this paper proposes a novel spatiotemporal model combining a graph attention neural network (GAT) and a gated recurrent unit (GRU), named GAT-GRU, for PM2.5 concentration prediction. Specifically, GAT is used to learn the spatial dependence of PM2.5 concentration data in different cities, and GRU is used to extract the temporal dependence of the long-term data series. The proposed model integrates the learned spatio-temporal dependencies to capture long-term complex spatio-temporal features. Considering that air pollution is related to the meteorological conditions of the city, the knowledge acquired from meteorological data is used in the model to enhance PM2.5 prediction performance. The input of the GAT-GRU model consists of PM2.5 concentration data and meteorological data. To verify the effectiveness of the proposed GAT-GRU prediction model, this paper designs experiments on real-world datasets and compares the model with other baselines. Experimental results show that our model achieves excellent performance in PM2.5 concentration prediction.
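As background for the spatial step described above, the attention coefficients of a standard single-head graph attention layer can be sketched as follows. This follows the original GAT formulation, e_ij = LeakyReLU(a · [W h_i ‖ W h_j]) normalized by a softmax over the neighbors of node i. It is not the GAT-GRU authors' implementation, and the function names and the treatment of each node as a city are assumptions for illustration only.

```python
import math


def leaky_relu(x, slope=0.2):
    """LeakyReLU with the negative slope used in the original GAT paper."""
    return x if x > 0 else slope * x


def gat_attention(features, neighbors, weight, attn_vec, i):
    """Attention coefficients alpha_ij of node i over its neighbors.

    features : list of node feature vectors (one per city, hypothetically)
    neighbors: dict mapping a node index to the node indices it attends to
    weight   : shared projection matrix W, given as a list of rows
    attn_vec : attention vector a, of length 2 * len(weight)
    """
    def project(vec):
        # Matrix-vector product W @ vec.
        return [sum(w * x for w, x in zip(row, vec)) for row in weight]

    z_i = project(features[i])
    scores = {}
    for j in neighbors[i]:
        z_j = project(features[j])
        # a . [W h_i || W h_j], followed by LeakyReLU.
        raw = sum(a * z for a, z in zip(attn_vec, z_i + z_j))
        scores[j] = leaky_relu(raw)

    # Numerically stable softmax over the neighborhood of node i.
    top = max(scores.values())
    exps = {j: math.exp(s - top) for j, s in scores.items()}
    total = sum(exps.values())
    return {j: e / total for j, e in exps.items()}
```

The updated feature of node i would then be the alpha-weighted sum of the projected neighbor features. In a spatiotemporal model of the kind the abstract describes, a sequence of such per-time-step outputs is what a recurrent unit would consume.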
Funding: This paper is partially supported by the Open Fund for Jiangsu Key Laboratory of Advanced Manufacturing Technology (HGAMTL-1703); Guangxi Key Laboratory of Trusted Software (kx201901); Fundamental Research Funds for the Central Universities (CDLS-2020-03); Key Laboratory of Child Development and Learning Science (Southeast University), Ministry of Education; Royal Society International Exchanges Cost Share Award, UK (RP202G0230); Medical Research Council Confidence in Concept Award, UK (MC_PC_17171); Hope Foundation for Cancer Research, UK (RM60G0680); British Heart Foundation Accelerator Award, UK.
Abstract: Aim: To diagnose COVID-19 more efficiently and more accurately, this study proposed a novel attention network for COVID-19 (ANC). Methods: Two datasets were used in this study. An 18-way data augmentation was proposed to avoid overfitting. Then, the convolutional block attention module (CBAM) was integrated into our model, the structure of which was fine-tuned. Finally, Grad-CAM was used to provide an explainable diagnosis. Results: The accuracies of our ANC method on the two datasets are 96.32% ± 1.06% and 96.00% ± 1.03%, respectively. Conclusions: The proposed ANC method is superior to 9 state-of-the-art approaches.
Funding: Supported by the Postgraduate Scientific Research Innovation Project of Hunan Province under Grant QL20210212 and the Scientific Innovation Fund for Postgraduates of Central South University of Forestry and Technology under Grant CX202102043.
Abstract: In the smart logistics industry, unmanned forklifts that intelligently identify logistics pallets can improve work efficiency in warehousing and transportation and outperform traditional forklifts driven by humans. They therefore play a critical role in smart warehousing, and semantic segmentation is an effective method for the intelligent identification of logistics pallets. However, most current recognition algorithms are ineffective due to the diverse types of pallets, their complex shapes, frequent occlusions in production environments, and changing lighting conditions. This paper proposes a novel multi-feature fusion-guided multiscale bidirectional attention (MFMBA) neural network for logistics pallet segmentation. To better predict the foreground category (the pallet) and the background category (the cargo) of a pallet image, our approach extracts three types of features (grayscale, texture, and Hue, Saturation, Value features) and fuses them. In a real, complex environment, the size and shape of the pallet may appear different in the image, which usually makes feature extraction difficult; our study addresses this with a multiscale architecture that can extract additional semantic features. Also, since a traditional attention mechanism only assigns attention weights from a single direction, we designed a bidirectional attention mechanism that assigns cross-attention weights to each feature from two directions, horizontally and vertically, significantly improving segmentation. Finally, comparative experimental results show that the precision of the proposed algorithm is 0.53%–8.77% better than that of the other methods we compared.
Abstract: Document images often contain various page components and complex logical structures, which make the document layout analysis task challenging. Most deep learning-based document layout analysis methods adopt convolutional neural networks (CNNs) as the feature extraction networks. In this paper, a hybrid spatial-channel attention network (HSCA-Net) is proposed to improve feature extraction capability by introducing an attention mechanism to explore more salient properties within document pages. The HSCA-Net consists of a spatial attention module (SAM), a channel attention module (CAM), and a designed lateral attention connection. CAM adaptively adjusts channel feature responses by emphasizing selective information, depending on the contribution of the features of each channel. SAM guides CNNs to focus on the informative contents and capture global context information among page objects. The lateral attention connection incorporates SAM and CAM into a multiscale feature pyramid network and thus retains the original feature information. The effectiveness and adaptability of HSCA-Net are evaluated through multiple experiments on publicly available datasets such as PubLayNet, ICDAR-POD, and Article Regions. Experimental results demonstrate that HSCA-Net achieves state-of-the-art performance on the document layout analysis task.
Funding: Supported by the open fund (PLN2021-23) of the National Key Laboratory of Oil and Gas Reservoir Geology and Exploitation (Southwest Petroleum University).
Abstract: The identification and recording of drilling conditions are crucial for ensuring drilling safety and efficiency. However, the traditional approach of relying on the subjective determination of drilling masters based on empirical formulas is slow and not suitable for rapid drilling. In this paper, we propose a drilling condition classification method based on a neural network model. The model uses an improved Bidirectional Gated Recurrent Unit (BiGRU) combined with an attention mechanism to classify seven common drilling conditions simultaneously, achieving an average accuracy of 91.63%. The model also demonstrates excellent generalization ability, real-time performance, and accuracy, making it suitable for actual production. Additionally, the model has excellent expandability, which enhances its potential for further application.
Funding: The authors would like to thank the National Key Research and Development Program of China (Nos. 2020YFE0201200, 2019YFC1509100) for financial support, and acknowledge partial support from the Youth Program of the Natural Science Foundation of China (No. 41706010) and the Fundamental Research Funds for the Central Universities (No. 202264002).
Abstract: Currently, numerical models based on idealized assumptions, complex algorithms and high computational costs are unsatisfactory for ocean surface current prediction. Moreover, the complex temporal and spatial variability of ocean currents also makes prediction methods based on time series data challenging. A deep network model can automatically learn and extract complex features hidden in large amounts of complex data, so it is a promising method for high-quality prediction of ocean currents. In this paper, we propose a spatiotemporal coupled attention deep network model, STCANet, that can extract abundant temporal and spatial coupling information on the behavior characteristics of ocean currents to improve prediction accuracy. First, a Spatial Module is designed and implemented to extract the spatiotemporal coupling characteristics of ocean currents, while the spatial correlations and dependencies among adjacent sea areas are obtained through a Spatial Channel Attention Module (SCAM). Second, we use the Gated Recurrent Unit (GRU) to extract temporal relationships of ocean currents, and design and implement a nearest-neighbor time attention module to extract the interdependences of ocean currents between adjacent times, which further improves the accuracy of ocean current prediction. Finally, a series of comparative experiments on the MediSea_Dataset and EastSea_Dataset showed that the prediction quality of our model greatly outperforms that of other benchmark models such as History Average (HA), Autoregressive Integrated Moving Average Model (ARIMA), Long Short-term Memory (LSTM), Gated Recurrent Unit (GRU) and CNN_GRU.