Graph Convolutional Neural Networks (GCNs) have been widely used in various fields due to their powerful capabilities in processing graph-structured data. However, GCNs encounter significant challenges when applied to scale-free graphs with power-law distributions, resulting in substantial distortions. Moreover, most existing GCN models are shallow structures, which restricts their ability to capture dependencies among distant nodes and more refined high-order node features in scale-free graphs with hierarchical structures. To apply GCNs more broadly and precisely to real-world graphs exhibiting scale-free or hierarchical structures, and to utilize the multi-level aggregation of GCNs for capturing high-level information in local representations, we propose the Hyperbolic Deep Graph Convolutional Neural Network (HDGCNN), an end-to-end deep graph representation learning framework that maps scale-free graphs from Euclidean space to hyperbolic space. In HDGCNN, we define the fundamental operations of deep graph convolutional neural networks in hyperbolic space. Additionally, we introduce a hyperbolic feature transformation method based on identity mapping and a dense connection scheme based on a novel non-local message passing framework. We also present a neighborhood aggregation method that combines initial structural features with hyperbolic attention coefficients. Through these methods, HDGCNN effectively leverages both the structural features and node features of graph data, enabling enhanced exploration of non-local structural features and more refined node features in scale-free or hierarchical graphs. Experimental results demonstrate that HDGCNN achieves remarkable performance improvements over state-of-the-art GCNs in node classification and link prediction tasks, even when utilizing low-dimensional embedding representations. Furthermore, compared with shallow hyperbolic graph convolutional neural network models, HDGCNN exhibits notable advantages and performance enhancements.
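The Euclidean-to-hyperbolic mapping that a framework like HDGCNN builds on is usually realized with the exponential and logarithmic maps at the origin of the Poincaré ball. The sketch below shows only these two maps; the function names, the curvature parameter c, and the clamping constants are illustrative assumptions, not the paper's implementation.

```python
import torch

def exp_map_zero(v, c=1.0, eps=1e-7):
    # Map a Euclidean tangent vector v at the origin onto the Poincare ball of curvature -c.
    sqrt_c = c ** 0.5
    norm = v.norm(dim=-1, keepdim=True).clamp_min(eps)
    return torch.tanh(sqrt_c * norm) * v / (sqrt_c * norm)

def log_map_zero(y, c=1.0, eps=1e-7):
    # Inverse map: take a point y inside the ball back to the tangent space at the origin.
    sqrt_c = c ** 0.5
    norm = y.norm(dim=-1, keepdim=True).clamp_min(eps)
    return torch.atanh((sqrt_c * norm).clamp(max=1 - 1e-5)) * y / (sqrt_c * norm)

# Example: move Euclidean node features into hyperbolic space and (approximately) back.
x = 0.1 * torch.randn(5, 16)          # 5 nodes with 16-dimensional Euclidean features
h = exp_map_zero(x, c=1.0)            # hyperbolic embeddings inside the unit ball
x_rec = log_map_zero(h, c=1.0)        # recovers x up to numerical error
```

In hyperbolic GCNs of this kind, feature transformation and aggregation are typically carried out in the tangent space between these two maps.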
The shale gas development process is complex in terms of its flow mechanisms, and the accuracy of production forecasting is influenced by geological parameters and engineering parameters. Therefore, to quantitatively evaluate the relative importance of model parameters on production forecasting performance, sensitivity analysis of the parameters is required. The parameters are ranked according to their sensitivity coefficients for the subsequent optimization scheme design. A data-driven global sensitivity analysis (GSA) method using convolutional neural networks (CNN) is proposed to identify the influencing parameters in shale gas production. The CNN is trained on a large dataset, validated against numerical simulations, and utilized as a surrogate model for efficient sensitivity analysis. Our approach integrates the CNN with the Sobol' global sensitivity analysis method, presenting three key scenarios for sensitivity analysis: analysis of the production stage as a whole, analysis by fixed time intervals, and analysis by declining rate. The findings underscore the predominant influence of reservoir thickness and well length on shale gas production. Furthermore, the temporal sensitivity analysis reveals the dynamic shifts in parameter importance across the distinct production stages.
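As a concrete illustration of the Sobol'-based workflow (not the paper's code), the sketch below estimates first-order Sobol' indices with a Saltelli-style Monte Carlo estimator, using a simple analytic function as a stand-in for the trained CNN surrogate; the parameter names, bounds, and sample size are assumptions.

```python
import numpy as np

def surrogate(x):
    # Stand-in for the trained CNN surrogate: maps (thickness, well length, a third
    # parameter) to a production value. Purely illustrative.
    return 3.0 * x[:, 0] + 2.0 * x[:, 1] + 0.5 * x[:, 1] * x[:, 2]

def sobol_first_order(f, bounds, n=4096, seed=0):
    # Monte Carlo estimate of first-order Sobol' indices (Saltelli-style estimator).
    rng = np.random.default_rng(seed)
    d = len(bounds)
    lo, hi = np.array(bounds, dtype=float).T
    A = lo + (hi - lo) * rng.random((n, d))
    B = lo + (hi - lo) * rng.random((n, d))
    fA, fB = f(A), f(B)
    var = np.var(np.concatenate([fA, fB]))
    S = np.empty(d)
    for i in range(d):
        ABi = A.copy()
        ABi[:, i] = B[:, i]               # resample only parameter i
        S[i] = np.mean(fB * (f(ABi) - fA)) / var
    return S

bounds = [(10, 80), (500, 3000), (0.01, 0.1)]   # e.g. thickness, well length, another factor
print(sobol_first_order(surrogate, bounds))
```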
Deep neural network-based relation extraction research has made significant progress in recent years, and it provides data support for many natural language processing downstream tasks such as building knowledge graphs, sentiment analysis, and question-answering systems. However, previous studies ignored much unused structural information in sentences that could enhance the performance of the relation extraction task. Moreover, most existing dependency-based models utilize self-attention to distinguish the importance of context, which hardly deals with multiple-structure information. To efficiently leverage multiple-structure information, this paper proposes a dynamic structure attention mechanism model based on textual structure information, which deeply integrates word embeddings, named entity recognition labels, part of speech, the dependency tree, and dependency types into a graph convolutional network. Specifically, our model extracts text features of different structures from the input sentence. Textual Structure information Graph Convolutional Networks employs the dynamic structure attention mechanism to learn multi-structure attention, effectively distinguishing important contextual features in the various structural information. In addition, multi-structure weights are carefully designed as a merging mechanism in the different structure attentions to dynamically adjust the final attention. This paper combines these features and trains a graph convolutional network for relation extraction. We experiment on supervised relation extraction datasets including SemEval 2010 Task 8, TACRED, TACREV, and Re-TACRED, and the results significantly outperform previous work.
A significant advantage of medical image processing is that it allows non-invasive exploration of internal anatomy in great detail. It is possible to create and study 3D models of anatomical structures to improve treatment outcomes, develop more effective medical devices, or arrive at a more accurate diagnosis. This paper aims to present a fused evolutionary algorithm that takes advantage of both whale optimization and bacterial foraging optimization to optimize feature extraction. The classification process was conducted with the aid of a convolutional neural network (CNN) with dual graphs. Evaluation of the performance of the fused model is carried out with various methods. From the initial input Computed Tomography (CT) images, 150 images are pre-processed and segmented to identify cancerous and non-cancerous nodules. The geometrical, statistical, structural, and texture features are extracted from the preprocessed segmented image using various methods such as the Gray-level co-occurrence matrix (GLCM), Histogram-oriented gradient features (HOG), and the Gray-level dependence matrix (GLDM). To select the optimal features, a novel fusion approach known as Whale-Bacterial Foraging Optimization is proposed. For the classification of lung cancer, dual graph convolutional neural networks have been employed. A comparison of classification algorithms and optimization algorithms has been conducted. According to the evaluated results, the proposed fused algorithm is successful with an accuracy of 98.72% in predicting lung tumors, and it outperforms other conventional approaches.
Graph convolutional networks (GCNs) have received significant attention from various research fields due to the excellent performance in learning graph representations. Although GCN performs well compared with other methods, it still faces challenges. Training a GCN model for large-scale graphs in a conventional way requires high computation and storage costs. Therefore, motivated by an urgent need in terms of efficiency and scalability in training GCN, sampling methods have been proposed and achieved a significant effect. In this paper, we categorize sampling methods based on the sampling mechanisms and provide a comprehensive survey of sampling methods for efficient training of GCN. To highlight the characteristics and differences of sampling methods, we present a detailed comparison within each category and further give an overall comparative analysis for the sampling methods in all categories. Finally, we discuss some challenges and future research directions of the sampling methods.
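One of the sampling families surveyed in this line of work is node-wise (GraphSAGE-style) neighbor sampling; the sketch below shows the basic idea of capping each node's neighborhood at a fixed fanout per layer. The function name, the toy graph, and the fanouts are illustrative assumptions.

```python
import numpy as np

def sample_neighbors(adj_list, batch_nodes, fanouts, seed=0):
    # Node-wise neighbor sampling: for each GCN layer, keep at most `fanout` random
    # neighbors per node instead of the full neighborhood (GraphSAGE-style).
    rng = np.random.default_rng(seed)
    layers = [np.asarray(batch_nodes)]
    for fanout in fanouts:                       # one fanout per GCN layer
        frontier = []
        for v in layers[-1]:
            neigh = adj_list[v]
            if len(neigh) > fanout:
                neigh = rng.choice(neigh, size=fanout, replace=False)
            frontier.extend(neigh)
        layers.append(np.unique(np.concatenate([layers[-1], frontier])))
    return layers                                # nodes needed at each hop for the mini-batch

# Toy graph given as an adjacency list over 6 nodes.
adj_list = {0: [1, 2], 1: [0, 3], 2: [0, 3, 4], 3: [1, 2, 5], 4: [2], 5: [3]}
print(sample_neighbors(adj_list, batch_nodes=[0, 5], fanouts=[2, 2]))
```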
Recommendation Information Systems (RIS) are pivotal in helping users swiftly locate desired content from the vast amount of information available on the Internet. Graph Convolution Network (GCN) algorithms have been employed to implement RIS efficiently. However, the GCN algorithm faces limitations in terms of performance enhancement owing to the embedding value-vanishing problem that occurs during the learning process. To address this issue, we propose a Weighted Forwarding method using the GCN (WF-GCN) algorithm. The proposed method involves multiplying the embedding results by different weights for each hop layer during graph learning. By applying the WF-GCN algorithm, which adjusts the weights for each hop layer before forwarding to the next, nodes with many neighbors achieve higher embedding values. This approach facilitates the learning of more hop layers within the GCN framework. The efficacy of WF-GCN was demonstrated through its application to various datasets. On the MovieLens dataset, the implementation of WF-GCN in LightGCN resulted in significant performance improvements, with recall and NDCG increasing by up to +163.64% and +132.04%, respectively. Similarly, on the Last.FM dataset, LightGCN enhanced with WF-GCN showed substantial improvements, with the recall and NDCG metrics rising by up to +174.40% and +169.95%, respectively. Furthermore, the application of WF-GCN to Self-supervised Graph Learning (SGL) and Simple Graph Contrastive Learning (SimGCL) also demonstrated notable enhancements in both recall and NDCG across these datasets.
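A minimal sketch of the per-hop weighting idea, assuming a LightGCN-style propagation over a normalized adjacency and a mean combination of hop embeddings; the specific weights, the combination rule, and the function name are assumptions rather than the WF-GCN implementation.

```python
import torch

def weighted_hop_propagation(norm_adj, emb0, hop_weights):
    # Propagate embeddings through several hops, scaling each hop's output by its own
    # weight before forwarding it to the next hop, then combine all hop embeddings.
    embs, h = [emb0], emb0
    for w in hop_weights:                  # one weight per hop layer
        h = w * (norm_adj @ h)
        embs.append(h)
    return torch.stack(embs).mean(dim=0)   # LightGCN-style layer combination

# Toy example: 4 nodes, 8-dim embeddings, symmetrically normalized adjacency.
A = torch.tensor([[0., 1., 1., 0.], [1., 0., 0., 1.], [1., 0., 0., 1.], [0., 1., 1., 0.]])
d_inv_sqrt = A.sum(1).pow(-0.5)
norm_adj = d_inv_sqrt.unsqueeze(1) * A * d_inv_sqrt.unsqueeze(0)
emb0 = torch.randn(4, 8)
print(weighted_hop_propagation(norm_adj, emb0, hop_weights=[1.0, 0.8, 0.6]).shape)
```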
Traffic flow prediction is an important part of the intelligent transportation system. Accurate multi-step traffic flow prediction plays an important role in improving the operational efficiency of the traffic network. Since traffic flow data has complex spatio-temporal correlation and non-linearity, existing prediction methods are mainly accomplished through a combination of a Graph Convolutional Network (GCN) and a recurrent neural network. This combination strategy performs well in traffic prediction tasks, but multi-step prediction error accumulates with the predicted step size. Some scholars use multiple sampling sequences to achieve more accurate prediction results, but this requires demanding hardware and multiplies the training time. Considering the spatiotemporal correlation of traffic flow and the influence of external factors, we propose an Attention Based Spatio-Temporal Graph Convolutional Network considering External Factors (ABSTGCN-EF) for multi-step traffic flow prediction. This model treats traffic flow as diffusion on a digraph and extracts the spatial characteristics of traffic flow through the GCN. We add meaningful time-slot attention to the encoder-decoder to form an Attention Encoder Network (AEN) that handles temporal correlation. The attention vector is used as a competitive choice to draw the correlation between predicted states and historical states. We also consider the impact of three external factors (daytime, weekdays, and traffic accident markers) on the traffic flow prediction task. Experiments on two public data sets show that it makes sense to consider external factors. The prediction performance of our ABSTGCN-EF model is 7.2%–8.7% higher than that of the state-of-the-art baselines.
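The "diffusion on a digraph" component can be illustrated with a K-step diffusion convolution over forward and backward random-walk transition matrices (DCRNN-style); the theta coefficients, the toy graph, and the function name below are assumptions, and the attention encoder and external factors are not shown.

```python
import numpy as np

def diffusion_conv(W, X, thetas_fwd, thetas_bwd):
    # K-step diffusion convolution on a directed graph: forward diffusion uses the
    # out-degree-normalized transition matrix, backward diffusion its reverse.
    P_fwd = W / W.sum(axis=1, keepdims=True).clip(min=1e-12)
    P_bwd = W.T / W.T.sum(axis=1, keepdims=True).clip(min=1e-12)
    out = np.zeros_like(X)
    Hf, Hb = X.copy(), X.copy()
    for tf, tb in zip(thetas_fwd, thetas_bwd):   # one pair of coefficients per step
        out += tf * Hf + tb * Hb
        Hf, Hb = P_fwd @ Hf, P_bwd @ Hb
    return out

# Toy directed graph with 3 nodes and 4-dimensional node signals.
W = np.array([[0.0, 1.0, 0.0], [0.0, 0.0, 2.0], [1.0, 0.0, 0.0]])
X = np.random.default_rng(0).normal(size=(3, 4))
print(diffusion_conv(W, X, thetas_fwd=[0.5, 0.3], thetas_bwd=[0.5, 0.3]).shape)
```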
The ever-growing volume of available visual data (i.e., videos and pictures uploaded by internet users) has attracted the research community's attention in the computer vision field. Therefore, finding efficient solutions to extract knowledge from these sources is imperative. Recently, the BlazePose system, oriented to mobile devices, was released for skeleton extraction from images. With this skeleton graph representation in place, a Spatial-Temporal Graph Convolutional Network can be implemented to predict the action. We hypothesize that just by changing the skeleton input data to a different set of joints that offers more information about the action of interest, it is possible to increase the performance of the Spatial-Temporal Graph Convolutional Network for human action recognition (HAR) tasks. Hence, in this study, we present the first implementation of the BlazePose skeleton topology upon this architecture for action recognition. Moreover, we propose the Enhanced-BlazePose topology, which can achieve better results than its predecessor. Additionally, we propose different skeleton detection thresholds that can improve the accuracy performance even further. We reached a top-1 accuracy of 40.1% on the Kinetics dataset. For the NTU-RGB+D dataset, we achieved 87.59% and 92.1% accuracy for the Cross-Subject and Cross-View evaluation criteria, respectively.
As the scale of the power system continues to expand, the environment for power operations becomes more and more complex. Existing risk management and control methods for power operations can only set the same risk detection standard and conduct the risk detection for any scenario indiscriminately. Therefore, more reliable and accurate security control methods are urgently needed. In order to improve the accuracy and reliability of the operation risk management and control method, this paper proposes a method for identifying the key links in the whole process of electric power operation based on the spatiotemporal hybrid convolutional neural network. To provide early warning and control of targeted risks, first, the video stream is framed adaptively according to the pixel changes in the video stream. Then, the optimized MobileNet is used to extract the feature map of the video stream, which contains both time-series and static spatial scene information. The feature maps are combined and non-linearly mapped to realize the identification of dynamic operating scenes. Finally, training samples and test samples are produced by using the whole-process images of a power company in Xinjiang as a case study, and the proposed algorithm is compared with the unimproved MobileNet. The experimental results demonstrated that the method proposed in this paper can accurately identify the type and start and end time of each operation link in the whole process of electric power operation, and has good real-time performance. The average accuracy of the algorithm can reach 87.8%, and the frame rate is 61 frames/s, which is of great significance for improving the reliability and accuracy of security control methods.
The ability to accurately predict urban traffic flows is crucial for optimising city operations. Consequently, various methods for forecasting urban traffic have been developed, focusing on analysing historical data to understand complex mobility patterns. Deep learning techniques, such as graph neural networks (GNNs), are popular for their ability to capture spatio-temporal dependencies. However, these models often become overly complex due to the large number of hyper-parameters involved. In this study, we introduce Dynamic Multi-Graph Spatial-Temporal Graph Neural Ordinary Differential Equation Networks (DMST-GNODE), a framework based on ordinary differential equations (ODEs) that autonomously discovers effective spatial-temporal graph neural network (STGNN) architectures for traffic prediction tasks. The comparative analysis of DMST-GNODE and baseline models indicates that the DMST-GNODE model demonstrates superior performance across multiple datasets, consistently achieving the lowest Root Mean Square Error (RMSE) and Mean Absolute Error (MAE) values, alongside the highest accuracy. On the BKK (Bangkok) dataset, it outperformed other models with an RMSE of 3.3165 and an accuracy of 0.9367 for a 20-min interval, maintaining this trend across 40 and 60 min. Similarly, on the PeMS08 dataset, DMST-GNODE achieved the best performance with an RMSE of 19.4863 and an accuracy of 0.9377 at 20 min, demonstrating its effectiveness over longer periods. The Los_Loop dataset results further emphasise this model's advantage, with an RMSE of 3.3422 and an accuracy of 0.7643 at 20 min, consistently maintaining superiority across all time intervals. These numerical highlights indicate that DMST-GNODE not only outperforms baseline models but also achieves higher accuracy and lower errors across different time intervals and datasets.
Owing to the expansion of the grid interconnection scale, the spatiotemporal distribution characteristics of the frequency response of power systems after the occurrence of disturbances have become increasingly important. These characteristics can provide effective support in coordinated security control. However, traditional model-based frequency-prediction methods cannot satisfactorily meet the requirements of online applications owing to the long calculation time and the need for accurate power-system models. Therefore, this study presents a rolling frequency-prediction model based on a graph convolutional network (GCN) and a long short-term memory (LSTM) spatiotemporal network, named STGCN-LSTM. In the proposed method, the measurement data from phasor measurement units after the occurrence of disturbances are used to construct the spatiotemporal input. An improved GCN embedded with topology information is used to extract the spatial features, while the LSTM network is used to extract the temporal features. The spatiotemporal-network-regression model is further trained, and asynchronous-frequency-sequence prediction is realized by utilizing the rolling update of measurement information. The proposed spatiotemporal-network-based prediction model can achieve accurate frequency prediction by considering the spatiotemporal distribution characteristics of the frequency response. The noise immunity and robustness of the proposed method are verified on the IEEE 39-bus and IEEE 118-bus systems.
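A minimal sketch of the GCN-then-LSTM pattern, assuming a fixed normalized adjacency and a pooling step over buses before the recurrent layer; module names, dimensions, and the pooling choice are illustrative and not the STGCN-LSTM architecture.

```python
import torch
import torch.nn as nn

class GCNLSTMSketch(nn.Module):
    # Per time step, one graph convolution extracts spatial features from PMU-like
    # measurements; an LSTM then models the temporal evolution across the window.
    def __init__(self, norm_adj, in_dim, gcn_dim, lstm_dim, out_dim):
        super().__init__()
        self.register_buffer("norm_adj", norm_adj)   # normalized adjacency (topology info)
        self.gcn_lin = nn.Linear(in_dim, gcn_dim)
        self.lstm = nn.LSTM(input_size=gcn_dim, hidden_size=lstm_dim, batch_first=True)
        self.head = nn.Linear(lstm_dim, out_dim)

    def forward(self, x):                            # x: (batch, time, buses, in_dim)
        h = torch.relu(self.norm_adj @ self.gcn_lin(x))   # spatial aggregation per step
        h = h.mean(dim=2)                            # pool over buses -> (batch, time, gcn_dim)
        out, _ = self.lstm(h)                        # temporal modelling
        return self.head(out[:, -1])                 # prediction from the last hidden state

adj = torch.eye(5) + torch.rand(5, 5)                # illustrative 5-bus "topology"
adj = adj / adj.sum(1, keepdim=True)
model = GCNLSTMSketch(adj, in_dim=3, gcn_dim=16, lstm_dim=32, out_dim=1)
print(model(torch.randn(2, 10, 5, 3)).shape)         # -> torch.Size([2, 1])
```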
Clothing parsing, also known as clothing image segmentation, is the problem of assigning a clothing category label to each pixel in clothing images. To address the lack of positional and global priors in existing clothing parsing algorithms, this paper proposes an enhanced positional attention module (EPAM) to collect positional information in the vertical direction of each pixel, and an efficient global prior module (GPM) to aggregate contextual information from different sub-regions. The EPAM- and GPM-based residual network (EG-ResNet) can effectively exploit the intrinsic features of clothing images while capturing information between different scales and sub-regions. Experimental results show that the proposed EG-ResNet achieves promising performance in clothing parsing of the colorful fashion parsing dataset (CFPD) (51.12% mean Intersection over Union (mIoU) and 92.79% pixel-wise accuracy (PA)) compared with other state-of-the-art methods.
Scene recognition is a fundamental task in computer vision, which generally includes three vital stages, namely feature extraction, feature transformation, and classification. Early research mainly focuses on feature extraction, but with the rise of Convolutional Neural Networks (CNNs), more and more feature transformation methods have been proposed based on CNN features. In this work, a novel feature transformation algorithm called Graph Encoded Local Discriminative Region Representation (GEDRR) is proposed to find discriminative local representations for scene images and explore the relationship between the discriminative regions. In addition, we propose a method using the multi-head attention module to enhance and fuse convolutional feature maps. Combining the two methods and the global representation, a scene recognition framework called Global and Graph Encoded Local Discriminative Region Representation (G2ELDR2) is proposed. The experimental results on three scene datasets demonstrate the effectiveness of our model, which outperforms many state-of-the-art methods.
Action recognition has been recognized as an activity in which individuals' behaviour can be observed. Assembling profiles of regular activities such as activities of daily living can support identifying trends in the data during critical events. A skeleton representation of the human body has been proven to be effective for this task. The skeletons are presented in a graph-like form. However, the topology of a graph is not structured like Euclidean-based data. Therefore, a new set of methods to perform the convolution operation upon the skeleton graph is proposed. Our proposal is based on the Spatial Temporal-Graph Convolutional Network (ST-GCN) framework. In this study, we propose an improved set of label mapping methods for the ST-GCN framework. We introduce three split techniques (full distance split, connection split, and index split) as an alternative approach for the convolution operation. The experiments presented in this study were trained using two benchmark datasets, NTU-RGB+D and Kinetics, to evaluate the performance. Our results indicate that our split techniques outperform the previous partition strategies and are more stable during training without using the edge importance weighting additional training parameter. Therefore, our proposal can provide a more realistic solution for real-time applications centred on recognition of activities of daily living in indoor environments.
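For context, the sketch below shows the classic distance-based spatial partitioning used in ST-GCN (root, centripetal, and centrifugal subsets), not the full distance, connection, or index splits introduced here; the function name, the toy skeleton, and the handling of equal-distance neighbours are assumptions.

```python
import numpy as np

def distance_partition(A, center=0):
    # Partition each joint's 1-hop neighbourhood into three subsets (self, centripetal,
    # centrifugal) based on hop distance to a chosen centre joint, ST-GCN style.
    # Equal-distance neighbours are grouped with the centrifugal set here for simplicity.
    n = A.shape[0]
    dist = np.full(n, np.inf)          # breadth-first hop distance from the centre joint
    dist[center] = 0
    frontier = [center]
    while frontier:
        nxt = []
        for u in frontier:
            for v in np.nonzero(A[u])[0]:
                if dist[v] == np.inf:
                    dist[v] = dist[u] + 1
                    nxt.append(v)
        frontier = nxt
    self_mask = np.eye(n)
    closer, farther = np.zeros((n, n)), np.zeros((n, n))
    for i in range(n):
        for j in np.nonzero(A[i])[0]:
            (closer if dist[j] < dist[i] else farther)[i, j] = 1
    return self_mask, closer, farther   # each mask gets its own convolution weights

# Tiny 5-joint "skeleton" chain 0-1-2-3-4 with joint 0 as the centre.
A = np.zeros((5, 5))
for i, j in [(0, 1), (1, 2), (2, 3), (3, 4)]:
    A[i, j] = A[j, i] = 1
print([m.sum() for m in distance_partition(A, center=0)])
```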
Background The use of remote photoplethysmography (rPPG) to estimate blood volume pulse in a noncontact manner has been an active research topic in recent years. Existing methods are primarily based on a single-scale region of interest (ROI). However, some noise signals that are not easily separated in a single-scale space can be easily separated in a multi-scale space. Also, existing spatiotemporal networks mainly focus on local spatiotemporal information and do not emphasize temporal information, which is crucial in pulse extraction problems, resulting in insufficient spatiotemporal feature modelling. Methods Here, we propose a multi-scale facial video pulse extraction network based on separable spatiotemporal convolution (SSTC) and dimension separable attention (DSAT). First, to solve the problem of a single-scale ROI, we constructed a multi-scale feature space for initial signal separation. Second, SSTC and DSAT were designed for efficient spatiotemporal correlation modeling, which increased the information interaction between the long-span time and space dimensions; this placed more emphasis on temporal features. Results The signal-to-noise ratio (SNR) of the proposed network reached 9.58 dB on the PURE dataset and 6.77 dB on the UBFC-rPPG dataset, outperforming state-of-the-art algorithms. Conclusions The results showed that fusing multi-scale signals yielded better results than methods based on only single-scale signals. The proposed SSTC and dimension-separable attention mechanism will contribute to more accurate pulse signal extraction.
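A minimal sketch of how a separable spatiotemporal convolution is usually factorized (a spatial 1×K×K convolution followed by a temporal K×1×1 convolution); the class name, kernel sizes, and input shape are assumptions, and the dimension-separable attention is not shown.

```python
import torch
import torch.nn as nn

class SeparableSTConv(nn.Module):
    # Factorizes a full 3D convolution into a spatial 1xKxK convolution followed by a
    # temporal Kx1x1 convolution over a video tensor (batch, channels, time, H, W).
    def __init__(self, in_ch, out_ch, k_spatial=3, k_temporal=3):
        super().__init__()
        self.spatial = nn.Conv3d(in_ch, out_ch, kernel_size=(1, k_spatial, k_spatial),
                                 padding=(0, k_spatial // 2, k_spatial // 2))
        self.temporal = nn.Conv3d(out_ch, out_ch, kernel_size=(k_temporal, 1, 1),
                                  padding=(k_temporal // 2, 0, 0))
        self.act = nn.ReLU()

    def forward(self, x):
        return self.act(self.temporal(self.act(self.spatial(x))))

x = torch.randn(2, 3, 16, 64, 64)          # a short face-video clip: 16 frames of 64x64 RGB
print(SeparableSTConv(3, 8)(x).shape)       # -> torch.Size([2, 8, 16, 64, 64])
```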
The classification of point cloud data is a key technology for point cloud data information acquisition and 3D reconstruction, which has a wide range of applications. However, the existing point cloud classification methods have some shortcomings when extracting point cloud features, such as insufficient extraction of local information, overlooking the information in other neighborhood features in the point cloud, and not focusing on the point cloud channel information and spatial information. To solve the above problems, a point cloud classification network based on graph convolution and a fusion attention mechanism is proposed to achieve more accurate classification results. Firstly, each point of the point cloud is regarded as a node on a graph, the k-nearest neighbor algorithm is used to construct the graph, and the information between points is dynamically captured by stacking multiple graph convolution layers. Then, drawing on experience with attention mechanisms in 2D images, an attention mechanism capable of attending to both the spatial and channel information of the point cloud is introduced to increase the feature information of the point cloud, aggregate useful local features, and suppress useless features. In classification experiments on the ModelNet40 dataset, the results show that, compared with the PointNet network, which does not consider the local feature information of the point cloud, the average classification accuracy of the proposed model improves by 4.4% and the overall classification accuracy improves by 4.4%. Compared with other networks, the classification accuracy of the proposed model has also been improved.
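A small sketch of the "points as graph nodes" step: a k-nearest-neighbour graph built over the cloud and EdgeConv-style edge features that stacked graph convolution layers would consume. Function names, k, and the toy cloud are assumptions; the fused spatial-channel attention is not shown.

```python
import numpy as np

def knn_graph(points, k=4):
    # Build a k-nearest-neighbour graph over a point cloud: each point is a node,
    # connected to its k closest points by Euclidean distance.
    d2 = ((points[:, None, :] - points[None, :, :]) ** 2).sum(-1)   # pairwise squared distances
    np.fill_diagonal(d2, np.inf)                                     # exclude self-loops
    return np.argsort(d2, axis=1)[:, :k]                             # (n_points, k) neighbour ids

def edge_features(points, neighbors):
    # EdgeConv-style edge features [x_i, x_j - x_i] that a graph convolution layer
    # would consume; purely illustrative of the "graph of points" idea.
    xi = np.repeat(points[:, None, :], neighbors.shape[1], axis=1)   # (n, k, 3)
    xj = points[neighbors]                                           # (n, k, 3)
    return np.concatenate([xi, xj - xi], axis=-1)                    # (n, k, 6)

pts = np.random.default_rng(0).normal(size=(100, 3))                 # toy point cloud
nbrs = knn_graph(pts, k=4)
print(edge_features(pts, nbrs).shape)                                # -> (100, 4, 6)
```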
The land use and land cover change (LUCC) process exhibits spatial correlation and temporal dependency. Accurate extraction of spatiotemporal features is important for enhancing the modeling capabilities of LUCC. Cellular automaton (CA) models, recognized as powerful tools for simulating dynamic LUCC processes, are traditionally applied in LUCC with a focus on time-slice driving factor data, often neglecting the temporal dimension. However, the transformer architecture, a highly acclaimed model in machine learning, has rarely been integrated into CA models for the simulation of dynamic LUCC processes. To fill this gap, we proposed a novel spatiotemporal urban LUCC simulation model, namely the transformer-convolutional neural network (TC)-CA. Building on CA models that utilize a convolutional neural network (CNN) for extracting latent spatial features, TC-CA extends this paradigm by incorporating a transformer architecture to extract spatiotemporal information from temporal driving factor data and temporal spatial features. The evaluation results, with Wuxi city as the study area, indicated the advantage of our proposed TC-CA against random forest-CA, conventional CNN-CA, artificial neural network-CA, and transformer-CA. Compared with the three non-transformer-based CAs, the TC-CA improved the figure of merit by 2.85%-8.14%. This study contributes a fresh spatiotemporal perspective and a transformer approach to the field of LUCC modeling.
Data-driven prediction of time series is significant in many scientific research fields such as global climate change and weather forecast. For global monthly mean temperature series, considering the strong potential of deep neural network for extracting data features, this paper proposes a data-driven model, ResGraphNet, which improves the prediction accuracy of time series by an embedded residual module in GraphSAGE layers. The experimental results of a global mean temperature dataset, HadCRUT5, show that compared with 11 traditional prediction technologies, the proposed ResGraphNet obtains the best accuracy. The error indicator predicted by the proposed ResGraphNet is smaller than that of the other 11 prediction models. Furthermore, the performance on seven temperature datasets shows the excellent generalization of the ResGraphNet. Finally, based on our proposed ResGraphNet, the predicted 2022 annual anomaly of global temperature is 0.74722 ℃, which provides confidence for limiting warming to 1.5 ℃ above pre-industrial levels.
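The core building block, a GraphSAGE-style mean aggregation wrapped with a residual (identity) connection, can be sketched as below; the layer name, dimensions, and the chain-shaped toy graph over time steps are assumptions, not the ResGraphNet architecture.

```python
import torch
import torch.nn as nn

class ResSAGELayer(nn.Module):
    # One mean-aggregation GraphSAGE layer with an identity (residual) connection,
    # the basic pattern behind embedding a residual module in GraphSAGE layers.
    def __init__(self, dim):
        super().__init__()
        self.lin_self = nn.Linear(dim, dim)
        self.lin_neigh = nn.Linear(dim, dim)

    def forward(self, x, adj):                       # adj: row-normalized adjacency (n, n)
        neigh = adj @ x                              # mean of neighbour features
        out = torch.relu(self.lin_self(x) + self.lin_neigh(neigh))
        return out + x                               # residual connection

# Toy usage: a chain of 12 "time nodes" with 16-dim features, as in a graph over a series.
n, dim = 12, 16
adj = torch.diag(torch.ones(n - 1), 1) + torch.diag(torch.ones(n - 1), -1)
adj = adj / adj.sum(1, keepdim=True)
layer = ResSAGELayer(dim)
print(layer(torch.randn(n, dim), adj).shape)         # -> torch.Size([12, 16])
```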
The relation is a semantic expression relevant to two named entities in a sentence. Since a sentence usually contains several named entities, it is essential to learn a structured sentence representation that encodes dependency information specific to the two named entities. In related work, graph convolutional neural networks are widely adopted to learn semantic dependencies, where a dependency tree initializes the adjacency matrix. However, this approach has two main issues. First, parsing a sentence heavily relies on external toolkits, which can be error-prone. Second, the dependency tree only encodes the syntactical structure of a sentence, which may not align with the relational semantic expression. In this paper, we propose an automatic graph learning method to autonomously learn a sentence's structural information. Instead of using a fixed adjacency matrix initialized by a dependency tree, we introduce an Adaptive Adjacency Matrix to encode the semantic dependency between tokens. The elements of this matrix are dynamically learned during the training process and optimized by task-relevant learning objectives, enabling the construction of task-relevant semantic dependencies within a sentence. Our model demonstrates superior performance on the TACRED and SemEval 2010 datasets, surpassing previous works by 1.3% and 0.8%, respectively. These experimental results show that our model excels in the relation extraction task, outperforming prior models.
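A minimal sketch of an adaptive adjacency matrix computed from token representations and trained end-to-end with the task loss, in place of a parser-produced dependency tree; the query/key parameterization, names, and dimensions are assumptions rather than the paper's exact formulation.

```python
import torch
import torch.nn as nn

class AdaptiveAdjacency(nn.Module):
    # Produces a sentence-specific soft adjacency matrix from token representations,
    # replacing the fixed matrix a dependency parser would provide; its parameters are
    # optimized end-to-end by the relation-extraction loss.
    def __init__(self, hidden_dim, proj_dim=64):
        super().__init__()
        self.query = nn.Linear(hidden_dim, proj_dim)
        self.key = nn.Linear(hidden_dim, proj_dim)

    def forward(self, h):                                  # h: (tokens, hidden_dim)
        scores = self.query(h) @ self.key(h).t()           # pairwise dependency scores
        return torch.softmax(scores, dim=-1)               # each row is a soft neighbourhood

h = torch.randn(12, 128)                                   # 12 tokens from an encoder
adj = AdaptiveAdjacency(hidden_dim=128)(h)                 # (12, 12) learned adjacency
print((adj @ h).shape)                                     # one graph-convolution hop over it
```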