This paper proposes a generalized regression neural network (GRNN) model and a multi-GRNN model with a gating network, using data from the Shanghai index and the stocks of Shanghai Pudong Development Bank (SPDB), Dongfeng Automobile and Baotou Steel. We analyze the two models in Matlab to predict the opening price of each series. By building a softmax excitation function, the multi-GRNN model with a gating network can obtain the best combination weights. Over the four data groups, the average forecasting error of the GRNN model is 0.012 208, while that of the multi-GRNN model with a gating network is 0.002 659. Compared with the real data, the results predicted by both models have small mean square prediction errors. Both models are therefore suitable for processing large quantities of data, and the multi-GRNN model with a gating network outperforms the GRNN model.
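To make the gating idea concrete, the following minimal Python sketch shows how a softmax gating function can turn gate scores into combination weights over several expert predictors; the expert outputs and gate scores below are invented for illustration and are not the paper's actual model or data.

    import numpy as np

    def softmax(z):
        z = z - np.max(z)            # subtract the max for numerical stability
        e = np.exp(z)
        return e / e.sum()

    # hypothetical outputs of four expert models (e.g., GRNNs) for one sample
    expert_preds = np.array([10.12, 10.25, 10.08, 10.30])   # predicted opening prices
    # hypothetical gate scores produced from the input features by the gating network
    gate_scores = np.array([1.2, 0.4, 2.1, -0.3])

    weights = softmax(gate_scores)            # combination weights sum to 1
    combined = np.dot(weights, expert_preds)  # weighted prediction of the mixture
    print(weights, combined)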
Social media has become increasingly significant in modern society, but it has also turned into a breeding ground for the propagation of misleading information, potentially causing a detrimental impact on public opinion and daily life. Compared to pure text content, multimodal content significantly increases the visibility and shareability of posts. This has made the search for efficient modality representations and cross-modal information interaction methods a key focus in the field of multimodal fake news detection. To effectively address the critical challenge of accurately detecting fake news on social media, this paper proposes a fake news detection model based on cross-modal message aggregation and a gated fusion network (MAGF). MAGF first uses BERT to extract cumulative textual feature representations and word-level features, applies Faster Region-based Convolutional Neural Network (Faster R-CNN) to obtain image objects, and leverages ResNet-50 and Visual Geometry Group-19 (VGG-19) to obtain image region features and global features. The image region features and word-level text features are then projected into a low-dimensional space to calculate a text-image affinity matrix for cross-modal message aggregation. The gated fusion network combines text and image region features to obtain adaptively aggregated features. The interaction matrix is derived through an attention mechanism and further integrated with global image features using a co-attention mechanism to produce multimodal representations. Finally, these fused features are fed into a classifier for news categorization. Experiments were conducted on two public datasets, Twitter and Weibo. Results show that the proposed model achieves accuracy rates of 91.8% and 88.7% on the two datasets, respectively, significantly outperforming traditional unimodal and existing multimodal models.
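The text-image affinity matrix, message aggregation and gated fusion steps described above can be sketched as follows; the feature dimensions, random features and gate parameters are placeholders, not MAGF's actual implementation.

    import numpy as np

    rng = np.random.default_rng(0)
    T = rng.normal(size=(20, 128))   # 20 word-level text features (hypothetical dims)
    V = rng.normal(size=(36, 128))   # 36 image-region features projected to the same space

    # text-image affinity matrix: cosine similarity between every word and region
    Tn = T / np.linalg.norm(T, axis=1, keepdims=True)
    Vn = V / np.linalg.norm(V, axis=1, keepdims=True)
    affinity = Tn @ Vn.T             # shape (20, 36)

    # message aggregation: each word gathers region information weighted by affinity
    attn = np.exp(affinity) / np.exp(affinity).sum(axis=1, keepdims=True)
    V_msg = attn @ V                 # aggregated visual message per word

    # gated fusion of the text features and the aggregated visual messages
    W_g = rng.normal(size=(256, 1)) * 0.01
    gate = 1.0 / (1.0 + np.exp(-np.concatenate([T, V_msg], axis=1) @ W_g))
    fused = gate * T + (1.0 - gate) * V_msg
    print(fused.shape)               # (20, 128)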
Modeling of unsteady aerodynamic loads at high angles of attack, using a small amount of experimental or simulation data to construct predictive models for unknown states, can greatly improve the efficiency of aircraft unsteady aerodynamic design and flight dynamics analysis. Aiming at the poor generalization of traditional aerodynamic models and intelligent models, this paper proposes an intelligent aerodynamic modeling method based on gated neural units. The time-memory characteristics of the gated neural unit are fully utilized, so the nonlinear flow field characterization ability learned during training is enhanced and the generalization ability of the whole prediction model is improved. The model is trained and verified under the maneuvering flight condition of a NACA0015 airfoil. The results show that the model has good adaptability. In interpolation prediction, the maximum prediction error of the lift, drag and moment coefficients does not exceed 10%, which can basically represent the variation characteristics of the entire flow field. In the construction of extrapolation models, the model trained on strongly nonlinear data has good accuracy for weakly nonlinear prediction, but the error can become larger, even exceeding 20%, which indicates that the extrapolation and generalization capabilities need to be further optimized by integrating physical models. Compared with the conventional state-space equation model, the proposed method improves the extrapolation accuracy and efficiency by 78% and 60%, respectively, which demonstrates the potential of this method in aerodynamic modeling.
The Gated Recurrent Unit (GRU) neural network has great potential for estimating and predicting a variable. In addition to radar reflectivity (Z), radar echo-top height (ET) is also a good indicator of rainfall rate (R). In this study, we propose a new method, GRU_Z-ET, which introduces Z and ET as two independent variables into the GRU neural network to conduct quantitative single-polarization radar precipitation estimation. The performance of GRU_Z-ET is compared with that of three other methods in three heavy rainfall cases in China during 2018, namely the traditional Z-R relationship (Z=300R^1.4), the optimal Z-R relationship (Z=79R^1.68) and the GRU neural network with only Z as the independent input variable (GRU_Z). The results indicate that GRU_Z-ET performs the best, while the traditional Z-R relationship performs the worst; the performances of the remaining two methods are similar. To further evaluate the performance of GRU_Z-ET, 200 rainfall events with 21882 total samples during May–July of 2018 are used for statistical analysis. Results demonstrate that the spatial correlation coefficients, threat scores and probability of detection between the observed and estimated precipitation are the largest for GRU_Z-ET and the smallest for the traditional Z-R relationship, while the root mean square error shows the opposite pattern. In addition, these statistics for GRU_Z are similar to those of the optimal Z-R relationship. Thus, it can be concluded that GRU_Z-ET performs the best of the four methods for quantitative precipitation estimation.
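The two quoted Z-R relationships can be applied directly once reflectivity is converted from dBZ to linear units; a small sketch (with made-up reflectivity values) is shown below.

    import numpy as np

    def rain_rate_from_dbz(dbz, a, b):
        """Invert Z = a * R**b, with Z in linear units (mm^6 m^-3) and R in mm/h."""
        z_linear = 10.0 ** (dbz / 10.0)
        return (z_linear / a) ** (1.0 / b)

    dbz = np.array([20.0, 35.0, 45.0])               # hypothetical reflectivity samples
    print(rain_rate_from_dbz(dbz, a=300.0, b=1.4))   # traditional Z = 300 R^1.4
    print(rain_rate_from_dbz(dbz, a=79.0, b=1.68))   # optimal Z = 79 R^1.68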
In recent years, real-time video streaming has grown in popularity. The growing popularity of the Internet of Things (IoT) and other wireless heterogeneous networks mandates that network resources be carefully apportioned among versatile users in order to achieve the best Quality of Experience (QoE) and performance objectives. Most researchers have focused on Forward Error Correction (FEC) techniques when attempting to strike a balance between QoE and performance. However, as network capacity increases, performance degrades, impacting the live visual experience. Recently, Deep Learning (DL) algorithms have been successfully integrated with FEC to stream videos across multiple heterogeneous networks, but these algorithms need to be adapted to improve the experience without compromising packet loss and delay time. To address this challenge, this paper proposes a novel intelligent algorithm that streams video in multi-home heterogeneous networks based on network-centric characteristics. The proposed framework contains modules such as the Intelligent Content Extraction Module (ICEM), Channel Status Monitor (CSM), and Adaptive FEC (AFEC). The framework adopts a Cognitive Learning-based Scheduling (CLS) Module, which works on the deep Reinforced Gated Recurrent Networks (RGRN) principle and embeds them along with the FEC to achieve better performance. The complete framework was developed using the Objective Modular Network Testbed in C++ (OMNET++), Internet networking (INET), and Python 3.10, with Keras as the front end and Tensorflow 2.10 as the back end. In extensive experiments, the proposed model outperforms other existing intelligent models in terms of improving the QoE, minimizing the End-to-End Delay (EED), and maintaining the highest accuracy (98%) and a lower Root Mean Square Error (RMSE) value of 0.001.
Diabetes mellitus is a metabolic disease in which blood glucose levels rise as a result of pancreatic insulin production failure. If left untreated, it causes hyperglycemia and chronic multiorgan dysfunction, including blindness, renal failure, and cardiovascular disease. One of the essential checks that needs to be performed frequently in Type 1 Diabetes Mellitus is a blood test; this procedure involves extracting blood quite frequently, which leads to subject discomfort and increases the possibility of infection when the procedure recurs often. Existing methods used for diabetes classification have low classification accuracy and suffer from vanishing gradient problems. To overcome these issues, we propose a stacking ensemble learning-based convolutional gated recurrent neural network (CGRNN) metamodel algorithm. Our proposed method initially performs outlier detection to remove outlier data using the Gaussian distribution method, and the Box-Cox method is used to correctly order the dataset. After outlier detection, the missing values are replaced by the data's mean rather than being eliminated. In the stacking ensemble base model, multiple machine learning algorithms such as Naïve Bayes, Bagging with random forest, and AdaBoost decision tree are employed. The CGRNN metamodel uses two hidden layers, Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU), to calculate the weight matrix for diabetes prediction. Finally, the calculated weight matrix is passed to the softmax function in the output layer to produce the diabetes prediction results. Using the LSTM-based CGRNN, the mean square error (MSE) value is 0.016 and the obtained accuracy is 91.33%.
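A simplified stacking sketch in scikit-learn is given below; a random forest stands in for "Bagging with random forest", a logistic regression stands in for the paper's LSTM/GRU metamodel, and synthetic data replaces the diabetes dataset, so this only illustrates the ensemble structure.

    from sklearn.datasets import make_classification
    from sklearn.ensemble import (StackingClassifier, RandomForestClassifier,
                                  AdaBoostClassifier)
    from sklearn.naive_bayes import GaussianNB
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    # synthetic stand-in for a preprocessed diabetes dataset
    X, y = make_classification(n_samples=600, n_features=8, random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

    base_learners = [
        ("nb", GaussianNB()),
        ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
        ("ada", AdaBoostClassifier(random_state=0)),   # boosted decision trees by default
    ]
    # the meta-learner combines the base learners' predictions
    stack = StackingClassifier(estimators=base_learners,
                               final_estimator=LogisticRegression())
    stack.fit(X_tr, y_tr)
    print("test accuracy:", stack.score(X_te, y_te))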
Speech separation is an active research topic that plays an important role in numerous applications, such as speaker recognition, hearing prostheses, and autonomous robots. Many algorithms have been put forward to improve separation performance. However, speech separation in reverberant, noisy environments is still a challenging task. To address this, a novel speech separation algorithm using a gated recurrent unit (GRU) network based on a microphone array is proposed in this paper. The main aim of the proposed algorithm is to improve the separation performance and reduce the computational cost. The proposed algorithm extracts the sub-band steered response power-phase transform (SRP-PHAT) weighted by a gammatone filter as the speech separation feature, owing to its discriminative and robust spatial position information. Since the GRU network has the advantage of processing time-series data with faster training speed and fewer training parameters, the GRU model is adopted to process the separation features of several sequential frames in the same sub-band to estimate the Ideal Ratio Mask (IRM). The proposed algorithm decomposes the mixture signals into time-frequency (TF) units using a gammatone filter bank in the frequency domain, and the target speech is reconstructed in the frequency domain by masking the mixture signal according to the estimated IRM. The operations of decomposing the mixture signal and reconstructing the target signal are completed in the frequency domain, which reduces the total computational cost. Experimental results demonstrate that the proposed algorithm realizes omnidirectional speech separation in noisy and reverberant environments, provides good performance in terms of speech quality and intelligibility, and generalizes to reverberant conditions.
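The ideal ratio mask and the masking-based reconstruction can be illustrated in a few lines; here the mask is computed from known speech and noise magnitudes purely for clarity, whereas the paper estimates it with a GRU from SRP-PHAT features on gammatone sub-bands.

    import numpy as np

    rng = np.random.default_rng(1)
    # hypothetical magnitudes of target speech and interference in T-F units
    S = np.abs(rng.normal(size=(64, 100)))   # 64 sub-bands x 100 frames
    N = np.abs(rng.normal(size=(64, 100)))
    mixture = S + N                           # simplified mixture magnitude

    # ideal ratio mask (in practice a GRU predicts this from spatial features)
    irm = S**2 / (S**2 + N**2 + 1e-12)

    # reconstruct the target by masking the mixture in each T-F unit
    estimate = irm * mixture
    print(estimate.shape)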
This study proposed a new real-time manufacturing process monitoring method to monitor and detect process shifts in manufacturing operations, since real-time production process monitoring is critical in today's smart manufacturing. The more robust the monitoring model, the more reliably a process can be kept under control. In the past, many researchers have developed real-time monitoring methods to detect process shifts early. However, these methods have limitations in detecting process shifts as quickly as possible and in handling various data volumes and varieties. In this paper, a robust monitoring model combining a Gated Recurrent Unit (GRU) and Random Forest (RF) with Real-Time Contrast (RTC), called GRU-RF-RTC, was proposed to detect process shifts rapidly. The effectiveness of the proposed GRU-RF-RTC model is first evaluated using multivariate normal and non-normal distribution datasets. Then, to prove the applicability of the proposed model in a real manufacturing setting, the model was evaluated using real-world normal and non-normal problems. The results demonstrate that the proposed GRU-RF-RTC outperforms other methods in detecting process shifts quickly, with the lowest average out-of-control run length (ARL1) in all synthetic and real-world problems under both normal and non-normal cases. The experimental results on real-world problems highlight the significance of the proposed GRU-RF-RTC model in modern manufacturing process monitoring applications. The results reveal that the proposed method improves the shift detection capability by 42.14% in normal and 43.64% in gamma distribution problems.
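The real-time contrast idea behind GRU-RF-RTC can be sketched by labelling in-control reference data as one class and the current monitoring window as the other, then using a random forest's out-of-bag accuracy as the monitoring statistic; the data, window size and shift below are synthetic, and the GRU feature extraction is omitted.

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier

    rng = np.random.default_rng(0)
    reference = rng.normal(0.0, 1.0, size=(500, 5))   # in-control data (class 0)
    window = rng.normal(0.8, 1.0, size=(50, 5))       # current window (class 1), mean-shifted

    X = np.vstack([reference, window])
    y = np.r_[np.zeros(len(reference)), np.ones(len(window))]

    rf = RandomForestClassifier(n_estimators=200, oob_score=True, random_state=0).fit(X, y)
    # a clearly separable window (high OOB accuracy) signals a likely process shift
    print("monitoring statistic (OOB accuracy):", rf.oob_score_)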
As the smart home is the end-point power consumer, it is the major part to be controlled in a smart micro grid. There are many challenges in implementing a smart home system, of which the most important are the cost and simplicity of the implementation method. It is clear that the major share of the total cost comes from the internal controlling system network; although many methods have been proposed, there is still no satisfying method from the consumers' point of view. In this paper, a novel solution for this demand is proposed, which not only minimizes the implementation cost, but also provides a high level of reliability and simplicity of operation; feasibility, extendibility, and flexibility are other leading properties of the design.
The localization accuracy of magnetic field-based localization approaches is predominantly limited by two factors: smartphone heterogeneity and smaller data lengths. The use of multifarious smartphones cripples the performance of such approaches owing to the variability of the magnetic field data. In the same vein, smaller lengths of magnetic field data decrease the localization accuracy substantially. The current study proposes the use of multiple neural networks, namely a deep neural network (DNN), a long short-term memory network (LSTM), and a gated recurrent unit network (GRN), to perform indoor localization based on the embedded magnetic sensor of the smartphone. A voting scheme is introduced that takes the predictions from the neural networks into consideration to estimate the current location of the user. Contrary to conventional magnetic field-based localization approaches that rely on the magnetic field data intensity, this study utilizes the normalized magnetic field data for this purpose. Training of the neural networks is carried out using Galaxy S8 data, while testing is performed with three devices, i.e., LG G7, Galaxy S8, and LG Q6. Experiments are performed during different times of the day to analyze the impact of time variability. Results indicate that the proposed approach minimizes the impact of smartphone variability and elevates the localization accuracy. Performance comparison with three approaches reveals that the proposed approach outperforms them in mean, 50%, and 75% error, even when using a smaller amount of magnetic field data than the other approaches.
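A minimal sketch of the voting step: given per-location probabilities from the three networks, a hard vote takes the majority of their top-1 predictions, while a soft vote averages the probabilities; the numbers below are purely illustrative.

    import numpy as np

    # hypothetical per-location probabilities from three networks (DNN, LSTM, GRU)
    p_dnn  = np.array([0.10, 0.70, 0.20])
    p_lstm = np.array([0.15, 0.55, 0.30])
    p_gru  = np.array([0.05, 0.60, 0.35])

    # hard (majority) vote over each network's most likely location
    votes = [np.argmax(p) for p in (p_dnn, p_lstm, p_gru)]
    hard_vote = np.bincount(votes).argmax()

    # soft vote: average the probabilities and take the most likely location
    soft_vote = np.mean([p_dnn, p_lstm, p_gru], axis=0).argmax()
    print(hard_vote, soft_vote)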
An accurate prediction of earth pressure balance (EPB) shield moving performance is important to ensure safe tunnel excavation. A hybrid model is developed based on particle swarm optimization (PSO) and a gated recurrent unit (GRU) neural network, where PSO is utilized to assign the optimal hyperparameters of the GRU neural network. There are four main steps: data collection and processing, hybrid model establishment, model performance evaluation, and correlation analysis. The developed model provides an alternative for tackling the time-series data of tunnel projects. Apart from that, a novel framework for model application is presented to provide guidelines in practice. A tunnel project is used to evaluate the performance of the proposed hybrid model. Results indicate that both geological and construction variables are significant to the model performance. Correlation analysis shows that the construction variables (main thrust and foam liquid volume) display the highest correlation with the cutterhead torque (CHT). This work provides a feasible and applicable alternative way to estimate the performance of shield tunneling.
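A bare-bones particle swarm optimization loop of the kind used to tune GRU hyperparameters might look as follows; the objective function here is a cheap stand-in for the real one (for example, GRU validation error as a function of hidden size and learning rate), and the bounds and PSO constants are illustrative assumptions.

    import numpy as np

    rng = np.random.default_rng(0)

    def objective(x):
        # stand-in for the true objective (e.g., GRU validation error);
        # here the optimum is placed at hidden_units=64, learning_rate=0.01
        return np.sum((x - np.array([64.0, 0.01]))**2, axis=-1)

    n, dim, iters = 20, 2, 50
    lo, hi = np.array([8.0, 1e-4]), np.array([256.0, 0.1])   # search bounds
    pos = rng.uniform(lo, hi, size=(n, dim))
    vel = np.zeros_like(pos)
    pbest, pbest_val = pos.copy(), objective(pos)
    gbest = pbest[np.argmin(pbest_val)]

    for _ in range(iters):
        r1, r2 = rng.random((n, dim)), rng.random((n, dim))
        vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
        pos = np.clip(pos + vel, lo, hi)
        val = objective(pos)
        improved = val < pbest_val
        pbest[improved], pbest_val[improved] = pos[improved], val[improved]
        gbest = pbest[np.argmin(pbest_val)]

    print("best hyperparameters found:", gbest)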
Harsh working environments and wear between blades and other unit components can easily lead to cracks and damage on wind turbine blades. Cracks on the blades can endanger the shafting of the generator set, the tower and other components, and even cause the tower to collapse. To achieve high-precision wind blade crack detection, this paper proposes a crack fault-detection strategy that integrates a Gated Residual Network (GRN), a fusion module and a Transformer. Firstly, the GRN can reduce unnecessary noisy inputs that could negatively impact performance while preserving the integrity of the feature information. In addition, to gain in-depth information about the characteristics of wind turbine blades, a fusion module is proposed to implement the information fusion of wind turbine features. Specifically, each fan feature is mapped to a one-dimensional vector of the same length, and all one-dimensional vectors are concatenated to obtain a two-dimensional vector. Then, in the fusion module, the information fusion of the same characteristic variables across different channels is realized through a channel-mixing MLP, and the information fusion of different characteristic variables within the same channel is realized through a column-mixing MLP. Finally, the fused feature vector is input into the Transformer for feature learning, which enhances the influence of important feature information and improves the model's anti-noise ability and classification accuracy. Extensive experiments were conducted on wind turbine supervisory control and data acquisition (SCADA) data from a domestic wind farm. The results show that, compared with other state-of-the-art models, including XGBoost, LightGBM, TabNet, etc., the F1-score of the proposed gated-fusion-based Transformer model reaches 0.9907, which is 0.4%-2.09% higher than that of the compared models. This method provides a more reliable approach for the condition detection and maintenance of fan blades in wind farms.
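One possible reading of the channel-mixing and column-mixing MLPs is sketched below on a small random feature matrix: the channel-mixing MLP mixes values of the same variable across channels, and the column-mixing MLP mixes different variables within each channel; all shapes and weights are invented for illustration and do not reproduce the paper's network.

    import numpy as np

    rng = np.random.default_rng(0)
    x = rng.normal(size=(8, 16))    # 8 channels x 16 characteristic variables (hypothetical)

    def mlp(dim_in, dim_hidden):
        # two-layer MLP with ReLU that maps a dim_in vector back to dim_in
        w1 = rng.normal(size=(dim_in, dim_hidden)) * 0.1
        w2 = rng.normal(size=(dim_hidden, dim_in)) * 0.1
        return lambda z: np.maximum(z @ w1, 0.0) @ w2

    channel_mix = mlp(8, 32)    # mixes information across channels for each variable
    column_mix = mlp(16, 64)    # mixes information across variables within each channel

    # channel mixing acts along the channel axis (transpose so channels are last)
    x = x + channel_mix(x.T).T
    # column mixing acts along each channel's variables
    x = x + column_mix(x)
    print(x.shape)              # (8, 16), ready to be fed to a Transformer encoder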
Deep learning has risen in popularity as a face recognition technology in recent years. Facenet, a deep convolutional neural network (DCNN) developed by Google, recognizes faces with 128 bytes per face. It also claims to have achieved 99.96% on the reputed Labelled Faces in the Wild (LFW) dataset. However, the accuracy and validation rate of Facenet eventually drop as the resolution of the images gradually decreases. This research paper aims at developing a new facial recognition system that can produce a higher accuracy rate and validation rate on low-resolution face images. The proposed system, Extended Openface, performs facial recognition by using three different features: i) facial landmarks, ii) head pose, and iii) eye gaze. It performs facial landmark detection using a Scattered Gated Expert Network Constrained Local Model (SGEN-CLM). It also detects the head pose and eye gaze using an Enhanced Constrained Local Neural Field (ECLNF). Extended Openface employs a simple Support Vector Machine (SVM) for training and testing the face images. The system's performance is assessed on low-resolution datasets like LFW and the Indian Movie Face Database (IMFDB). The results demonstrate that Extended Openface has a better accuracy rate (12%) and validation rate (22%) than Facenet on low-resolution images.
Near-crash events are often regarded as an excellent surrogate measure for traffic safety research because they include abrupt changes in vehicle kinematics that can lead to deadly accident scenarios. In this paper, we introduce machine learning and deep learning algorithms for predicting near-crash events using LiDAR data at a signalized intersection. To predict a near-crash occurrence, we used essential vehicle kinematic variables such as lateral and longitudinal velocity, yaw, and LiDAR tracking status. A deep learning hybrid model, the Convolutional Gated Recurrent Neural Network (CNN + GRU), was introduced, and comparative performance was evaluated against multiple machine learning classification models such as Logistic Regression, K-Nearest Neighbor, Decision Tree, Random Forest, Adaptive Boost, and deep learning models like Long Short-Term Memory (LSTM). As vehicle kinematics change after sudden braking, we considered the average deceleration and the kinetic energy drop as thresholds to identify near crashes after the vehicle braking time. We treated the next 3 seconds after this braking time as our prediction horizon. All models work best in the 1-second prediction horizon following braking time. The results also reveal that our hybrid model gathers the most near-crash information while working flawlessly. In comparison to existing models for near-crash prediction, our hybrid Convolutional Gated Recurrent Neural Network model has 100% recall, 100% precision, and a 100% F1-score, accurately capturing all near crashes. This prediction performance outperforms previous baseline models in forecasting near-crash events and provides opportunities for improving traffic safety via Intelligent Transportation Systems (ITS).
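A small sketch of the threshold-based labelling step: a braking event is flagged as a near crash when the average deceleration or the relative kinetic-energy drop crosses a threshold; the speed trace and the threshold values below are hypothetical and are not those used in the paper.

    import numpy as np

    def label_near_crash(speed, dt=0.1, decel_thresh=-3.0, ke_drop_thresh=0.4):
        """Flag a braking event as a near crash using two hypothetical thresholds:
        average deceleration (m/s^2) and relative kinetic-energy drop over the event."""
        accel = np.diff(speed) / dt
        avg_decel = accel.mean()
        ke_drop = 1.0 - (speed[-1] ** 2) / max(speed[0] ** 2, 1e-9)
        return avg_decel <= decel_thresh or ke_drop >= ke_drop_thresh

    speed = np.array([15.0, 13.5, 11.0, 8.0, 5.5])   # m/s during a hard brake (synthetic)
    print(label_near_crash(speed))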
Three recent breakthroughs due to AI in arts and science serve as motivation: an award-winning digital image, protein folding, and fast matrix multiplication. Many recent developments in artificial neural networks, particularly deep learning (DL), applied and relevant to computational mechanics (solids, fluids, finite-element technology) are reviewed in detail. Both hybrid and pure machine learning (ML) methods are discussed. Hybrid methods combine traditional PDE discretizations with ML methods either (1) to help model complex nonlinear constitutive relations, (2) to nonlinearly reduce the model order for efficient simulation (turbulence), or (3) to accelerate the simulation by predicting certain components in the traditional integration methods. Here, methods (1) and (2) rely on the Long Short-Term Memory (LSTM) architecture, with method (3) relying on convolutional neural networks. Pure ML methods to solve (nonlinear) PDEs are represented by Physics-Informed Neural Network (PINN) methods, which can be combined with attention mechanisms to address discontinuous solutions. Both LSTM and attention architectures, together with modern and generalized classic optimizers that include stochasticity for DL networks, are extensively reviewed. Kernel machines, including Gaussian processes, are covered in sufficient depth for more advanced works such as shallow networks with infinite width. The review does not only address experts: readers are assumed to be familiar with computational mechanics, but not with DL, whose concepts and applications are built up from the basics, aiming at bringing first-time learners quickly to the forefront of research. The history and limitations of AI are recounted and discussed, with particular attention to pointing out misstatements or misconceptions of the classics, even in well-known references. Positioning and pointing control of a large-deformable beam is given as an example.
Landslide displacement prediction can enhance the efficacy of landslide monitoring systems, and the prediction of the periodic displacement is particularly challenging. In previous studies, static regression models (e.g., the support vector machine (SVM)) were mostly used for predicting the periodic displacement. These models may perform poorly when the dynamic features of landslide triggers are incorporated. This paper proposes a method for predicting the landslide displacement in a dynamic manner, based on the gated recurrent unit (GRU) neural network and complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN). CEEMDAN is used to decompose the training data, and the GRU is subsequently used for predicting the periodic displacement. Implementation procedures of the proposed method are illustrated by a case study in the Caojiatuo landslide area, where SVM is also adopted for the periodic displacement prediction. This case study shows that the predictions obtained by SVM are inaccurate, as the landslide displacement evolves in a pronouncedly step-wise manner; by contrast, the accuracy can be significantly improved using the dynamic predictive method. This paper reveals the significance of capturing the dynamic features of the inputs in the training process when machine learning models are adopted to predict landslide displacement.
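The decompose-then-predict workflow can be sketched as follows; a moving-average trend split stands in for CEEMDAN, and an ordinary least-squares predictor on lagged windows stands in for the GRU, so the sketch only illustrates the structure of the method, not its actual components or data.

    import numpy as np

    rng = np.random.default_rng(0)
    t = np.arange(200)
    displacement = 0.05 * t + 2.0 * np.sin(2 * np.pi * t / 30) + rng.normal(0, 0.2, t.size)

    # stand-in decomposition: moving-average trend plus residual "periodic" term
    kernel = np.ones(15) / 15
    trend = np.convolve(displacement, kernel, mode="same")
    periodic = displacement - trend

    def window_dataset(series, lag=12):
        # build (lagged window, next value) pairs for one-step-ahead prediction
        X = np.array([series[i:i + lag] for i in range(len(series) - lag)])
        y = series[lag:]
        return X, y

    def fit_predict_next(series, lag=12):
        # least-squares one-step predictor per component (a GRU is trained here in the paper)
        X, y = window_dataset(series, lag)
        coef, *_ = np.linalg.lstsq(X, y, rcond=None)
        return series[-lag:] @ coef

    pred = fit_predict_next(trend) + fit_predict_next(periodic)
    print("next-step displacement prediction:", pred)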
Spectrogram representations of acoustic scenes have achieved competitive performance for acoustic scene classification. Yet, the spectrogram alone does not take into account a substantial amount of time-frequency information. In this study, we present an approach for exploring the benefits of deep scalogram representations, extracted in segments from an audio stream. The presented approach firstly transforms the segmented acoustic scenes into bump and morse scalograms, as well as spectrograms; secondly, the spectrograms or scalograms are fed into pre-trained convolutional neural networks; thirdly, the features extracted from a subsequent fully connected layer are fed into (bidirectional) gated recurrent neural networks, which are followed by a single highway layer and a softmax layer; finally, predictions from these three systems are fused by a margin sampling value strategy. We then evaluate the proposed approach using the acoustic scene classification data set of the 2017 IEEE AASP Challenge on Detection and Classification of Acoustic Scenes and Events (DCASE). On the evaluation set, an accuracy of 64.0% from bidirectional gated recurrent neural networks is obtained when fusing the spectrogram and the bump scalogram, which is an improvement on the 61.0% baseline result provided by the DCASE 2017 organisers. This result shows that extracted bump scalograms are capable of improving the classification accuracy when fused with a spectrogram-based system.
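One common reading of a margin sampling value fusion is to trust, for each segment, the system whose top two class probabilities are furthest apart; a small sketch with invented probabilities (not the paper's exact rule or data) is given below.

    import numpy as np

    def margin(p):
        # gap between the two most likely classes for each system
        top2 = np.sort(p, axis=-1)[..., -2:]
        return top2[..., 1] - top2[..., 0]

    # hypothetical class probabilities from three systems for one audio segment
    p_spec  = np.array([0.50, 0.30, 0.20])   # spectrogram-based system
    p_bump  = np.array([0.70, 0.20, 0.10])   # bump-scalogram-based system
    p_morse = np.array([0.40, 0.35, 0.25])   # morse-scalogram-based system

    systems = np.stack([p_spec, p_bump, p_morse])
    best = np.argmax(margin(systems))         # pick the most confident system
    print("fused prediction:", systems[best].argmax())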
Lung nodules show huge variation in structural properties such as shape and surface texture. Their spatial properties also vary: they can be found attached to lung walls or blood vessels in complex, non-homogeneous lung structures. Moreover, nodules are small at their early stage of development. This poses a serious challenge for developing a computer-aided diagnosis (CAD) system with good false positive reduction. Hence, to reduce the false positives per scan and to deal with the challenges mentioned, this paper proposes a set of three diverse 3D attention-based CNN architectures (3D ACNN) whose predictions on given low-dose volumetric computed tomography (CT) scans are fused to achieve more effective and reliable results. An attention mechanism is employed to selectively concentrate on, and give more weight to, nodule-specific features and less weight to other irrelevant features. By using this attention-based mechanism in the CNN, unlike traditional methods, a significant gain in classification performance was obtained. Contextual dependencies are also taken into account by giving three patches of different sizes surrounding the nodule as input to the ACNN architectures. The system is trained and validated using the publicly available LUNA16 dataset in a 10-fold cross-validation approach, where a competition performance metric (CPM) score of 0.931 is achieved. The experimental results demonstrate that a single patch or a single architecture used in a one-to-one fashion, as adopted in earlier methods, cannot achieve better performance, which signifies the necessity of fusing different multi-patched architectures. Though the proposed system is mainly designed for pulmonary nodule detection, it can easily be extended to classification tasks of any other 3D medical diagnostic computed tomography images where there is huge variation and uncertainty in classification.
The battlefield environment is changing rapidly, and fast and accurate identification of the tactical intention of enemy targets is an important condition for gaining a decision-making advantage. Current Intention Recognition (IR) methods for air targets have shortcomings in temporality, interpretability and the back-and-forth dependency of intentions. To address these problems, this paper designs a novel air target intention recognition method named STABC-IR, which is based on a Bidirectional Gated Recurrent Unit (BiGRU) and a Conditional Random Field (CRF) with a Space-Time Attention mechanism (STA). First, the problem of intention recognition of air targets is described and analyzed in detail. Then, a temporal network based on BiGRU is constructed to meet the temporality requirement. Subsequently, STA is proposed to focus on the key parts of the features and timing information, meeting certain interpretability requirements while strengthening the temporal modeling. Finally, an intention transformation network based on CRF is proposed to solve the back-and-forth dependency and transformation problem by jointly modeling the tactical intention of the target at each moment. The experimental results show that the recognition accuracy of the jointly trained STABC-IR model reaches 95.7%, which is higher than other recent intention recognition methods. STABC-IR solves the problem of intention transformation for the first time and considers both temporality and interpretability, which is important for improving tactical intention recognition capability and has reference value for the construction of command and control auxiliary decision-making systems.
基金Postdoctoral Granted Financial Support from China Postdoctoral Science Foundation(20100481307)Natural Science Foundation of Shanxi Province,China(No.2009011018-3)
文摘This paper proposes the generalized regression neural network(GRNN)model and multi-GRNN model with a gating network by selecting the data of Shanghai index,the stocks of Shanghai Pudong Development Bank(SPDB),Dongfeng Automobile and Baotou Steel.We analyze the two models using Matlab software to predict the opening price respectively.Through building a softmax excitation function,the multi-GRNN model with a gating network can obtain the best weights.Using the data of the four groups,the average of forecasting errors of 4 groups by GRNN neural model is 0.012 208,while the average of the multi-GRNN models's with a gating network is 0.002 659.Compared with the real data,it is found that the both results predicted by the two models have small mean square prediction errors.So the two models are suitable to be adopted to process a large quantity of data,furthermore the multi-GRNN model with a gating network is better than the GRNN model.
基金supported by the National Natural Science Foundation of China(No.62302540)with author Fangfang Shan.For more information,please visit their website at https://www.nsfc.gov.cn/(accessed on 31/05/2024)+3 种基金Additionally,it is also funded by the Open Foundation of Henan Key Laboratory of Cyberspace Situation Awareness(No.HNTS2022020)where Fangfang Shan is an author.Further details can be found at http://xt.hnkjt.gov.cn/data/pingtai/(accessed on 31/05/2024)supported by the Natural Science Foundation of Henan Province Youth Science Fund Project(No.232300420422)for more information,you can visit https://kjt.henan.gov.cn/2022/09-02/2599082.html(accessed on 31/05/2024).
文摘Social media has become increasingly significant in modern society,but it has also turned into a breeding ground for the propagation of misleading information,potentially causing a detrimental impact on public opinion and daily life.Compared to pure text content,multmodal content significantly increases the visibility and share ability of posts.This has made the search for efficient modality representations and cross-modal information interaction methods a key focus in the field of multimodal fake news detection.To effectively address the critical challenge of accurately detecting fake news on social media,this paper proposes a fake news detection model based on crossmodal message aggregation and a gated fusion network(MAGF).MAGF first uses BERT to extract cumulative textual feature representations and word-level features,applies Faster Region-based ConvolutionalNeuralNetwork(Faster R-CNN)to obtain image objects,and leverages ResNet-50 and Visual Geometry Group-19(VGG-19)to obtain image region features and global features.The image region features and word-level text features are then projected into a low-dimensional space to calculate a text-image affinity matrix for cross-modal message aggregation.The gated fusion network combines text and image region features to obtain adaptively aggregated features.The interaction matrix is derived through an attention mechanism and further integrated with global image features using a co-attention mechanism to producemultimodal representations.Finally,these fused features are fed into a classifier for news categorization.Experiments were conducted on two public datasets,Twitter and Weibo.Results show that the proposed model achieves accuracy rates of 91.8%and 88.7%on the two datasets,respectively,significantly outperforming traditional unimodal and existing multimodal models.
基金supported in part by the National Natural Science Foundation of China (No. 12202363)。
文摘Modeling of unsteady aerodynamic loads at high angles of attack using a small amount of experimental or simulation data to construct predictive models for unknown states can greatly improve the efficiency of aircraft unsteady aerodynamic design and flight dynamics analysis.In this paper,aiming at the problems of poor generalization of traditional aerodynamic models and intelligent models,an intelligent aerodynamic modeling method based on gated neural units is proposed.The time memory characteristics of the gated neural unit is fully utilized,thus the nonlinear flow field characterization ability of the learning and training process is enhanced,and the generalization ability of the whole prediction model is improved.The prediction and verification of the model are carried out under the maneuvering flight condition of NACA0015 airfoil.The results show that the model has good adaptability.In the interpolation prediction,the maximum prediction error of the lift and drag coefficients and the moment coefficient does not exceed 10%,which can basically represent the variation characteristics of the entire flow field.In the construction of extrapolation models,the training model based on the strong nonlinear data has good accuracy for weak nonlinear prediction.Furthermore,the error is larger,even exceeding 20%,which indicates that the extrapolation and generalization capabilities need to be further optimized by integrating physical models.Compared with the conventional state space equation model,the proposed method can improve the extrapolation accuracy and efficiency by 78%and 60%,respectively,which demonstrates the applied potential of this method in aerodynamic modeling.
基金jointly supported by the National Science Foundation of China (Grant Nos. 42275007 and 41865003)Jiangxi Provincial Department of science and technology project (Grant No. 20171BBG70004)。
文摘The Gated Recurrent Unit(GRU) neural network has great potential in estimating and predicting a variable. In addition to radar reflectivity(Z), radar echo-top height(ET) is also a good indicator of rainfall rate(R). In this study, we propose a new method, GRU_Z-ET, by introducing Z and ET as two independent variables into the GRU neural network to conduct the quantitative single-polarization radar precipitation estimation. The performance of GRU_Z-ET is compared with that of the other three methods in three heavy rainfall cases in China during 2018, namely, the traditional Z-R relationship(Z=300R1.4), the optimal Z-R relationship(Z=79R1.68) and the GRU neural network with only Z as the independent input variable(GRU_Z). The results indicate that the GRU_Z-ET performs the best, while the traditional Z-R relationship performs the worst. The performances of the rest two methods are similar.To further evaluate the performance of the GRU_Z-ET, 200 rainfall events with 21882 total samples during May–July of 2018 are used for statistical analysis. Results demonstrate that the spatial correlation coefficients, threat scores and probability of detection between the observed and estimated precipitation are the largest for the GRU_Z-ET and the smallest for the traditional Z-R relationship, and the root mean square error is just the opposite. In addition, these statistics of GRU_Z are similar to those of optimal Z-R relationship. Thus, it can be concluded that the performance of the GRU_ZET is the best in the four methods for the quantitative precipitation estimation.
文摘In recent years,real-time video streaming has grown in popularity.The growing popularity of the Internet of Things(IoT)and other wireless heterogeneous networks mandates that network resources be carefully apportioned among versatile users in order to achieve the best Quality of Experience(QoE)and performance objectives.Most researchers focused on Forward Error Correction(FEC)techniques when attempting to strike a balance between QoE and performance.However,as network capacity increases,the performance degrades,impacting the live visual experience.Recently,Deep Learning(DL)algorithms have been successfully integrated with FEC to stream videos across multiple heterogeneous networks.But these algorithms need to be changed to make the experience better without sacrificing packet loss and delay time.To address the previous challenge,this paper proposes a novel intelligent algorithm that streams video in multi-home heterogeneous networks based on network-centric characteristics.The proposed framework contains modules such as Intelligent Content Extraction Module(ICEM),Channel Status Monitor(CSM),and Adaptive FEC(AFEC).This framework adopts the Cognitive Learning-based Scheduling(CLS)Module,which works on the deep Reinforced Gated Recurrent Networks(RGRN)principle and embeds them along with the FEC to achieve better performances.The complete framework was developed using the Objective Modular Network Testbed in C++(OMNET++),Internet networking(INET),and Python 3.10,with Keras as the front end and Tensorflow 2.10 as the back end.With extensive experimentation,the proposed model outperforms the other existing intelligentmodels in terms of improving the QoE,minimizing the End-to-End Delay(EED),and maintaining the highest accuracy(98%)and a lower Root Mean Square Error(RMSE)value of 0.001.
文摘Diabetes mellitus is a metabolic disease in which blood glucose levels rise as a result of pancreatic insulin production failure.It causes hyperglycemia and chronic multiorgan dysfunction,including blindness,renal failure,and cardi-ovascular disease,if left untreated.One of the essential checks that are needed to be performed frequently in Type 1 Diabetes Mellitus is a blood test,this procedure involves extracting blood quite frequently,which leads to subject discomfort increasing the possibility of infection when the procedure is often recurring.Exist-ing methods used for diabetes classification have less classification accuracy and suffer from vanishing gradient problems,to overcome these issues,we proposed stacking ensemble learning-based convolutional gated recurrent neural network(CGRNN)Metamodel algorithm.Our proposed method initially performs outlier detection to remove outlier data,using the Gaussian distribution method,and the Box-cox method is used to correctly order the dataset.After the outliers’detec-tion,the missing values are replaced by the data’s mean rather than their elimina-tion.In the stacking ensemble base model,multiple machine learning algorithms like Naïve Bayes,Bagging with random forest,and Adaboost Decision tree have been employed.CGRNN Meta model uses two hidden layers Long-Short-Time Memory(LSTM)and Gated Recurrent Unit(GRU)to calculate the weight matrix for diabetes prediction.Finally,the calculated weight matrix is passed to the soft-max function in the output layer to produce the diabetes prediction results.By using LSTM-based CG-RNN,the mean square error(MSE)value is 0.016 and the obtained accuracy is 91.33%.
基金This work is supported by Nanjing Institute of Technology(NIT)fund for Research Startup Projects of Introduced talents under Grant No.YKJ202019Nature Sci-ence Research Project of Higher Education Institutions in Jiangsu Province under Grant No.21KJB510018+1 种基金National Nature Science Foundation of China(NSFC)under Grant No.62001215NIT fund for Doctoral Research Projects under Grant No.ZKJ2020003.
文摘Speech separation is an active research topic that plays an important role in numerous applications,such as speaker recognition,hearing pros-thesis,and autonomous robots.Many algorithms have been put forward to improve separation performance.However,speech separation in reverberant noisy environment is still a challenging task.To address this,a novel speech separation algorithm using gate recurrent unit(GRU)network based on microphone array has been proposed in this paper.The main aim of the proposed algorithm is to improve the separation performance and reduce the computational cost.The proposed algorithm extracts the sub-band steered response power-phase transform(SRP-PHAT)weighted by gammatone filter as the speech separation feature due to its discriminative and robust spatial position in formation.Since the GRU net work has the advantage of processing time series data with faster training speed and fewer training parameters,the GRU model is adopted to process the separation featuresof several sequential frames in the same sub-band to estimate the ideal Ratio Masking(IRM).The proposed algorithm decomposes the mixture signals into time-frequency(TF)units using gammatone filter bank in the frequency domain,and the target speech is reconstructed in the frequency domain by masking the mixture signal according to the estimated IRM.The operations of decomposing the mixture signal and reconstructing the target signal are completed in the frequency domain which can reduce the total computational cost.Experimental results demonstrate that the proposed algorithm realizes omnidirectional speech sep-aration in noisy and reverberant environments,provides good performance in terms of speech quality and intelligibility,and has the generalization capacity to reverberate.
基金support from the National Science and Technology Council of Taiwan(Contract Nos.111-2221 E-011081 and 111-2622-E-011019)the support from Intelligent Manufacturing Innovation Center(IMIC),National Taiwan University of Science and Technology(NTUST),Taipei,Taiwan,which is a Featured Areas Research Center in Higher Education Sprout Project of Ministry of Education(MOE),Taiwan(since 2023)was appreciatedWe also thank Wang Jhan Yang Charitable Trust Fund(Contract No.WJY 2020-HR-01)for its financial support.
文摘This study proposed a new real-time manufacturing process monitoring method to monitor and detect process shifts in manufacturing operations.Since real-time production process monitoring is critical in today’s smart manufacturing.The more robust the monitoring model,the more reliable a process is to be under control.In the past,many researchers have developed real-time monitoring methods to detect process shifts early.However,thesemethods have limitations in detecting process shifts as quickly as possible and handling various data volumes and varieties.In this paper,a robust monitoring model combining Gated Recurrent Unit(GRU)and Random Forest(RF)with Real-Time Contrast(RTC)called GRU-RF-RTC was proposed to detect process shifts rapidly.The effectiveness of the proposed GRU-RF-RTC model is first evaluated using multivariate normal and nonnormal distribution datasets.Then,to prove the applicability of the proposed model in a realmanufacturing setting,the model was evaluated using real-world normal and non-normal problems.The results demonstrate that the proposed GRU-RF-RTC outperforms other methods in detecting process shifts quickly with the lowest average out-of-control run length(ARL1)in all synthesis and real-world problems under normal and non-normal cases.The experiment results on real-world problems highlight the significance of the proposed GRU-RF-RTC model in modern manufacturing process monitoring applications.The result reveals that the proposed method improves the shift detection capability by 42.14%in normal and 43.64%in gamma distribution problems.
文摘As the smart home is the end-point power consumer, it is the major part to be controlled in a smart micro grid. There are so many challenges for implementing a smart home system in which the most important ones are the cost and simplicity of the implementation method. It is clear that the major share of the total cost is referred to the internal controlling system network; although there are too many methods proposed but still there is not any satisfying method at the consumers' point of view. In this paper, a novel solution for this demand is proposed, which not only minimizes the implementation cost, but also provides a high level of reliability and simplicity of operation; feasibility, extendibility, and flexibility are other leading properties of the design.
基金supported by the MSIT(Ministry of Science and ICT),Korea,under the ITRC(Information Technology Research Center)support program(IITP-2019-2016-0-00313)supervised by the IITP(Institute for Information&communication Technology Promotion)+1 种基金supported by Basic Science Research Program through the National Research Foundation of Korea(NRF)funded by the Ministry of Science,ICT and Future Planning(2017R1E1A1A01074345).
文摘Predominantly the localization accuracy of the magnetic field-based localization approaches is severed by two limiting factors:Smartphone heterogeneity and smaller data lengths.The use of multifarioussmartphones cripples the performance of such approaches owing to the variability of the magnetic field data.In the same vein,smaller lengths of magnetic field data decrease the localization accuracy substantially.The current study proposes the use of multiple neural networks like deep neural network(DNN),long short term memory network(LSTM),and gated recurrent unit network(GRN)to perform indoor localization based on the embedded magnetic sensor of the smartphone.A voting scheme is introduced that takes predictions from neural networks into consideration to estimate the current location of the user.Contrary to conventional magnetic field-based localization approaches that rely on the magnetic field data intensity,this study utilizes the normalized magnetic field data for this purpose.Training of neural networks is carried out using Galaxy S8 data while the testing is performed with three devices,i.e.,LG G7,Galaxy S8,and LG Q6.Experiments are performed during different times of the day to analyze the impact of time variability.Results indicate that the proposed approach minimizes the impact of smartphone variability and elevates the localization accuracy.Performance comparison with three approaches reveals that the proposed approach outperforms them in mean,50%,and 75%error even using a lesser amount of magnetic field data than those of other approaches.
基金funded by“The Pearl River Talent Recruitment Program”of Guangdong Province in 2019(Grant No.2019CX01G338)the Research Funding of Shantou University for New Faculty Member(Grant No.NTF19024-2019).
文摘An accurate prediction of earth pressure balance(EPB)shield moving performance is important to ensure the safety tunnel excavation.A hybrid model is developed based on the particle swarm optimization(PSO)and gated recurrent unit(GRU)neural network.PSO is utilized to assign the optimal hyperparameters of GRU neural network.There are mainly four steps:data collection and processing,hybrid model establishment,model performance evaluation and correlation analysis.The developed model provides an alternative to tackle with time-series data of tunnel project.Apart from that,a novel framework about model application is performed to provide guidelines in practice.A tunnel project is utilized to evaluate the performance of proposed hybrid model.Results indicate that geological and construction variables are significant to the model performance.Correlation analysis shows that construction variables(main thrust and foam liquid volume)display the highest correlation with the cutterhead torque(CHT).This work provides a feasible and applicable alternative way to estimate the performance of shield tunneling.
基金supported by the Jiangsu Provincial Key R&D Programme(BE2020034)China Huaneng Group Science and Technology Project(HNKJ20-H72).
文摘Harsh working environments and wear between blades and other unit components can easily lead to cracks and damage on wind turbine blades.The cracks on the blades can endanger the shafting of the generator set,the tower and other components,and even cause the tower to collapse.To achieve high-precision wind blade crack detection,this paper proposes a crack fault-detection strategy that integratesGated ResidualNetwork(GRN),a fusionmodule and Transformer.Firstly,GRNcan reduce unnecessary noisy inputs that could negatively impact performancewhile preserving the integrity of feature information.In addition,to gain in-depth information about the characteristics of wind turbine blades,a fusionmodule is suggested to implement the information fusion of wind turbine features.Specifically,each fan feature ismapped to a one-dimensional vector with the same length,and all one-dimensional vectors are concatenated to obtain a two-dimensional vector.And then,in the fusion module,the information fusion of the same characteristic variables in the different channels is realized through the Channel-mixing MLP,and the information fusion of different characteristic variables in the same channel is realized through the ColumnmixingMLP.Finally,the fused feature vector is input into the Transformer for feature learning,which enhances the influence of important feature information and improves the model’s anti-noise ability and classification accuracy.Extensive experimentswere conducted on the wind turbine supervisory control and data acquisition(SCADA)data froma domesticwind field.The results show that compared with other state-of-the-artmodels,including XGBoost,LightGBM,TabNet,etc.,the F1-score of proposed gated fusion based Transformer model can reach 0.9907,which is 0.4%-2.09% higher than the comparedmodels.Thismethod provides amore reliable approach for the condition detection and maintenance of fan blades in wind farms.
文摘Deep learning has risen in popularity as a face recognition technology in recent years.Facenet,a deep convolutional neural network(DCNN)developed by Google,recognizes faces with 128 bytes per face.It also claims to have achieved 99.96%on the reputed Labelled Faces in the Wild(LFW)dataset.How-ever,the accuracy and validation rate of Facenet drops down eventually,there is a gradual decrease in the resolution of the images.This research paper aims at developing a new facial recognition system that can produce a higher accuracy rate and validation rate on low-resolution face images.The proposed system Extended Openface performs facial recognition by using three different features i)facial landmark ii)head pose iii)eye gaze.It extracts facial landmark detection using Scattered Gated Expert Network Constrained Local Model(SGEN-CLM).It also detects the head pose and eye gaze using Enhanced Constrained Local Neur-alfield(ECLNF).Extended openface employs a simple Support Vector Machine(SVM)for training and testing the face images.The system’s performance is assessed on low-resolution datasets like LFW,Indian Movie Face Database(IMFDB).The results demonstrated that Extended Openface has a better accuracy rate(12%)and validation rate(22%)than Facenet on low-resolution images.
文摘Near crash events are often regarded as an excellent surrogate measure for traffic safety research because they include abrupt changes in vehicle kinematics that can lead to deadly accident scenarios. In this paper, we introduced machine learning and deep learning algorithms for predicting near crash events using LiDAR data at a signalized intersection. To predict a near crash occurrence, we used essential vehicle kinematic variables such as lateral and longitudinal velocity, yaw, tracking status of LiDAR, etc. A deep learning hybrid model Convolutional Gated Recurrent Neural Network (CNN + GRU) was introduced, and comparative performances were evaluated with multiple machine learning classification models such as Logistic Regression, K Nearest Neighbor, Decision Tree, Random Forest, Adaptive Boost, and deep learning models like Long Short-Term Memory (LSTM). As vehicle kinematics changes occur after sudden brake, we considered average deceleration and kinematic energy drop as thresholds to identify near crashes after vehicle braking time . We looked at the next 3 seconds of this braking time as our prediction horizon. All models work best in the next 1-second prediction horizon to braking time. The results also reveal that our hybrid model gathers the greatest near crash information while working flawlessly. In comparison to existing models for near crash prediction, our hybrid Convolutional Gated Recurrent Neural Network model has 100% recall, 100% precision, and 100% F1-score: accurately capturing all near crashes. This prediction performance outperforms previous baseline models in forecasting near crash events and provides opportunities for improving traffic safety via Intelligent Transportation Systems (ITS).
Abstract: Three recent breakthroughs due to AI in arts and science serve as motivation: an award-winning digital image, protein folding, and fast matrix multiplication. Many recent developments in artificial neural networks, particularly deep learning (DL), applied and relevant to computational mechanics (solids, fluids, finite-element technology) are reviewed in detail. Both hybrid and pure machine learning (ML) methods are discussed. Hybrid methods combine traditional PDE discretizations with ML methods either (1) to help model complex nonlinear constitutive relations, (2) to nonlinearly reduce the model order for efficient simulation (turbulence), or (3) to accelerate the simulation by predicting certain components in the traditional integration methods. Here, methods (1) and (2) rely on the Long Short-Term Memory (LSTM) architecture, while method (3) relies on convolutional neural networks. Pure ML methods to solve (nonlinear) PDEs are represented by Physics-Informed Neural Network (PINN) methods, which can be combined with an attention mechanism to address discontinuous solutions. Both LSTM and attention architectures, together with modern and generalized classic optimizers that include stochasticity for DL networks, are extensively reviewed. Kernel machines, including Gaussian processes, are treated in sufficient depth for more advanced works such as shallow networks with infinite width. The review does not only address experts: readers are assumed to be familiar with computational mechanics, but not with DL, whose concepts and applications are built up from the basics, aiming at bringing first-time learners quickly to the forefront of research. The history and limitations of AI are recounted and discussed, with particular attention to pointing out misstatements or misconceptions of the classics, even in well-known references. Positioning and pointing control of a large-deformable beam is given as an example.
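For readers new to the pure-ML methods mentioned above, here is a minimal, self-contained PINN sketch in PyTorch for a toy boundary-value problem u''(x) = -sin(x) on [0, π] with u(0) = u(π) = 0 (exact solution u(x) = sin(x)); the network width, learning rate and iteration count are arbitrary choices of this example, not recommendations from the review.

```python
import torch
import torch.nn as nn

# Small fully connected network approximating u(x).
net = nn.Sequential(nn.Linear(1, 32), nn.Tanh(),
                    nn.Linear(32, 32), nn.Tanh(),
                    nn.Linear(32, 1))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

for step in range(2000):
    x = torch.rand(128, 1) * torch.pi            # collocation points in [0, pi]
    x.requires_grad_(True)
    u = net(x)
    du = torch.autograd.grad(u, x, torch.ones_like(u), create_graph=True)[0]
    d2u = torch.autograd.grad(du, x, torch.ones_like(du), create_graph=True)[0]
    pde_loss = ((d2u + torch.sin(x)) ** 2).mean()   # residual of u'' = -sin(x)
    bc = torch.tensor([[0.0], [torch.pi]])
    bc_loss = (net(bc) ** 2).mean()                 # enforce u(0) = u(pi) = 0
    loss = pde_loss + bc_loss
    opt.zero_grad()
    loss.backward()
    opt.step()
```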
Funding: The authors appreciate the financial support provided by the Natural Science Foundation of China (No. 41807294). This study was also financially supported by the China Geological Survey Project (Nos. DD20190716 and 0001212020CC60002).
Abstract: Landslide displacement prediction can enhance the efficacy of landslide monitoring systems, and the prediction of the periodic displacement is particularly challenging. In previous studies, static regression models (e.g., the support vector machine (SVM)) were mostly used for predicting the periodic displacement. These models may perform poorly when the dynamic features of landslide triggers are incorporated. This paper proposes a method for predicting the landslide displacement in a dynamic manner, based on the gated recurrent unit (GRU) neural network and complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN). CEEMDAN is used to decompose the training data, and the GRU is subsequently used for predicting the periodic displacement. The implementation procedure of the proposed method is illustrated by a case study in the Caojiatuo landslide area, where SVM was also adopted for the periodic displacement prediction. This case study shows that the predictions obtained by SVM are inaccurate, as the landslide displacement evolves in a pronounced step-wise manner. By contrast, the accuracy can be significantly improved using the proposed dynamic predictive method. This paper reveals the significance of capturing the dynamic features of the inputs in the training process when machine learning models are adopted to predict the landslide displacement.
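A rough sketch of the decompose-then-predict idea follows, assuming the third-party PyEMD package for CEEMDAN and PyTorch for the GRU; the synthetic displacement series, window length and training settings are placeholders of this example, not the case-study data.

```python
import numpy as np
import torch
import torch.nn as nn
from PyEMD import CEEMDAN

# Synthetic displacement: trend + periodic component + noise (placeholder data).
t = np.linspace(0, 20, 400)
displacement = 0.5 * t + np.sin(2 * np.pi * t / 5) + 0.1 * np.random.randn(t.size)

imfs = CEEMDAN()(displacement)      # rows: intrinsic mode functions + residue (trend)
periodic = imfs[0]                  # pick one oscillatory component to predict

# Build (window -> next value) training pairs for the GRU.
win = 12
X = np.stack([periodic[i:i + win] for i in range(len(periodic) - win)])
y = periodic[win:]
X = torch.tensor(X, dtype=torch.float32).unsqueeze(-1)   # (N, win, 1)
y = torch.tensor(y, dtype=torch.float32).unsqueeze(-1)   # (N, 1)

class GRUPredictor(nn.Module):
    def __init__(self, hidden=32):
        super().__init__()
        self.gru = nn.GRU(1, hidden, batch_first=True)
        self.out = nn.Linear(hidden, 1)
    def forward(self, x):
        _, h = self.gru(x)
        return self.out(h[-1])      # one-step-ahead displacement prediction

model = GRUPredictor()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()
for _ in range(200):
    opt.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()
    opt.step()
```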
Funding: Supported by the German National BMBF IKT2020 Grant (16SV7213) (EmotAsS), the European Union's Horizon 2020 Research and Innovation Programme (688835) (DE-ENIGMA), and the China Scholarship Council (CSC).
Abstract: Spectrogram representations of acoustic scenes have achieved competitive performance for acoustic scene classification. Yet, the spectrogram alone does not capture a substantial amount of time-frequency information. In this study, we present an approach for exploring the benefits of deep scalogram representations, extracted in segments from an audio stream. The approach firstly transforms the segmented acoustic scenes into bump and Morse scalograms, as well as spectrograms; secondly, the spectrograms or scalograms are fed into pre-trained convolutional neural networks; thirdly, the features extracted from a subsequent fully connected layer are fed into (bidirectional) gated recurrent neural networks, which are followed by a single highway layer and a softmax layer; finally, the predictions from these three systems are fused by a margin sampling value strategy. We then evaluate the proposed approach using the acoustic scene classification data set of the 2017 IEEE AASP Challenge on Detection and Classification of Acoustic Scenes and Events (DCASE). On the evaluation set, an accuracy of 64.0% from bidirectional gated recurrent neural networks is obtained when fusing the spectrogram and the bump scalogram, which is an improvement on the 61.0% baseline result provided by the DCASE 2017 organisers. This result shows that the extracted bump scalograms are capable of improving the classification accuracy when fused with a spectrogram-based system.
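To make the back-end concrete, here is a hedged PyTorch sketch of the (bidirectional) GRU, single highway layer and softmax classifier operating on sequences of CNN features; the scalogram extraction and pre-trained CNN are replaced by random placeholder features, and the dimensions and class count are assumptions of this example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Highway(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.transform = nn.Linear(dim, dim)
        self.gate = nn.Linear(dim, dim)
    def forward(self, x):
        g = torch.sigmoid(self.gate(x))            # gate between transform and carry paths
        return g * F.relu(self.transform(x)) + (1 - g) * x

class SceneClassifier(nn.Module):
    def __init__(self, feat_dim=512, hidden=128, n_classes=15):
        super().__init__()
        self.bigru = nn.GRU(feat_dim, hidden, batch_first=True, bidirectional=True)
        self.highway = Highway(2 * hidden)
        self.out = nn.Linear(2 * hidden, n_classes)
    def forward(self, x):                          # x: (batch, segments, feat_dim)
        h, _ = self.bigru(x)
        z = self.highway(h[:, -1])                 # last time step of the BiGRU
        return F.log_softmax(self.out(z), dim=-1)  # class log-probabilities

# e.g. 10 audio segments, each already reduced to a 512-d CNN feature vector
log_probs = SceneClassifier()(torch.randn(4, 10, 512))
```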
Abstract: Lung nodules exhibit huge variation in structural properties such as shape and surface texture. Their spatial properties also vary: they can be found attached to lung walls or blood vessels within complex, non-homogeneous lung structures. Moreover, the nodules are small at their early stage of development. This poses a serious challenge to developing a computer-aided diagnosis (CAD) system with better false-positive reduction. Hence, to reduce the false positives per scan and to deal with the challenges mentioned, this paper proposes a set of three diverse 3D attention-based CNN architectures (3D ACNN) whose predictions on given low-dose volumetric computed tomography (CT) scans are fused to achieve more effective and reliable results. The attention mechanism is employed to selectively concentrate on, and give more weight to, nodule-specific features, and less weight to other, irrelevant features. Using this attention-based mechanism in the CNN, unlike traditional methods, yields a significant gain in classification performance. Contextual dependencies are also taken into account by giving three patches of different sizes surrounding the nodule as input to the ACNN architectures. The system is trained and validated on the publicly available LUNA16 dataset in a 10-fold cross-validation approach, where a competition performance metric (CPM) score of 0.931 is achieved. The experimental results demonstrate that neither a single patch nor a single architecture used in a one-to-one fashion, as adopted in earlier methods, can achieve better performance, which signifies the necessity of fusing different multi-patch architectures. Though the proposed system is mainly designed for pulmonary nodule detection, it can be easily extended to classification tasks on any other 3D medical diagnostic computed tomography images where there is huge variation and uncertainty in classification.
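The snippet below is an illustrative PyTorch sketch of an attention-gated 3D CNN applied to CT patches of three assumed sizes, with a simple average fusion of the per-architecture predictions; the layer configuration and patch sizes are this example's assumptions, not the paper's 3D ACNN design.

```python
import torch
import torch.nn as nn

class AttentionGate3D(nn.Module):
    """Voxel-wise attention map that re-weights the feature volume."""
    def __init__(self, channels):
        super().__init__()
        self.att = nn.Sequential(nn.Conv3d(channels, 1, kernel_size=1), nn.Sigmoid())
    def forward(self, x):
        return x * self.att(x)                 # emphasise nodule-specific voxels

class NoduleACNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv3d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool3d(2),
            AttentionGate3D(16),
            nn.Conv3d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool3d(2),
            AttentionGate3D(32))
        self.classifier = nn.Sequential(
            nn.AdaptiveAvgPool3d(1), nn.Flatten(), nn.Linear(32, 1))
    def forward(self, x):                      # x: (batch, 1, D, H, W) CT patch
        return self.classifier(self.features(x))

# Three patch sizes around each candidate; the per-model scores are fused by averaging.
models = [NoduleACNN() for _ in range(3)]
patches = [torch.randn(2, 1, s, s, s) for s in (16, 24, 32)]
fused = torch.stack([m(p) for m, p in zip(models, patches)]).mean(0)
```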
Funding: Supported by the National Natural Science Foundation of China (Nos. 62106283 and 72001214).
Abstract: The battlefield environment is changing rapidly, and fast, accurate identification of the tactical intention of enemy targets is an important condition for gaining a decision-making advantage. Current intention recognition (IR) methods for air targets have shortcomings in temporality, interpretability and the back-and-forth dependency of intentions. To address these problems, this paper designs a novel air-target intention recognition method named STABC-IR, which is based on a Bidirectional Gated Recurrent Unit (BiGRU) and a Conditional Random Field (CRF) with a Space-Time Attention mechanism (STA). First, the problem of intention recognition of air targets is described and analyzed in detail. Then, a temporal network based on the BiGRU is constructed to meet the temporality requirement. Subsequently, the STA is proposed to focus on the key parts of the features and timing information, meeting certain interpretability requirements while strengthening the temporal modelling. Finally, an intention transformation network based on the CRF is proposed to solve the back-and-forth dependency and transformation problem by jointly modeling the tactical intention of the target at each moment. The experimental results show that the recognition accuracy of the jointly trained STABC-IR model can reach 95.7%, which is higher than that of other recent intention recognition methods. STABC-IR solves the problem of intention transformation for the first time and considers both temporality and interpretability, which is important for improving tactical intention recognition capability and has reference value for the construction of command and control auxiliary decision-making systems.
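As a rough, non-authoritative sketch of the BiGRU + attention + CRF combination, the code below uses PyTorch together with the third-party pytorch-crf package (an assumption of this example) and replaces the space-time attention with a plain temporal attention; the feature dimension, sequence length and number of intention classes are illustrative.

```python
import torch
import torch.nn as nn
from torchcrf import CRF

class IntentionTagger(nn.Module):
    def __init__(self, feat_dim=12, hidden=64, n_intentions=7):
        super().__init__()
        self.bigru = nn.GRU(feat_dim, hidden, batch_first=True, bidirectional=True)
        self.att = nn.Linear(2 * hidden, 1)            # simplified temporal attention weights
        self.emit = nn.Linear(2 * hidden, n_intentions)
        self.crf = CRF(n_intentions, batch_first=True) # models intention transitions over time

    def forward(self, x, tags=None):                   # x: (batch, time, feat_dim)
        h, _ = self.bigru(x)
        w = torch.softmax(self.att(h), dim=1)          # attention over time steps
        emissions = self.emit(h * w)                   # re-weighted per-step emissions
        if tags is not None:
            return -self.crf(emissions, tags)          # negative log-likelihood for training
        return self.crf.decode(emissions)              # most likely intention sequence

model = IntentionTagger()
x = torch.randn(2, 20, 12)                             # 20 time steps of target state features
tags = torch.randint(0, 7, (2, 20))                    # placeholder intention labels
loss = model(x, tags)
pred = model(x)
```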