A correct and timely fault diagnosis is important for improving the safety and reliability of chemical processes. With the advancement of big data technology, data-driven fault diagnosis methods are being extensively used and still have considerable potential. In recent years, methods based on deep neural networks have made significant breakthroughs, and fault diagnosis methods for industrial processes based on deep learning have attracted considerable research attention. Therefore, we propose a fusion deep learning algorithm based on a fully convolutional neural network (FCN) to extract features and build models to correctly diagnose all types of faults. We use long short-term memory (LSTM) units to expand our proposed FCN so that the model can better extract the time-domain features of chemical process data. We also introduce an attention mechanism into the model to highlight the importance of individual features, which is significant for the fault diagnosis of chemical processes with many features. When applied to the benchmark Tennessee Eastman process, our proposed model exhibits impressive performance, demonstrating the effectiveness of the attention-based LSTM-FCN in chemical process fault diagnosis.
There are two technical challenges in predicting slope deformation. The first is the random displacement, which cannot be decomposed and predicted by numerically resolving the observed accumulated displacement and time series of a landslide. The second is the dynamic evolution of a landslide, which cannot be feasibly simulated by traditional prediction models alone. In this paper, a dynamic displacement prediction model is introduced for composite landslides based on a combination of empirical mode decomposition with soft screening stop criteria (SSSC-EMD) and a deep bidirectional long short-term memory (DBi-LSTM) neural network. In the proposed model, time series analysis and SSSC-EMD are used to decompose the observed accumulated displacements of a slope into three components, viz. trend displacement, periodic displacement, and random displacement. Then, by analyzing the evolution pattern of a landslide and the key factors triggering it, appropriate influencing factors are selected for each displacement component, and the DBi-LSTM neural network is used to carry out multi-data-driven dynamic prediction for each displacement component. The accumulated displacement prediction is obtained by summing the component predictions. To verify the accuracy and engineering practicability of the model, field observations from two known landslides in China, the Xintan landslide and the Bazimen landslide, were collected for comparison and evaluation. The case study verified that the proposed model can better characterize the "stepwise" deformation characteristics of a slope. Compared with the long short-term memory (LSTM) neural network, support vector machine (SVM), and autoregressive integrated moving average (ARIMA) model, the DBi-LSTM neural network has higher accuracy in predicting the periodic displacement of slope deformation, with the mean absolute percentage error reduced by 3.063%, 14.913%, and 13.960%, respectively, and the root mean square error reduced by 1.951 mm, 8.954 mm, and 7.790 mm, respectively. In conclusion, this model not only has high prediction accuracy but is also more stable, which can provide new insight for practical landslide prevention and control engineering.
To explore new operational forecasting methods for waves, a forecasting model for wave heights at three stations in the Bohai Sea has been developed. This model is based on a long short-term memory (LSTM) neural network with sea surface wind and wave heights as training samples. The prediction performance of the model is evaluated, and the error analysis shows that when using the same set of numerically predicted sea surface wind as input, the prediction error produced by the proposed LSTM model at Sta. N01 is 20%, 18%, and 23% lower than that of the conventional numerical wave models in terms of the total root mean square error (RMSE), scatter index (SI), and mean absolute error (MAE), respectively. In particular, for significant wave heights in the range of 3–5 m, the prediction accuracy of the LSTM model improves the most, with RMSE, SI, and MAE all decreasing by 24%. It is also evident that the number of hidden neurons, the number of buoys used, and the time length of training samples all affect the prediction accuracy. However, the prediction does not necessarily improve as the number of hidden neurons or the number of buoys increases. The experiment trained on data with the longest time length is found to perform the best overall compared with experiments using a shorter training time length. Overall, the long short-term memory neural network proved to be a very promising method for future development and applications in wave forecasting.
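The three error measures named in this abstract can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: the function names are hypothetical, and the scatter index is taken as RMSE divided by the mean observed value, which is one common definition and an assumption here since the abstract does not define it.

```python
import math


def rmse(obs, pred):
    """Root mean square error between observed and predicted values."""
    return math.sqrt(sum((p - o) ** 2 for o, p in zip(obs, pred)) / len(obs))


def mae(obs, pred):
    """Mean absolute error between observed and predicted values."""
    return sum(abs(p - o) for o, p in zip(obs, pred)) / len(obs)


def scatter_index(obs, pred):
    """RMSE normalized by the mean observation (assumed definition of SI)."""
    return rmse(obs, pred) / (sum(obs) / len(obs))


def relative_reduction(baseline_error, model_error):
    """Fractional error reduction of a model relative to a baseline."""
    return (baseline_error - model_error) / baseline_error
```

A statement such as "20% lower RMSE" then corresponds to `relative_reduction(rmse_numerical, rmse_lstm)` being about 0.20, where both arguments are errors computed against the same observations.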
Purpose - To optimize train operations, dispatchers currently rely on experience for quick adjustments when delays occur. However, delay predictions often involve imprecise shifts based on known delay times. Real-time and accurate train delay predictions, facilitated by data-driven neural network models, can significantly reduce dispatcher stress and improve adjustment plans. Leveraging current train operation data, these models enable swift and precise predictions, addressing challenges posed by train delays in high-speed rail networks during unforeseen events. Design/methodology/approach - This paper proposes CBLA-net, a neural network architecture for predicting late arrival times. It combines CNN, Bi-LSTM, and attention mechanisms to extract features, handle time series data, and enhance information utilization. Trained on operational data from the Beijing-Tianjin line, it predicts the late arrival time of a target train at the next station using multidimensional input data from the target and preceding trains. Findings - This study evaluates the model's predictive performance using two data approaches: one considering full data and another focusing only on late arrivals. Results show precise and rapid predictions. Training with full data achieves an MAE of approximately 0.54 minutes and an RMSE of 0.65 minutes, surpassing the model trained solely on delay data (MAE of about 1.02 min, RMSE of about 1.52 min). Despite the superior overall performance with full data, the model excels at predicting delays exceeding 15 minutes when trained exclusively on late arrivals. For enhanced adaptability to real-world train operations, training with full data is recommended. Originality/value - This paper introduces a novel neural network model, CBLA-net, for predicting train delay times. It compares and analyzes the model's performance using both full data and delay data formats. Additionally, the evaluation of the network's predictive capabilities considers different scenarios, providing a comprehensive demonstration of the model's predictive performance.
In this paper, the recurrent neural network structure of a bidirectional long short-term memory network (Bi-LSTM), with special memory cells that store information, is used to characterize the deep features of the variation pattern between logging and seismic data. A mapping relationship model between high-frequency logging data and low-frequency seismic data is established via nonlinear mapping. The seismic waveform is approximated as closely as possible using the logging curve in the low-frequency band to obtain a nonlinear mapping model at this scale, which then approaches the logging curve in the high-frequency band step by step. Finally, a seismic inversion method of nonlinear mapping multilevel well–seismic matching based on the Bi-LSTM network is developed. The characteristic of this method is that, by applying the multilevel well–seismic matching process, the seismic data are matched step by step to the scale range that is consistent with the logging curve. Further, the matching operator at each level can be stably obtained, which effectively overcomes the problems that occur in the well–seismic matching process, such as the inconsistency in the scale of the two types of data, the accuracy of extracting the seismic wavelet from the well-side seismic traces, and the multiplicity of solutions. Model tests and practical application demonstrate that this method improves the vertical resolution of inversion results while preserving the boundary and lateral characteristics of the sand body, which improves the accuracy of thin-layer sand body prediction and the practical application effect.
Hand gestures are a natural means of human-robot interaction. Vision-based dynamic hand gesture recognition has become a hot research topic due to its various applications. This paper presents a novel deep learning network for hand gesture recognition. The network integrates several well-proven modules to learn both short-term and long-term features from video inputs while avoiding intensive computation. To learn short-term features, each video input is segmented into a fixed number of frame groups. A frame is randomly selected from each group and represented as an RGB image as well as an optical flow snapshot. These two entities are fused and fed into a convolutional neural network (ConvNet) for feature extraction. The ConvNets for all groups share parameters. To learn long-term features, outputs from all ConvNets are fed into a long short-term memory (LSTM) network, by which a final classification result is predicted. The new model has been tested with two popular hand gesture datasets, namely the Jester dataset and the Nvidia dataset. Compared with other models, our model produced very competitive results. The robustness of the new model has also been demonstrated on an augmented dataset with enhanced diversity of hand gestures.
To address the insufficient consideration of correlations between components in predicting the remaining life of mechanical equipment, a remaining life prediction method that combines the self-attention mechanism with the long short-term memory neural network (LSTM-NN), called Self-Attention-LSTM, is proposed. First, an auto-encoder is used to obtain the component-level state information; second, the state information of each component is input into the self-attention mechanism to learn the correlation between components; then, the multi-component correlation matrix is added to the LSTM input gate, and the LSTM-NN is used for life prediction. Finally, experiments were carried out on the commercial modular aero-propulsion system simulation (C-MAPSS) data set and compared with existing methods. The results show that the proposed method achieves better prediction accuracy, which verifies its feasibility.
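The self-attention step that turns per-component state vectors into a component-to-component weight matrix can be sketched with plain scaled dot-product attention. This is only an illustration under simplifying assumptions: the queries, keys, and values are all taken to be the raw state vectors (no learned projections), and the function names are hypothetical; the abstract does not specify how the paper parameterizes attention or how the matrix enters the LSTM input gate.

```python
import math


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(states):
    """Scaled dot-product self-attention with Q = K = V = states.

    states: list of n component state vectors, each of dimension d.
    Returns (weights, context): an n x n row-stochastic weight matrix
    and the n x d attention-weighted mixture of the state vectors.
    """
    d = len(states[0])
    scale = math.sqrt(d)
    scores = [[sum(a * b for a, b in zip(q, k)) / scale for k in states]
              for q in states]
    weights = [softmax(row) for row in scores]
    context = [[sum(w * v[j] for w, v in zip(w_row, states)) for j in range(d)]
               for w_row in weights]
    return weights, context
```

Each row of `weights` sums to one and indicates how strongly one component attends to every other component, which is the sense in which it can be read as a correlation matrix between components.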
Lithium-ion batteries are commonly used in electric vehicles, mobile phones, and laptops. These batteries demonstrate several advantages, such as environmental friendliness, high energy density, and long life. However, battery overcharging and overdischarging may occur if the batteries are not monitored continuously. Overcharging causes fire and explosion casualties, and overdischarging causes a reduction in the battery capacity and life. In addition, the internal resistance of such batteries varies depending on their external temperature, electrolyte, cathode material, and other factors; the capacity of the batteries decreases with temperature. In this study, we develop a method for estimating the state of charge (SOC) using a neural network model that is best suited to the external temperature of such batteries based on their characteristics. During our simulation, we acquired data at temperatures of 25°C, 30°C, 35°C, and 40°C. Based on the temperature parameters, the voltage, current, and time parameters were obtained, and six cycles of the parameters based on the temperature were used for the experiment. Experimental data to verify the proposed method were obtained through a discharge experiment conducted using a vehicle driving simulator. The experimental data were provided as inputs to three types of neural network models: multilayer neural network (MNN), long short-term memory (LSTM), and gated recurrent unit (GRU). The neural network models were trained and optimized for the specific temperatures measured during the experiment, and the SOC was estimated by selecting the most suitable model for each temperature. The experimental results revealed that the mean absolute errors of the MNN, LSTM, and GRU using the proposed method were 2.17%, 2.19%, and 2.15%, respectively, which are better than those of the conventional method (4.47%, 4.60%, and 4.40%). Finally, SOC estimation based on GRU using the proposed method, with a mean absolute error of 2.15%, was found to be the most accurate.
A precise and timely forecast of short-term rail transit passenger flow provides data support for traffic management and operation, assisting rail operators in efficiently allocating resources and promptly relieving pressure on passenger safety and operation. First, the passenger flow sequence models in the study are broken down using VMD for noise reduction. The objective environment features are then added to the characteristic factors that affect the passenger flow. The target station serves as an additional spatial feature and is mined concurrently using the KNN algorithm. Reference experiments with BP, CNN, and LSTM show that the hybrid model VMD-CLSMT has a higher prediction accuracy. All models' second-order prediction effects are superior to their first-order effects, showing that the residual network can significantly raise model prediction accuracy. The results also confirm the efficacy of the supplementary and objective environmental features.
Financial time series prediction, whether for classification or regression, has been a heated research topic over the last decade. While traditional machine learning algorithms have produced mediocre results, deep learning has largely contributed to the elevation of prediction performance. Currently, an up-to-date review of advanced machine learning techniques for financial time series prediction is still lacking, making it challenging for finance domain experts and relevant practitioners to determine which model potentially performs better, what techniques and components are involved, and how the model can be designed and implemented. This review article provides an overview of techniques, components, and frameworks for financial time series prediction, with an emphasis on state-of-the-art deep learning models in the literature from 2015 to 2023, including standalone models like convolutional neural networks (CNN), which are capable of extracting spatial dependencies within data, and long short-term memory (LSTM), which is designed for handling temporal dependencies; and hybrid models integrating CNN, LSTM, attention mechanism (AM), and other techniques. For illustration and comparison purposes, models proposed in recent studies are mapped to relevant elements of a generalized framework comprising input, output, feature extraction, prediction, and related processes. Among the state-of-the-art models, hybrid models like CNN-LSTM and CNN-LSTM-AM have in general been reported to outperform standalone models like the CNN-only model. Some remaining challenges are discussed, including non-friendliness for finance domain experts, delayed prediction, domain knowledge negligence, lack of standards, and the inability to make real-time and high-frequency predictions. The principal contributions of this paper are to provide a one-stop guide for both academia and industry to review, compare, and summarize technologies and recent advances in this area, to facilitate smooth and informed implementation, and to highlight future research directions.
Credit card fraud is a major issue for financial organizations and individuals. As fraudulent actions become more complex, the demand for better fraud detection systems is rising. Deep learning approaches have shown promise in several fields, including detecting credit card fraud. However, the efficacy of these models is heavily dependent on the careful selection of appropriate hyperparameters. This paper introduces models that integrate deep learning models with hyperparameter tuning techniques to learn the patterns and relationships within credit card transaction data, thereby improving fraud detection. Three deep learning models, AutoEncoder (AE), Convolution Neural Network (CNN), and Long Short-Term Memory (LSTM), are proposed to investigate how hyperparameter adjustment impacts the efficacy of deep learning models used to identify credit card fraud. The experiments conducted on a European credit card fraud dataset using different hyperparameters and three deep learning models demonstrate that the proposed models achieve a tradeoff between detection rate and precision, making them effective in accurately predicting credit card fraud. The results demonstrate that LSTM significantly outperformed AE and CNN in terms of accuracy (99.2%), detection rate (93.3%), and area under the curve (96.3%). These proposed models have surpassed those of existing studies and are expected to make a significant contribution to the field of credit card fraud detection.
Nowadays, with the rapid development of industrial Internet technology, advanced industrial control systems (ICS) have improved industrial production efficiency. However, there are more and more cyber-attacks targeting industrial control systems. To ensure the security of industrial networks, intrusion detection systems have been widely used in industrial control systems, and deep neural networks have been an effective method for identifying cyber-attacks. Current intrusion detection methods still suffer from low accuracy and a high false alarm rate. Therefore, it is important to build a more efficient intrusion detection model. This paper proposes a hybrid deep learning intrusion detection method based on convolutional neural networks and bidirectional long short-term memory neural networks (CNN-BiLSTM). To address the issue of imbalanced data within the dataset and improve the model's detection capabilities, the Synthetic Minority Over-sampling Technique-Edited Nearest Neighbors (SMOTE-ENN) algorithm is applied in the preprocessing phase. This algorithm is employed to generate synthetic instances for the minority class, simultaneously mitigating the impact of noise in the majority class. This approach aims to create a more equitable distribution of classes, thereby enhancing the model's ability to effectively identify patterns in both minority and majority classes. In the experimental phase, the detection performance of the method is verified using two data sets. Experimental results show that the accuracy rate on the CICIDS-2017 data set reaches 97.7%. On the natural gas pipeline dataset collected by Lan Turnipseed from Mississippi State University in the United States, the accuracy rate reaches 85.5%.
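The two halves of SMOTE-ENN described in this abstract can be sketched in simplified form: SMOTE interpolates between a minority sample and one of its nearest minority neighbours, and edited nearest neighbours (ENN) then removes samples whose label disagrees with most of their neighbours. The function names, the majority-vote rule, and the parameter defaults below are assumptions made for illustration; the paper's actual settings are not given in the abstract, and library implementations may apply a stricter ENN rule.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic minority samples by linear interpolation.

    Each synthetic point lies on the segment between a randomly chosen
    minority sample and one of its k nearest minority neighbours.
    Requires at least two minority samples.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted((p for p in minority if p is not base),
                            key=lambda p: sq_dist(base, p))[:k]
        other = rng.choice(neighbours)
        gap = rng.random()
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


def edited_nearest_neighbours(points, labels, k=3):
    """Keep only samples whose label matches most of their k nearest neighbours."""
    kept = []
    for i, (point, label) in enumerate(zip(points, labels)):
        nearest = sorted((j for j in range(len(points)) if j != i),
                         key=lambda j: sq_dist(point, points[j]))[:k]
        agreeing = sum(1 for j in nearest if labels[j] == label)
        if agreeing * 2 > len(nearest):
            kept.append(i)
    return [points[i] for i in kept], [labels[i] for i in kept]
```

In a preprocessing pipeline of this kind, the synthetic points from `smote` would be appended to the training set with the minority label before `edited_nearest_neighbours` is applied to the combined data.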
Maintaining a steady power supply requires accurate forecasting of solar irradiance, since clean energy resources do not provide steady power. Existing forecasting studies have examined the effects of a limited set of weather conditions, such as temperature and precipitation, on solar radiation using convolutional neural networks (CNN), but no comprehensive study has been conducted on concentrations of air pollutants along with weather conditions. This paper proposes a hybrid approach based on deep learning that expands the feature set by adding air pollution concentrations, then ranks these features to select a reduced subset and improve efficiency. To improve the accuracy of feature selection, a maximum-dependency and minimum-redundancy (mRMR) criterion is applied to the constructed feature space to identify and rank the features. The combination of air pollution data with weather conditions data has enabled the prediction of solar irradiance with higher accuracy. The proposed approach is evaluated in Istanbul over 12 months at 43791 discrete times, with the main purpose of analyzing air data, including particulate matter (PM10 and PM2.5), carbon monoxide (CO), nitric oxide (NOX), nitrogen dioxide (NO₂), ozone (O₃), and sulfur dioxide (SO₂), using a CNN, a long short-term memory network (LSTM), and mRMR feature extraction. Compared with the benchmark models, whose root mean square error (RMSE) results are 76.2, 60.3, 41.3, and 32.4, the proposed approach achieves a significant improvement with an RMSE of 5.536. The hybrid model presented here offers high prediction accuracy, a wider feature set, and a novel approach based on air pollutant concentrations combined with weather conditions for solar irradiance prediction.
The rapid development of unmanned aerial vehicle (UAV) swarms, a new type of aerial threat target, has put great pressure on air defense early warning systems. At present, most track correlation algorithms use only part of the target location, speed, and other information for correlation. In this paper, the artificial neural network method is used to establish an intelligent track correlation model and method according to the characteristics of swarm targets. Specifically, a route correlation method based on convolutional neural networks (CNN) and a long short-term memory (LSTM) neural network is designed. In this model, the CNN is used to extract the formation characteristics of the UAV swarm and the spatial position characteristics of a single UAV track in the formation, while the LSTM is used to extract the time characteristics of the UAV swarm. Experimental results show that, compared with traditional algorithms, the algorithm based on the CNN-LSTM neural network can make full use of multiple feature information of the target and has better robustness and accuracy for swarm targets.
To address the shortcomings of single-step decision making in existing deep reinforcement learning based unmanned aerial vehicle (UAV) real-time path planning, a real-time UAV path planning algorithm based on a long short-term memory (RPP-LSTM) network is proposed, which combines the memory characteristics of the recurrent neural network (RNN) with a deep reinforcement learning algorithm. LSTM networks are used in this algorithm as Q-value networks for the deep Q network (DQN) algorithm, which gives the decisions of the Q-value network some memory. Thanks to the LSTM network, the Q-value network can use previous environmental information and action information, which effectively avoids the problem of single-step decisions considering only the current environment. In addition, the algorithm introduces a hierarchical reward and punishment function for the specific problem of UAV real-time path planning, so that the UAV can perform path planning more reasonably. Simulation results show that, compared with the traditional feed-forward neural network (FNN) based UAV autonomous path planning algorithm, the proposed RPP-LSTM can adapt to more complex environments and has significantly improved robustness and accuracy when performing UAV real-time path planning.
The complexity of river-tide interaction poses a significant challenge in predicting discharge in tidal rivers. Long short-term memory (LSTM) networks excel in processing and predicting crucial events with extended intervals and time delays in time series data. Additionally, the sequence-to-sequence (Seq2Seq) model, known for handling temporal relationships, adapting to variable-length sequences, effectively capturing historical information, and accommodating various influencing factors, emerges as a robust and flexible tool in discharge forecasting. In this study, we introduce the application of LSTM-based Seq2Seq models for the first time in forecasting the discharge of a tidal reach of the Changjiang River (Yangtze River) Estuary. This study focuses on discharge forecasting using three key input characteristics: flow velocity, water level, and discharge, which means that a multiple-input, single-output structure is adopted. The experiment used discharge data for the whole year of 2020, of which the first 80% is used as the training set and the last 20% as the test set. This means that the data cover different tidal cycles, which helps to test the forecasting performance of different models under different tidal cycles and runoff conditions. The experimental results indicate that the proposed models demonstrate advantages in long-term, mid-term, and short-term discharge forecasting. In terms of relative standard deviation, the Seq2Seq models improved by 6%-60% over the harmonic analysis models and by 5%-20% over the improved back propagation neural network models in discharge prediction. In addition, the relative accuracy of the Seq2Seq model is 1% to 3% higher than that of the LSTM model. Analysis of the prediction errors shows that the Seq2Seq models are insensitive to the forecast lead time and can capture characteristic values such as the maximum flood tide flow and maximum ebb tide flow in the tidal cycle well. This indicates the significance of the Seq2Seq models.
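The data preparation described in this abstract (a chronological 80/20 split and a multiple-input, single-output structure) can be sketched as below. The window lengths, the column order, and the function names are assumptions for illustration only; the abstract does not state the input or output sequence lengths used in the study.

```python
def chronological_split(rows, train_fraction=0.8):
    """Split a time-ordered list into a leading training part and a trailing test part."""
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]


def make_windows(rows, n_in, n_out, target_index=2):
    """Build sequence-to-sequence samples from time-ordered feature rows.

    rows: one [flow_velocity, water_level, discharge] row per time step
          (assumed column order, so discharge is at target_index=2).
    Each input is n_in consecutive rows with all features; each target is
    the discharge over the following n_out steps.
    """
    inputs, targets = [], []
    for start in range(len(rows) - n_in - n_out + 1):
        inputs.append(rows[start:start + n_in])
        future = rows[start + n_in:start + n_in + n_out]
        targets.append([row[target_index] for row in future])
    return inputs, targets
```

Splitting before windowing keeps every test target strictly later in time than any training sample, which avoids leaking future discharge values into training.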
Amid the randomness and volatility of wind speed, this paper proposes an improved VMD-BP-CNN-LSTM model for short-term wind speed prediction to assist in power system planning and operation. First, the wind speed time series data was processed using Variational Mode Decomposition (VMD) to obtain multiple frequency components. Then, after differencing and normalization, each frequency component was fed into a combined prediction framework consisting of a BP neural network (BPNN), a Convolutional Neural Network (CNN), and a Long Short-Term Memory Network (LSTM). Thereafter, the predictive outputs for each component were integrated through a fully connected neural network for data fusion, resulting in the final prediction. The improvements are threefold: the VMD decomposition technique was introduced into a generalized CNN-LSTM prediction model; a BPNN model was used to predict the high-frequency components obtained from VMD; and a fully connected neural network was incorporated for data fusion of the individual component predictions. Experimental results demonstrated that the proposed improved VMD-BP-CNN-LSTM model outperformed other combined prediction models in terms of prediction accuracy, providing a solid foundation for optimizing the safe operation of wind farms.
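The differencing and normalization applied to each decomposed component before prediction can be sketched as follows, together with the inverse steps needed to map predictions back to wind speed units. First-order differencing and min-max scaling are assumptions here; the abstract does not say which differencing order or normalization scheme was used, and the function names are hypothetical.

```python
def difference(series):
    """First-order differences: series[t+1] - series[t]."""
    return [b - a for a, b in zip(series, series[1:])]


def undifference(first_value, diffs):
    """Rebuild a series from its first value and its first-order differences."""
    rebuilt = [first_value]
    for d in diffs:
        rebuilt.append(rebuilt[-1] + d)
    return rebuilt


def minmax_scale(values, lo, hi):
    """Scale values to [0, 1] using bounds fitted on the training data."""
    span = hi - lo
    if span == 0:
        return [0.0 for _ in values]
    return [(v - lo) / span for v in values]


def minmax_inverse(scaled, lo, hi):
    """Map scaled values back to the original range."""
    return [s * (hi - lo) + lo for s in scaled]
```

In use, `lo` and `hi` would be taken from the training portion of each component only, so that the same bounds can be reused to scale test inputs and to invert the model outputs.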
As a high-efficiency hydrogen-to-power device, the proton exchange membrane fuel cell (PEMFC) attracts much attention, especially for automotive applications. Real-time prediction of output voltage and area specific resistance (ASR) via an on-board model is critical for monitoring the health state of the automotive PEMFC stack. In this study, we use a transient PEMFC system model for dynamic process simulation of the PEMFC to generate the dataset, and a long short-term memory (LSTM) deep learning model is developed to predict the dynamic performance of the PEMFC. The results show that the developed LSTM deep learning model performs much better than other models. A sensitivity analysis on the input features is performed, and three insensitive features are removed, which slightly improves the prediction accuracy and significantly reduces the data volume. The neural structure, sequence duration, and sampling frequency are optimized. We find that the optimal sequence data duration for predicting ASR is 5 s or 20 s, and that for predicting output voltage is 40 s. The sampling frequency can be reduced from 10 Hz to 0.5 Hz and 0.25 Hz, which slightly affects the prediction accuracy but markedly reduces the data volume and computation amount.
Knowledge of pore-water pressure (PWP) variation is fundamental for slope stability. A precise prediction of PWP is difficult due to complex physical mechanisms and in situ natural variability. To explore the applicability and advantages of recurrent neural networks (RNNs) for PWP prediction, three variants of RNNs, i.e., standard RNN, long short-term memory (LSTM), and gated recurrent unit (GRU), are adopted and compared with a traditional static artificial neural network (ANN), i.e., multi-layer perceptron (MLP). Measurements of rainfall and PWP of representative piezometers from a fully instrumented natural slope in Hong Kong are used to establish the prediction models. The coefficient of determination (R²) and root mean square error (RMSE) are used for model evaluation. The influence of input time series length on the model performance is investigated. The results reveal that MLP can provide acceptable performance but is not robust. The uncertainty bounds of RMSE of the MLP model range from 0.24 kPa to 1.12 kPa for the two selected piezometers. The standard RNN can perform better, but its robustness is slightly affected when there are significant time lags between PWP changes and rainfall. The GRU and LSTM models can provide more precise and robust predictions than the standard RNN. The effects of the hidden layer structure and the dropout technique are investigated. The single-layer GRU is accurate enough for PWP prediction, whereas a double-layer GRU brings extra time cost with little accuracy improvement. The dropout technique is essential for preventing overfitting and improving accuracy.
The remaining useful life(RUL)of a system is generally predicted by utilising the data collected from the sensors that continuously monitor different indicators.Recently,different deep learning(DL)techniques have been...The remaining useful life(RUL)of a system is generally predicted by utilising the data collected from the sensors that continuously monitor different indicators.Recently,different deep learning(DL)techniques have been used for RUL prediction and achieved great success.Because the data is often time-sequential,recurrent neural network(RNN)has attracted significant interests due to its efficiency in dealing with such data.This paper systematically reviews RNN and its variants for RUL prediction,with a specific focus on understanding how different components(e.g.,types of optimisers and activation functions)or parameters(e.g.,sequence length,neuron quantities)affect their performance.After that,a case study using the well-studied NASA’s C-MAPSS dataset is presented to quantitatively evaluate the influence of various state-of-the-art RNN structures on the RUL prediction performance.The result suggests that the variant methods usually perform better than the original RNN,and among which,Bi-directional Long Short-Term Memory generally has the best performance in terms of stability,precision and accuracy.Certain model structures may fail to produce valid RUL prediction result due to the gradient vanishing or gradient exploring problem if the parameters are not chosen appropriately.It is concluded that parameter tuning is a crucial step to achieve optimal prediction performance.展开更多
Abstract: A correct and timely fault diagnosis is important for improving the safety and reliability of chemical processes. With the advancement of big data technology, data-driven fault diagnosis methods are being used extensively and still have considerable potential. In recent years, methods based on deep neural networks have made significant breakthroughs, and fault diagnosis methods for industrial processes based on deep learning have attracted considerable research attention. We therefore propose a fusion deep learning algorithm based on a fully convolutional neural network (FCN) to extract features and build models that correctly diagnose all types of faults. We use long short-term memory (LSTM) units to expand the proposed FCN so that the model can better extract the time-domain features of chemical process data. We also introduce an attention mechanism into the model to highlight the importance of individual features, which is significant for the fault diagnosis of chemical processes with many features. When applied to the benchmark Tennessee Eastman process, the proposed model exhibits impressive performance, demonstrating the effectiveness of the attention-based LSTM-FCN in chemical process fault diagnosis.
Abstract: There are two technical challenges in predicting slope deformation. The first is the random displacement, which cannot be decomposed and predicted by numerically resolving the observed accumulated displacement and time series of a landslide. The second is the dynamic evolution of a landslide, which cannot be feasibly simulated by traditional prediction models alone. In this paper, a dynamic displacement prediction model is introduced for composite landslides, based on a combination of empirical mode decomposition with soft screening stop criteria (SSSC-EMD) and a deep bidirectional long short-term memory (DBi-LSTM) neural network. In the proposed model, time series analysis and SSSC-EMD are used to decompose the observed accumulated displacements of a slope into three components, viz. trend displacement, periodic displacement, and random displacement. Then, by analyzing the evolution pattern of a landslide and the key factors that trigger it, appropriate influencing factors are selected for each displacement component, and the DBi-LSTM neural network is used to carry out multi-data-driven dynamic prediction for each component. The accumulated displacement prediction is obtained by summing the component predictions. To verify the accuracy and engineering practicability of the model, field observations from two known landslides in China, the Xintan landslide and the Bazimen landslide, were collected for comparison and evaluation. The case study verified that the proposed model can better characterize the "stepwise" deformation characteristics of a slope. Compared with the long short-term memory (LSTM) neural network, support vector machine (SVM), and autoregressive integrated moving average (ARIMA) model, the DBi-LSTM neural network has higher accuracy in predicting the periodic displacement of slope deformation, with the mean absolute percentage error reduced by 3.063%, 14.913%, and 13.960%, respectively, and the root mean square error reduced by 1.951 mm, 8.954 mm, and 7.790 mm, respectively. In conclusion, this model not only has high prediction accuracy but is also more stable, which can provide new insight for practical landslide prevention and control engineering.
Funding: The National Key R&D Program of China under contract No. 2016YFC1402103.
Abstract: To explore new operational forecasting methods for waves, a forecasting model for wave heights at three stations in the Bohai Sea has been developed. The model is based on a long short-term memory (LSTM) neural network with sea surface wind and wave heights as training samples. The prediction performance of the model is evaluated, and the error analysis shows that, when the same set of numerically predicted sea surface wind is used as input, the prediction error of the proposed LSTM model at Sta. N01 is 20%, 18%, and 23% lower than that of the conventional numerical wave models in terms of the total root mean square error (RMSE), scatter index (SI), and mean absolute error (MAE), respectively. In particular, for significant wave heights in the range of 3–5 m, the prediction accuracy of the LSTM model improves the most, with RMSE, SI, and MAE all decreasing by 24%. It is also evident that the number of hidden neurons, the number of buoys used, and the time length of the training samples all affect the prediction accuracy. However, the prediction does not necessarily improve as the number of hidden neurons or the number of buoys increases. The experiment trained on the data with the longest time length performs best overall compared with experiments trained on shorter time lengths. Overall, the long short-term memory neural network proved to be a very promising method for future development and applications in wave forecasting.
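The three error measures named in this abstract can be sketched in a few lines. This is only an illustration: the abstract does not define the scatter index, so the code assumes one common definition (RMSE divided by the mean observed value), and the function names are hypothetical rather than taken from the paper.

```python
import math


def rmse(pred, obs):
    """Root mean square error between predictions and observations."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))


def mae(pred, obs):
    """Mean absolute error between predictions and observations."""
    return sum(abs(p - o) for p, o in zip(pred, obs)) / len(obs)


def scatter_index(pred, obs):
    """ASSUMED definition: RMSE normalised by the mean observed value."""
    return rmse(pred, obs) / (sum(obs) / len(obs))


def relative_reduction(baseline_error, model_error):
    """Fractional error reduction of a model relative to a baseline."""
    return (baseline_error - model_error) / baseline_error
```

With these helpers, a statement such as "RMSE is 20% lower than the baseline" corresponds to `relative_reduction(baseline_rmse, lstm_rmse)` being about 0.20.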
Funding: Supported in part by the National Natural Science Foundation of China under Grant 62203468; in part by the Technological Research and Development Program of China State Railway Group Co., Ltd. under Grant Q2023X011; in part by the Young Elite Scientist Sponsorship Program by China Association for Science and Technology (CAST) under Grant 2022QNRC001; in part by the Youth Talent Program Supported by China Railway Society; and in part by the Research Program of China Academy of Railway Sciences Corporation Limited under Grant 2023YJ112.
Abstract: Purpose: To optimize train operations, dispatchers currently rely on experience for quick adjustments when delays occur. However, delay predictions often involve imprecise shifts based on known delay times. Real-time and accurate train delay predictions, facilitated by data-driven neural network models, can significantly reduce dispatcher stress and improve adjustment plans. Leveraging current train operation data, these models enable swift and precise predictions, addressing the challenges posed by train delays in high-speed rail networks during unforeseen events. Design/methodology/approach: This paper proposes CBLA-net, a neural network architecture for predicting late arrival times. It combines CNN, Bi-LSTM, and attention mechanisms to extract features, handle time series data, and enhance information utilization. Trained on operational data from the Beijing-Tianjin line, it predicts the late arrival time of a target train at the next station using multidimensional input data from the target and preceding trains. Findings: This study evaluates the model's predictive performance using two data approaches: one considering the full data and another focusing only on late arrivals. The results show precise and rapid predictions. Training with the full data achieves an MAE of approximately 0.54 minutes and an RMSE of 0.65 minutes, surpassing the model trained solely on delay data (MAE of about 1.02 min, RMSE of about 1.52 min). Despite the superior overall performance with the full data, the model excels at predicting delays exceeding 15 minutes when trained exclusively on late arrivals. For enhanced adaptability to real-world train operations, training with the full data is recommended. Originality/value: This paper introduces a novel neural network model, CBLA-net, for predicting train delay times. It compares and analyzes the model's performance using both the full-data and delay-data formats. Additionally, the evaluation of the network's predictive capabilities considers different scenarios, providing a comprehensive demonstration of the model's predictive performance.
Funding: Supported by the National Major Science and Technology Special Project (No. 2016ZX05026-002).
Abstract: In this paper, the recurrent neural network structure of a bidirectional long short-term memory (Bi-LSTM) network, with special memory cells that store information, is used to characterize the deep features of the variation pattern between logging and seismic data. A mapping relationship model between high-frequency logging data and low-frequency seismic data is established via nonlinear mapping. The seismic waveform is approximated arbitrarily closely using the logging curve in the low-frequency band to obtain a nonlinear mapping model at this scale, which then approaches the logging curve in the high-frequency band step by step. Finally, a seismic-inversion method of nonlinear-mapping multilevel well–seismic matching based on the Bi-LSTM network is developed. The characteristic of this method is that, by applying the multilevel well–seismic matching process, the seismic data are matched stepwise to the scale range that is consistent with the logging curve. Further, the matching operator at each level can be obtained stably, which effectively overcomes problems that occur in the well–seismic matching process, such as the inconsistency in the scale of the two types of data, the accuracy of extracting the seismic wavelet of the well-side seismic traces, and the multiplicity of solutions. Model tests and practical application demonstrate that this method improves the vertical resolution of the inversion results while maintaining the boundary and lateral characteristics of the sand body, which improves the accuracy of thin-layer sand body prediction and the practical application effect.
Abstract: Hand gestures are a natural way for human-robot interaction. Vision-based dynamic hand gesture recognition has become a hot research topic due to its various applications. This paper presents a novel deep learning network for hand gesture recognition. The network integrates several well-proven modules to learn both short-term and long-term features from video inputs while avoiding intensive computation. To learn short-term features, each video input is segmented into a fixed number of frame groups. A frame is randomly selected from each group and represented as an RGB image as well as an optical flow snapshot. These two entities are fused and fed into a convolutional neural network (ConvNet) for feature extraction. The ConvNets for all groups share parameters. To learn long-term features, the outputs from all ConvNets are fed into a long short-term memory (LSTM) network, which predicts the final classification result. The new model has been tested on two popular hand gesture datasets, namely the Jester dataset and the Nvidia dataset. Compared with other models, our model produced very competitive results. The robustness of the new model has also been demonstrated on an augmented dataset with enhanced diversity of hand gestures.
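The frame-sampling step described in this abstract (split a video into a fixed number of groups, then pick one random frame per group) can be sketched as below. The abstract does not say how group boundaries are chosen, so contiguous groups of near-equal size are an assumption here, and the function name is hypothetical.

```python
import random


def sample_one_frame_per_group(num_frames, num_groups, rng=None):
    """Return one randomly chosen frame index from each contiguous group.

    ASSUMPTION: groups are contiguous and as equal in size as possible.
    """
    if rng is None:
        rng = random.Random()
    if num_groups <= 0 or num_frames < num_groups:
        raise ValueError("need at least one frame per group")
    indices = []
    for g in range(num_groups):
        start = g * num_frames // num_groups        # first frame of group g
        end = (g + 1) * num_frames // num_groups    # one past the last frame
        indices.append(rng.randrange(start, end))
    return indices
```

Because every video yields exactly `num_groups` indices regardless of its length, the downstream LSTM always sees a sequence of the same length.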
Funding: The National Natural Science Foundation of China (Nos. 51875451 and 51834006).
Abstract: Existing methods for predicting the remaining life of mechanical equipment give insufficient consideration to the correlation between components. To address this, a remaining life prediction method that combines the self-attention mechanism with the long short-term memory neural network (LSTM-NN), called Self-Attention-LSTM, is proposed. First, an auto-encoder is used to obtain component-level state information. Second, the state information of each component is fed into the self-attention mechanism to learn the correlation between components. Then, the multi-component correlation matrix is added to the LSTM input gate, and the LSTM-NN is used for life prediction. Finally, experiments were carried out on the commercial modular aero-propulsion system simulation (C-MAPSS) data set and compared with existing methods. The results show that the proposed method achieves better prediction accuracy, which verifies its feasibility.
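As a rough illustration of how self-attention can turn per-component state vectors into a correlation-like matrix, the sketch below computes scaled dot-product attention weights. It is a simplification under stated assumptions: queries and keys are the raw state vectors (no learned projections), and nothing here reproduces the paper's actual formulation or how the matrix enters the LSTM input gate.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention_weights(states):
    """Scaled dot-product attention weights between component states.

    states: list of equal-length state vectors, one per component.
    ASSUMPTION: identity query/key projections, for illustration only.
    Row i of the result says how strongly component i attends to each
    component.
    """
    d = len(states[0])
    scores = [
        [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in states]
        for q in states
    ]
    return [softmax(row) for row in scores]
```

Each row sums to 1, so the result can be read as a normalised component-to-component weighting.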
Funding: Supported by the BK21 FOUR project funded by the Ministry of Education, Korea (4199990113966).
Abstract: Lithium-ion batteries are commonly used in electric vehicles, mobile phones, and laptops. These batteries offer several advantages, such as environmental friendliness, high energy density, and long life. However, overcharging and overdischarging may occur if the batteries are not monitored continuously. Overcharging causes fire and explosion casualties, and overdischarging reduces the battery capacity and life. In addition, the internal resistance of such batteries varies with their external temperature, electrolyte, cathode material, and other factors; the capacity of the batteries decreases with temperature. In this study, we develop a method for estimating the state of charge (SOC) using the neural network model best suited to the external temperature of such batteries, based on their characteristics. During our simulation, we acquired data at temperatures of 25°C, 30°C, 35°C, and 40°C. Based on the temperature parameters, the voltage, current, and time parameters were obtained, and six cycles of the parameters at each temperature were used for the experiment. Experimental data to verify the proposed method were obtained through a discharge experiment conducted using a vehicle driving simulator. The experimental data were provided as inputs to three types of neural network models: multilayer neural network (MNN), long short-term memory (LSTM), and gated recurrent unit (GRU). The neural network models were trained and optimized for the specific temperatures measured during the experiment, and the SOC was estimated by selecting the most suitable model for each temperature. The experimental results revealed that the mean absolute errors of the MNN, LSTM, and GRU using the proposed method were 2.17%, 2.19%, and 2.15%, respectively, which are better than those of the conventional method (4.47%, 4.60%, and 4.40%). Finally, SOC estimation based on the GRU using the proposed method had a mean absolute error of 2.15%, which was the most accurate.
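The idea of keeping one trained model per temperature and picking the one that fits the current measurement can be sketched as a simple lookup. The abstract does not state the selection rule, so choosing the model trained at the nearest temperature is an assumption, and `select_model` and the stand-in model names are hypothetical.

```python
def select_model(models_by_temp, measured_temp_c):
    """Pick the model trained at the temperature closest to the measurement.

    models_by_temp: dict mapping training temperature (deg C) to a model.
    ASSUMPTION: nearest-temperature selection; the source only says the
    most suitable model is selected for each temperature.
    """
    nearest = min(models_by_temp, key=lambda t: abs(t - measured_temp_c))
    return nearest, models_by_temp[nearest]


# Stand-in objects; in practice these would be trained MNN/LSTM/GRU models.
models = {25: "model_25C", 30: "model_30C", 35: "model_35C", 40: "model_40C"}
chosen_temp, chosen_model = select_model(models, 31.0)
```

The four keys mirror the temperatures at which data were acquired in the study (25°C, 30°C, 35°C, and 40°C).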
Funding: The Major Projects of the National Social Science Fund in China (21&ZD127).
Abstract: A precise and timely forecast of short-term rail transit passenger flow provides data support for traffic management and operation, helping rail operators allocate resources efficiently and relieve pressure on passenger safety and operation in time. First, the passenger flow sequences in the study are decomposed using VMD for noise reduction. Objective environment features are then added to the characteristic factors that affect the passenger flow. The target station serves as an additional spatial feature and is mined concurrently using the KNN algorithm. Reference experiments with BP, CNN, and LSTM show that the hybrid model VMD-CLSMT has higher prediction accuracy. The second-order prediction effects of all models are superior to their first-order effects, showing that the residual network can significantly raise model prediction accuracy. The results also confirm the efficacy of the supplementary and objective environmental features.
Funding: Funded by the Natural Science Foundation of Fujian Province, China (Grant No. 2022J05291) and Xiamen Scientific Research Funding for Overseas Chinese Scholars.
Abstract: Financial time series prediction, whether for classification or regression, has been a heated research topic over the last decade. While traditional machine learning algorithms have produced mediocre results, deep learning has contributed substantially to raising prediction performance. An up-to-date review of advanced machine learning techniques for financial time series prediction is still lacking, making it challenging for finance domain experts and relevant practitioners to determine which model potentially performs better, what techniques and components are involved, and how the model can be designed and implemented. This review article provides an overview of techniques, components, and frameworks for financial time series prediction, with an emphasis on state-of-the-art deep learning models in the literature from 2015 to 2023, including standalone models such as convolutional neural networks (CNN), which are capable of extracting spatial dependencies within data, and long short-term memory (LSTM), which is designed for handling temporal dependencies; and hybrid models integrating CNN, LSTM, attention mechanism (AM), and other techniques. For illustration and comparison purposes, models proposed in recent studies are mapped to the relevant elements of a generalized framework comprising input, output, feature extraction, prediction, and related processes. Among the state-of-the-art models, hybrid models such as CNN-LSTM and CNN-LSTM-AM have in general been reported to be superior in performance to standalone models such as the CNN-only model. Some remaining challenges are discussed, including non-friendliness for finance domain experts, delayed prediction, neglect of domain knowledge, lack of standards, and the inability to make real-time and high-frequency predictions. The principal contributions of this paper are to provide a one-stop guide for both academia and industry to review, compare, and summarize technologies and recent advances in this area, to facilitate smooth and informed implementation, and to highlight future research directions.
Abstract: Credit card fraud is a major issue for financial organizations and individuals. As fraudulent actions become more complex, the demand for better fraud detection systems is rising. Deep learning approaches have shown promise in several fields, including credit card fraud detection. However, the efficacy of these models depends heavily on the careful selection of appropriate hyperparameters. This paper introduces models that integrate deep learning with hyperparameter tuning techniques to learn the patterns and relationships within credit card transaction data, thereby improving fraud detection. Three deep learning models, AutoEncoder (AE), Convolution Neural Network (CNN), and Long Short-Term Memory (LSTM), are proposed to investigate how hyperparameter adjustment affects the efficacy of deep learning models used to identify credit card fraud. Experiments conducted on a European credit card fraud dataset using different hyperparameters and the three deep learning models demonstrate that the proposed models achieve a tradeoff between detection rate and precision, making them effective in accurately predicting credit card fraud. The results demonstrate that LSTM significantly outperformed AE and CNN in terms of accuracy (99.2%), detection rate (93.3%), and area under the curve (96.3%). The proposed models have surpassed those of existing studies and are expected to make a significant contribution to the field of credit card fraud detection.
Funding: Support from the Liaoning Province Nature Fund Project (No. 2022-MS-291) and the Scientific Research Project of Liaoning Province Education Department (LJKMZ20220781, LJKMZ20220783, LJKQZ20222457, JYTMS20231488).
Abstract: With the rapid development of industrial Internet technology, advanced industrial control systems (ICS) have improved industrial production efficiency. However, cyber-attacks targeting industrial control systems are increasingly common. To ensure the security of industrial networks, intrusion detection systems have been widely used in industrial control systems, and deep neural networks have long been an effective method for identifying cyber-attacks. Current intrusion detection methods still suffer from low accuracy and a high false alarm rate, so it is important to build a more efficient intrusion detection model. This paper proposes a hybrid deep learning intrusion detection method based on convolutional neural networks and bidirectional long short-term memory neural networks (CNN-BiLSTM). To address the issue of imbalanced data within the dataset and improve the model's detection capabilities, the Synthetic Minority Over-sampling Technique-Edited Nearest Neighbors (SMOTE-ENN) algorithm is applied in the preprocessing phase. This algorithm generates synthetic instances for the minority class while mitigating the impact of noise in the majority class. The approach aims to create a more equitable distribution of classes, thereby enhancing the model's ability to identify patterns in both minority and majority classes. In the experimental phase, the detection performance of the method is verified using two data sets. Experimental results show that the accuracy on the CICIDS-2017 data set reaches 97.7%. On the natural gas pipeline dataset collected by Ian Turnipseed from Mississippi State University in the United States, the accuracy reaches 85.5%.
Abstract: Maintaining a steady power supply requires accurate forecasting of solar irradiance, since clean energy resources do not provide steady power. Existing forecasting studies have examined the limited effects of weather conditions such as temperature and precipitation on solar radiation using a convolutional neural network (CNN), but no comprehensive study has been conducted on concentrations of air pollutants together with weather conditions. This paper proposes a hybrid approach based on deep learning, expanding the feature set by adding new air pollution concentrations, and ranking these features to select and reduce their number to improve efficiency. To improve the accuracy of feature selection, a maximum-dependency and minimum-redundancy (mRMR) criterion is applied to the constructed feature space to identify and rank the features. Combining air pollution data with weather condition data enables the prediction of solar irradiance with higher accuracy. The proposed approach is evaluated in Istanbul over 12 months for 43791 discrete times, with the main purpose of analyzing air data, including particulate matter (PM10 and PM2.5), carbon monoxide (CO), nitric oxide (NOX), nitrogen dioxide (NO₂), ozone (O₃), and sulfur dioxide (SO₂), using a CNN, a long short-term memory (LSTM) network, and mRMR feature extraction. Compared with the benchmark models, whose root mean square error (RMSE) results are 76.2, 60.3, 41.3, and 32.4, there is a significant improvement, with an RMSE of 5.536. The hybrid model presented here offers high prediction accuracy, a wider feature set, and a novel approach based on air pollutant concentrations combined with weather conditions for solar irradiance prediction.
Abstract: The rapid development of unmanned aerial vehicle (UAV) swarms, a new type of aerial threat target, has put great pressure on air defense early warning systems. At present, most track correlation algorithms use only part of the target location, speed, and other information for correlation. In this paper, the artificial neural network method is used to establish an intelligent track correlation model and method according to the characteristics of swarm targets. Specifically, a track correlation method based on convolutional neural networks (CNN) and a long short-term memory (LSTM) neural network is designed. In this model, the CNN is used to extract the formation characteristics of the UAV swarm and the spatial position characteristics of a single UAV track in the formation, while the LSTM is used to extract the time characteristics of the UAV swarm. Experimental results show that, compared with traditional algorithms, the algorithm based on the CNN-LSTM neural network can make full use of multiple kinds of feature information of the target and has better robustness and accuracy for swarm targets.
Funding: Supported by the Natural Science Basic Research Program of Shaanxi (2022JQ-593).
Abstract: To address the shortcomings of single-step decision making in existing deep reinforcement learning based unmanned aerial vehicle (UAV) real-time path planning, a real-time UAV path planning algorithm based on a long short-term memory network (RPP-LSTM) is proposed, which combines the memory characteristics of the recurrent neural network (RNN) with a deep reinforcement learning algorithm. LSTM networks are used in this algorithm as the Q-value networks of the deep Q network (DQN) algorithm, which gives the decisions of the Q-value network some memory. Thanks to the LSTM network, the Q-value network can use previous environmental and action information, which effectively avoids the problem of single-step decisions that consider only the current environment. In addition, the algorithm proposes a hierarchical reward and punishment function for the specific problem of UAV real-time path planning, so that the UAV can plan paths more reasonably. Simulation shows that, compared with the traditional feed-forward neural network (FNN) based UAV autonomous path planning algorithm, the proposed RPP-LSTM can adapt to more complex environments and has significantly improved robustness and accuracy in UAV real-time path planning.
Funding: The National Natural Science Foundation of China under contract Nos. 42266006 and 41806114; the Jiangxi Provincial Natural Science Foundation under contract Nos. 20232BAB204089 and 20202ACBL214019.
Abstract: The complexity of river-tide interaction poses a significant challenge in predicting discharge in tidal rivers. Long short-term memory (LSTM) networks excel at processing and predicting crucial events with extended intervals and time delays in time series data. Additionally, the sequence-to-sequence (Seq2Seq) model, known for handling temporal relationships, adapting to variable-length sequences, effectively capturing historical information, and accommodating various influencing factors, emerges as a robust and flexible tool in discharge forecasting. In this study, we apply LSTM-based Seq2Seq models for the first time to forecasting the discharge of a tidal reach of the Changjiang River (Yangtze River) Estuary. The study uses three key input characteristics, flow velocity, water level, and discharge, so a multiple-input, single-output structure is adopted. The experiment used the discharge data for the whole of 2020, of which the first 80% is used as the training set and the last 20% as the test set. The data therefore cover different tidal cycles, which helps to test the forecasting performance of the different models under different tidal cycles and different runoff. The experimental results indicate that the proposed models have advantages in long-term, mid-term, and short-term discharge forecasting. In discharge prediction, the Seq2Seq models improved the relative standard deviation by 6%-60% and 5%-20% compared with the harmonic analysis models and the improved back propagation neural network models, respectively. In addition, the relative accuracy of the Seq2Seq model is 1% to 3% higher than that of the LSTM model. Analysis of the prediction errors shows that the Seq2Seq models are insensitive to the forecast lead time and capture characteristic values such as the maximum flood tide flow and maximum ebb tide flow in the tidal cycle well. This indicates the significance of the Seq2Seq models.
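The data preparation implied by this abstract (a chronological 80/20 split, followed by turning the series into input/target sequence pairs for a Seq2Seq model) might look like the sketch below. It is simplified to a single variable, whereas the study uses flow velocity, water level, and discharge as inputs; the window lengths and function names are assumptions for illustration.

```python
def chronological_split(series, train_fraction=0.8):
    """Split a time series in order: the first part trains, the rest tests."""
    cut = int(len(series) * train_fraction)
    return series[:cut], series[cut:]


def make_windows(series, input_len, output_len):
    """Build (input sequence, target sequence) pairs with a sliding window.

    ASSUMPTION: stride of one step and fixed window lengths.
    """
    pairs = []
    for start in range(len(series) - input_len - output_len + 1):
        mid = start + input_len
        pairs.append((series[start:mid], series[mid:mid + output_len]))
    return pairs


data = list(range(10))                 # stand-in for a discharge record
train, test = chronological_split(data)
train_pairs = make_windows(train, input_len=3, output_len=2)
```

Splitting before windowing keeps every training window strictly earlier than the test period, so no test values leak into training targets.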
Abstract: Given the randomness and volatility of wind speed, this paper proposes an improved VMD-BP-CNN-LSTM model for short-term wind speed prediction to assist in power system planning and operation. First, the wind speed time series data were processed using Variational Mode Decomposition (VMD) to obtain multiple frequency components. Then, after differencing and normalization, each frequency component was fed into a combined prediction framework consisting of a BP neural network (BPNN), a Convolutional Neural Network (CNN), and a Long Short-Term Memory (LSTM) network. Thereafter, the predictive outputs for the components were integrated through a fully connected neural architecture for data fusion, resulting in the final prediction. The improvements are that the VMD decomposition technique was introduced into a generalized CNN-LSTM prediction model, a BPNN model was used to predict the high-frequency components obtained from VMD, and a fully connected neural network was incorporated for data fusion of the individual component predictions. Experimental results demonstrated that the proposed improved VMD-BP-CNN-LSTM model outperformed other combined prediction models in terms of prediction accuracy, providing a solid foundation for optimizing the safe operation of wind farms.
Funding: This research is supported by the National Natural Science Foundation of China (No. 52176196), the National Key Research and Development Program of China (No. 2022YFE0103100), the China Postdoctoral Science Foundation (No. 2021TQ0235), and the Hong Kong Scholars Program (No. XJ2021033).
Abstract: As a high-efficiency hydrogen-to-power device, the proton exchange membrane fuel cell (PEMFC) attracts much attention, especially for automotive applications. Real-time prediction of output voltage and area specific resistance (ASR) via an on-board model is critical for monitoring the health state of the automotive PEMFC stack. In this study, we use a transient PEMFC system model for dynamic process simulation of the PEMFC to generate the dataset, and a long short-term memory (LSTM) deep learning model is developed to predict the dynamic performance of the PEMFC. The results show that the developed LSTM deep learning model performs much better than other models. A sensitivity analysis of the input features is performed, and three insensitive features are removed, which slightly improves the prediction accuracy and significantly reduces the data volume. The neural structure, sequence duration, and sampling frequency are optimized. We find that the optimal sequence data duration for predicting ASR is 5 s or 20 s, and that for predicting output voltage is 40 s. The sampling frequency can be reduced from 10 Hz to 0.5 Hz and 0.25 Hz, which slightly affects the prediction accuracy but markedly reduces the data volume and the amount of computation.
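To make the data-volume effect of the sampling frequency concrete, the sketch below counts samples per input sequence and reduces a 10 Hz record by keeping every n-th sample. Plain striding without any filtering is an assumption made for illustration; the abstract does not say how the lower-rate data were produced, and the function names are hypothetical.

```python
def samples_per_sequence(duration_s, sampling_hz):
    """Number of samples in one input sequence of the given duration."""
    return int(round(duration_s * sampling_hz))


def downsample(samples, source_hz, target_hz):
    """Keep every n-th sample, where n = source_hz / target_hz.

    ASSUMPTION: simple decimation by striding, with no anti-alias filter.
    """
    step = source_hz / target_hz
    if step < 1 or step != int(step):
        raise ValueError("target rate must evenly divide the source rate")
    return samples[::int(step)]


record = list(range(400))               # stand-in for 40 s sampled at 10 Hz
at_half_hz = downsample(record, 10, 0.5)
at_quarter_hz = downsample(record, 10, 0.25)
```

For the 40 s sequence reported as optimal for output voltage, this takes the input from 400 samples at 10 Hz down to 20 at 0.5 Hz or 10 at 0.25 Hz.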
Funding: Supported by the Natural Science Foundation of China (Grant Nos. 51979158, 51639008, 51679135, and 51422905) and the Program of Shanghai Academic Research Leader by Science and Technology Commission of Shanghai Municipality (Project No. 19XD1421900).
Abstract: Knowledge of pore-water pressure (PWP) variation is fundamental for slope stability. Precise prediction of PWP is difficult due to complex physical mechanisms and in situ natural variability. To explore the applicability and advantages of recurrent neural networks (RNNs) for PWP prediction, three variants of RNNs, i.e., the standard RNN, long short-term memory (LSTM), and gated recurrent unit (GRU), are adopted and compared with a traditional static artificial neural network (ANN), i.e., the multi-layer perceptron (MLP). Measurements of rainfall and PWP from representative piezometers on a fully instrumented natural slope in Hong Kong are used to establish the prediction models. The coefficient of determination (R²) and root mean square error (RMSE) are used for model evaluation. The influence of the input time series length on model performance is investigated. The results reveal that the MLP can provide acceptable performance but is not robust. The uncertainty bounds of the RMSE of the MLP model range from 0.24 kPa to 1.12 kPa for the two selected piezometers. The standard RNN performs better, but its robustness is slightly affected when there are significant time lags between PWP changes and rainfall. The GRU and LSTM models provide more precise and robust predictions than the standard RNN. The effects of the hidden layer structure and the dropout technique are investigated. The single-layer GRU is accurate enough for PWP prediction, whereas a double-layer GRU brings extra time cost with little accuracy improvement. The dropout technique is essential for preventing overfitting and improving accuracy.
Funding: Supported by U.K. EPSRC Platform Grant (Grant No. EP/P027121/1).
Abstract: The remaining useful life (RUL) of a system is generally predicted by utilising the data collected from sensors that continuously monitor different indicators. Recently, different deep learning (DL) techniques have been used for RUL prediction and have achieved great success. Because the data are often time-sequential, the recurrent neural network (RNN) has attracted significant interest due to its efficiency in dealing with such data. This paper systematically reviews the RNN and its variants for RUL prediction, with a specific focus on understanding how different components (e.g., types of optimisers and activation functions) or parameters (e.g., sequence length, neuron quantities) affect their performance. After that, a case study using NASA's well-studied C-MAPSS dataset is presented to quantitatively evaluate the influence of various state-of-the-art RNN structures on RUL prediction performance. The results suggest that the variant methods usually perform better than the original RNN, and that, among them, the Bi-directional Long Short-Term Memory generally has the best performance in terms of stability, precision, and accuracy. Certain model structures may fail to produce valid RUL prediction results due to the vanishing or exploding gradient problem if the parameters are not chosen appropriately. It is concluded that parameter tuning is a crucial step in achieving optimal prediction performance.