There are two technical challenges in predicting slope deformation. The first is the random displacement, which cannot be decomposed and predicted by numerically resolving the observed accumulated displacement and time series of a landslide. The second is the dynamic evolution of a landslide, which cannot be feasibly simulated by traditional prediction models alone. In this paper, a dynamic displacement prediction model is introduced for composite landslides based on a combination of empirical mode decomposition with soft screening stop criteria (SSSC-EMD) and a deep bidirectional long short-term memory (DBi-LSTM) neural network. In the proposed model, time series analysis and SSSC-EMD are used to decompose the observed accumulated displacements of a slope into three components, viz. trend displacement, periodic displacement, and random displacement. Then, by analyzing the evolution pattern of a landslide and the key factors triggering it, appropriate influencing factors are selected for each displacement component, and the DBi-LSTM neural network is used to carry out multi-data-driven dynamic prediction for each displacement component. The accumulated displacement prediction is obtained by summing the predictions of the components. To verify the accuracy and engineering practicability of the model, field observations from two known landslides in China, the Xintan landslide and the Bazimen landslide, were collected for comparison and evaluation. The case study verified that the proposed model can better characterize the "stepwise" deformation characteristics of a slope. Compared with the long short-term memory (LSTM) neural network, support vector machine (SVM), and autoregressive integrated moving average (ARIMA) model, the DBi-LSTM neural network has higher accuracy in predicting the periodic displacement of slope deformation, with the mean absolute percentage error reduced by 3.063%, 14.913%, and 13.960%, respectively, and the root mean square error reduced by 1.951 mm, 8.954 mm, and 7.790 mm, respectively. In conclusion, this model not only has high prediction accuracy but is also more stable, and can provide new insight for practical landslide prevention and control engineering.
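The decompose-then-sum structure described in this abstract can be illustrated with a minimal sketch. It does not implement SSSC-EMD or DBi-LSTM: as a stand-in, it assumes a centered moving average for the trend and per-phase averaging for the periodic part. The names `moving_average`, `decompose`, `recombine`, and the `period` parameter are hypothetical, and the input values are made up for illustration.

```python
def moving_average(series, window):
    """Centered moving average; the window shrinks at the edges."""
    half = window // 2
    smoothed = []
    for i in range(len(series)):
        lo = max(0, i - half)
        hi = min(len(series), i + half + 1)
        smoothed.append(sum(series[lo:hi]) / (hi - lo))
    return smoothed


def decompose(displacement, period):
    """Split a series into trend, periodic, and random components.

    Stand-in for SSSC-EMD: trend = moving average, periodic = mean of the
    detrended series at each phase of the assumed period, random = remainder.
    """
    if period < 1 or len(displacement) < period:
        raise ValueError("series must cover at least one full period")
    trend = moving_average(displacement, period)
    detrended = [d - t for d, t in zip(displacement, trend)]
    phase_means = []
    for phase in range(period):
        values = detrended[phase::period]
        phase_means.append(sum(values) / len(values))
    periodic = [phase_means[i % period] for i in range(len(displacement))]
    random_part = [d - t - s for d, t, s in zip(displacement, trend, periodic)]
    return trend, periodic, random_part


def recombine(trend, periodic, random_part):
    """Accumulated displacement is the sum of the three components."""
    return [t + s + r for t, s, r in zip(trend, periodic, random_part)]


# Made-up "stepwise" series, not data from the paper.
observed = [0.0, 1.0, 4.0, 5.0, 6.0, 9.0, 10.0, 11.0, 14.0, 15.0, 16.0, 19.0]
trend, periodic, random_part = decompose(observed, period=3)
total = recombine(trend, periodic, random_part)
```

In the workflow the abstract describes, each component would be predicted by its own model before summation; here the components are simply added back to show that the decomposition is additive.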
In this paper, the recurrent neural network structure of a bidirectional long short-term memory (Bi-LSTM) network, with special memory cells that store information, is used to characterize the deep features of the variation pattern between logging and seismic data. A mapping relationship model between high-frequency logging data and low-frequency seismic data is established via nonlinear mapping. The seismic waveform is approximated as closely as possible by the logging curve in the low-frequency band to obtain a nonlinear mapping model at this scale, which then approaches the logging curve in the high-frequency band step by step. Finally, a seismic-inversion method of nonlinear mapping multilevel well–seismic matching based on the Bi-LSTM network is developed. The characteristic of this method is that, by applying the multilevel well–seismic matching process, the seismic data are matched stepwise to the scale range that is consistent with the logging curve. Further, the matching operator at each level can be obtained stably, which effectively overcomes the problems that occur in the well–seismic matching process, such as the inconsistency in the scale of the two types of data, the accuracy of extracting the seismic wavelet from the well-side seismic traces, and the multiplicity of solutions. Model tests and practical application demonstrate that this method improves the vertical resolution of the inversion results while preserving the boundary and lateral characteristics of the sand body, which improves the accuracy of thin-layer sand body prediction and the practical application effect.
To explore new operational forecasting methods for waves, a forecasting model for wave heights at three stations in the Bohai Sea has been developed. The model is based on a long short-term memory (LSTM) neural network with sea surface wind and wave heights as training samples. The prediction performance of the model is evaluated, and the error analysis shows that, when using the same set of numerically predicted sea surface wind as input, the prediction error produced by the proposed LSTM model at Sta. N01 is 20%, 18%, and 23% lower than that of the conventional numerical wave models in terms of the total root mean square error (RMSE), scatter index (SI), and mean absolute error (MAE), respectively. In particular, for significant wave heights in the range of 3–5 m, the prediction accuracy of the LSTM model improves the most, with RMSE, SI, and MAE all decreasing by 24%. It is also evident that the number of hidden neurons, the number of buoys used, and the time length of the training samples all affect the prediction accuracy. However, the prediction does not necessarily improve as the number of hidden neurons or the number of buoys increases. The experiment trained on data with the longest time length performs the best overall compared with experiments trained on shorter time lengths. Overall, the long short-term memory neural network proved to be a very promising method for future development and applications in wave forecasting.
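The three error measures named in this abstract can be computed as in the hedged sketch below. The abstract does not give the formulas, so the scatter index is assumed here to be the RMSE normalized by the mean observed value (one common convention; other definitions exist). The function names `error_metrics` and `relative_reduction` are hypothetical.

```python
import math


def error_metrics(observed, predicted):
    """Return RMSE, scatter index (SI), and MAE for paired series.

    Assumption: SI = RMSE / mean(observed); the source does not state
    which SI definition was used.
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("series must be non-empty and of equal length")
    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mae = sum(abs(e) for e in errors) / n
    mean_observed = sum(observed) / n
    return {"rmse": rmse, "si": rmse / mean_observed, "mae": mae}


def relative_reduction(baseline, candidate):
    """Percentage by which `candidate` error is lower than `baseline` error."""
    return 100.0 * (baseline - candidate) / baseline
```

A statement such as "20% lower in terms of RMSE" corresponds to `relative_reduction(rmse_numerical, rmse_lstm)` returning about 20, where both arguments are computed on the same observations.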
A correct and timely fault diagnosis is important for improving the safety and reliability of chemical processes. With the advancement of big data technology, data-driven fault diagnosis methods are being used extensively and still have considerable potential. In recent years, methods based on deep neural networks have made significant breakthroughs, and fault diagnosis methods for industrial processes based on deep learning have attracted considerable research attention. Therefore, we propose a fusion deep-learning algorithm based on a fully convolutional neural network (FCN) to extract features and build models to correctly diagnose all types of faults. We use long short-term memory (LSTM) units to expand the proposed FCN so that the model can better extract the time-domain features of chemical process data. We also introduce the attention mechanism into the model to highlight the importance of features, which is significant for the fault diagnosis of chemical processes with many features. When applied to the benchmark Tennessee Eastman process, the proposed model exhibits impressive performance, demonstrating the effectiveness of the attention-based LSTM-FCN in chemical process fault diagnosis.
Hand gestures are a natural means of human-robot interaction. Vision-based dynamic hand gesture recognition has become a hot research topic due to its various applications. This paper presents a novel deep learning network for hand gesture recognition. The network integrates several well-proven modules to learn both short-term and long-term features from video inputs while avoiding intensive computation. To learn short-term features, each video input is segmented into a fixed number of frame groups. A frame is randomly selected from each group and represented as an RGB image as well as an optical flow snapshot. These two entities are fused and fed into a convolutional neural network (ConvNet) for feature extraction. The ConvNets for all groups share parameters. To learn long-term features, the outputs from all ConvNets are fed into a long short-term memory (LSTM) network, by which a final classification result is predicted. The new model has been tested on two popular hand gesture datasets, namely the Jester dataset and the Nvidia dataset. Compared with other models, our model produced very competitive results. The robustness of the new model has also been demonstrated on an augmented dataset with enhanced diversity of hand gestures.
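The frame-selection step in this abstract (split a video into a fixed number of groups, then pick one random frame per group) can be sketched as follows. This is only an illustration under the assumption that groups are contiguous and as equal in size as possible; the function name `sample_frame_indices` and its parameters are hypothetical and not taken from the paper.

```python
import random


def sample_frame_indices(num_frames, num_groups, rng=None):
    """Pick one random frame index from each of `num_groups` contiguous groups.

    Assumption: groups are contiguous, near-equal slices of the video.
    """
    if num_groups < 1 or num_frames < num_groups:
        raise ValueError("need at least one frame per group")
    rng = rng or random.Random()
    # Integer boundaries; every group holds at least one frame.
    bounds = [i * num_frames // num_groups for i in range(num_groups + 1)]
    return [rng.randrange(bounds[i], bounds[i + 1]) for i in range(num_groups)]


# Example: a 30-frame clip reduced to 8 sampled frames.
indices = sample_frame_indices(30, 8, random.Random(0))
```

Whatever the clip length, the downstream ConvNets and LSTM then see a fixed-length sequence, which is one way to keep computation bounded as the abstract intends.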
This research aims to enhance Clinical Decision Support Systems (CDSS) within Wireless Body Area Networks (WBANs) by leveraging advanced machine learning techniques. Specifically, we target the challenges of accurate diagnosis in medical imaging and sequential data analysis using Recurrent Neural Networks (RNNs) with Long Short-Term Memory (LSTM) layers and echo state cells. These models are tailored to improve diagnostic precision, particularly for conditions such as rotator cuff tears in osteoporosis patients and gastrointestinal diseases. Traditional diagnostic methods and existing CDSS frameworks often fall short in managing complex, sequential medical data: they struggle with long-term dependencies and data imbalances, resulting in suboptimal accuracy and delayed decisions. Our goal is to develop Artificial Intelligence (AI) models that address these shortcomings and offer robust, real-time diagnostic support. We propose a hybrid RNN model that integrates SimpleRNN, LSTM layers, and echo state cells to manage long-term dependencies effectively. Additionally, we introduce CG-Net, a novel Convolutional Neural Network (CNN) framework for gastrointestinal disease classification, which outperforms traditional CNN models. We further enhance model performance through data augmentation and transfer learning, improving generalization and robustness against data scarcity and imbalance. Comprehensive validation, including 5-fold cross-validation and metrics such as accuracy, precision, recall, F1-score, and Area Under the Curve (AUC), confirms the models' reliability. Moreover, SHapley Additive exPlanations (SHAP) and Local Interpretable Model-agnostic Explanations (LIME) are employed to improve model interpretability. Our findings show that the proposed models significantly enhance diagnostic accuracy and efficiency, offering substantial advancements in WBANs and CDSS.
Audiovisual speech recognition is an emerging research topic. Lipreading is the recognition of what someone is saying using visual information, primarily lip movements. In this study, we created a custom dataset for Indian English linguistics and categorized it into three main categories: (1) audio recognition, (2) visual feature extraction, and (3) combined audio and visual recognition. Audio features were extracted using the mel-frequency cepstral coefficient, and classification was performed using a one-dimensional convolutional neural network. Visual features were extracted using Dlib, and visual speech was then classified using a long short-term memory type of recurrent neural network. Finally, integration was performed using a deep convolutional network. The audio speech of Indian English was successfully recognized with accuracies of 93.67% and 91.53%, respectively, using testing data from 200 epochs. The training accuracy for visual speech recognition using the Indian English dataset was 77.48% and the test accuracy was 76.19% using 60 epochs. After integration, the accuracies of audiovisual speech recognition using the Indian English dataset for training and testing were 94.67% and 91.75%, respectively.
Knowledge of pore-water pressure (PWP) variation is fundamental for slope stability. A precise prediction of PWP is difficult due to complex physical mechanisms and in situ natural variability. To explore the applicability and advantages of recurrent neural networks (RNNs) for PWP prediction, three variants of RNNs, i.e., the standard RNN, long short-term memory (LSTM), and gated recurrent unit (GRU), are adopted and compared with a traditional static artificial neural network (ANN), i.e., the multi-layer perceptron (MLP). Measurements of rainfall and PWP from representative piezometers on a fully instrumented natural slope in Hong Kong are used to establish the prediction models. The coefficient of determination (R^2) and root mean square error (RMSE) are used for model evaluation. The influence of the input time series length on model performance is investigated. The results reveal that the MLP can provide acceptable performance but is not robust. The uncertainty bounds of the RMSE of the MLP model range from 0.24 kPa to 1.12 kPa for the two selected piezometers. The standard RNN can perform better, but its robustness is slightly affected when there are significant time lags between PWP changes and rainfall. The GRU and LSTM models can provide more precise and robust predictions than the standard RNN. The effects of the hidden layer structure and the dropout technique are also investigated. The single-layer GRU is accurate enough for PWP prediction, whereas a double-layer GRU brings extra time cost with little accuracy improvement. The dropout technique is essential for preventing overfitting and improving accuracy.
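The "input time series length" studied in this abstract is the number of past rainfall steps fed to the model for each PWP prediction. A minimal sketch of how such fixed-length input windows and the R^2 score might be built is given below; the pairing of each window with the PWP value at the window's last step is an assumption, and the names `make_windows` and `r_squared` are hypothetical.

```python
def make_windows(rainfall, pwp, length):
    """Pair each `length`-step rainfall window with the PWP at its last step.

    Assumption: the target is the PWP at the window's final time step; the
    source does not specify the exact input/target alignment.
    """
    if len(rainfall) != len(pwp):
        raise ValueError("rainfall and PWP series must have equal length")
    if not 1 <= length <= len(rainfall):
        raise ValueError("window length must be between 1 and the series length")
    inputs, targets = [], []
    for end in range(length, len(rainfall) + 1):
        inputs.append(rainfall[end - length:end])
        targets.append(pwp[end - 1])
    return inputs, targets


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_observed = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_observed) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot
```

Repeating the training with several values of `length` and comparing R^2 and RMSE is one way to reproduce the kind of sensitivity study the abstract mentions.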
The remaining useful life (RUL) of a system is generally predicted using data collected from sensors that continuously monitor different indicators. Recently, various deep learning (DL) techniques have been used for RUL prediction with great success. Because the data are often time-sequential, the recurrent neural network (RNN) has attracted significant interest due to its efficiency in dealing with such data. This paper systematically reviews the RNN and its variants for RUL prediction, with a specific focus on understanding how different components (e.g., types of optimisers and activation functions) or parameters (e.g., sequence length, number of neurons) affect their performance. A case study using the well-studied NASA C-MAPSS dataset is then presented to quantitatively evaluate the influence of various state-of-the-art RNN structures on RUL prediction performance. The results suggest that the variant methods usually perform better than the original RNN, and that among them the bidirectional long short-term memory generally has the best performance in terms of stability, precision, and accuracy. Certain model structures may fail to produce valid RUL predictions because of the vanishing or exploding gradient problem if the parameters are not chosen appropriately. It is concluded that parameter tuning is a crucial step in achieving optimal prediction performance.
Oil leakage between the slipper and swash plate of an axial piston pump has a significant effect on the efficiency of the pump. Therefore, it is extremely important that any leakage can be predicted. This study investigates the leakage, oil film thickness, and pocket pressure values of a slipper with circular dimples under different working conditions. The results reveal that flat slippers suffer less leakage than those with textured surfaces. In addition, a deep learning-based framework is proposed for modeling the slipper behavior. This framework is a long short-term memory-based deep neural network, a type of network that has been extremely successful in predicting time series. The model is compared with four conventional machine learning methods, and statistical analyses and comparisons confirm the superiority of the proposed model.
To supplement missing logging information without increasing economic cost, a machine learning method for generating synthetic well logs from existing log data is presented, and experimental verification and application effect analysis are carried out. Since the traditional Fully Connected Neural Network (FCNN) is incapable of preserving spatial dependency, the Long Short-Term Memory (LSTM) network, which is a kind of Recurrent Neural Network (RNN), is used to establish a method for log reconstruction. With this method, synthetic logs can be generated from series of input log data while taking into account the variation trend and context information with depth. In addition, a cascaded LSTM is proposed by combining the standard LSTM with a cascade system. Testing on real well log data shows that the results from the LSTM are more accurate than those of the traditional FCNN, that the cascaded LSTM is more suitable for problems with multiple series of data, and that the proposed machine learning method provides an accurate and cost-effective way to generate synthetic well logs.
A tremendous number of vendor invoices are generated in the corporate sector. To automate the manual data entry in payable documents, highly accurate Optical Character Recognition (OCR) is required. This paper proposes an end-to-end OCR system that performs both localization and recognition and serves as a single unit to automate the processing of payable documents such as cheques and cash disbursements. For text localization, the maximally stable extremal region is used, which extracts a word or digit chunk from an invoice. This chunk is then passed to the deep learning model, which performs text recognition. The deep learning model uses both convolutional neural networks and long short-term memory (LSTM). The convolution layer is used to extract features, which are fed to the LSTM. The model integrates feature extraction, sequence modeling, and transcription into a unified network. It handles sequences of unconstrained length, independent of character segmentation or horizontal scale normalization. Furthermore, it applies to both lexicon-free and lexicon-based text recognition, and it produces a comparatively small model that can be deployed in practical applications. The overall superior performance in the experimental evaluation demonstrates the usefulness of the proposed model. The model is thus generic and can be used for other similar recognition scenarios.
The identification of individuals through ear images is a prominent area of study in the biometric sector. Facial recognition systems have faced challenges during the COVID-19 pandemic due to mask-wearing, prompting the exploration of supplementary biometric measures such as ear biometrics. This research proposes a Deep Learning (DL) framework, termed DeepBio, that uses ear biometrics for human identification. It employs two DL models and five datasets, including IIT Delhi (IITD-I and IITD-II), annotated web images (AWI), mathematical analysis of images (AMI), and EARVN1. Data augmentation techniques such as flipping, translation, and Gaussian noise are applied to enhance model performance and mitigate overfitting. Feature extraction and human identification are conducted using a hybrid approach combining Convolutional Neural Networks (CNN) and Bidirectional Long Short-Term Memory (Bi-LSTM). The DeepBio framework achieves high recognition rates of 97.97%, 99.37%, 98.57%, 94.5%, and 96.87% on the respective datasets. Comparative analysis with existing techniques demonstrates improvements of 0.41%, 0.47%, 12%, and 9.75% on the IITD-II, AMI, AWE, and EARVN1 datasets, respectively.
As a common and high-risk type of disease, heart disease seriously threatens people's health. At the same time, in the era of the Internet of Things (IoT), smart medical devices have strong practical significance for medical workers and patients because of their ability to assist in the diagnosis of diseases. Therefore, research on real-time diagnosis and classification algorithms for arrhythmia can help to improve diagnostic efficiency. In this paper, we design an automatic arrhythmia classification model based on a Convolutional Neural Network (CNN) and an Encoder-Decoder model. The model uses Long Short-Term Memory (LSTM) to account for the influence of time series features on the classification results, and it is trained and tested on the MIT-BIH arrhythmia database. In addition, Generative Adversarial Networks (GAN) are adopted as a data equalization method to address the data imbalance problem. The simulation results show that, for inter-patient arrhythmia classification, the hybrid model combining the CNN and the Encoder-Decoder model has the best classification accuracy, reaching 94.05%. It is especially advantageous for the classification of supraventricular ectopic beats (class S) and fusion beats (class F).
Lithium-ion batteries are commonly used in electric vehicles, mobile phones, and laptops. These batteries offer several advantages, such as environmental friendliness, high energy density, and long life. However, battery overcharging and overdischarging may occur if the batteries are not monitored continuously. Overcharging causes fire and explosion casualties, and overdischarging reduces the battery capacity and life. In addition, the internal resistance of such batteries varies depending on their external temperature, electrolyte, cathode material, and other factors; the capacity of the batteries decreases with temperature. In this study, we develop a method for estimating the state of charge (SOC) using the neural network model that is best suited to the external temperature of such batteries based on their characteristics. During our simulation, we acquired data at temperatures of 25°C, 30°C, 35°C, and 40°C. Based on the temperature parameters, the voltage, current, and time parameters were obtained, and six cycles of the parameters for each temperature were used for the experiment. Experimental data to verify the proposed method were obtained through a discharge experiment conducted using a vehicle driving simulator. The experimental data were provided as inputs to three types of neural network models: multilayer neural network (MNN), long short-term memory (LSTM), and gated recurrent unit (GRU). The neural network models were trained and optimized for the specific temperatures measured during the experiment, and the SOC was estimated by selecting the most suitable model for each temperature. The experimental results revealed that the mean absolute errors of the MNN, LSTM, and GRU using the proposed method were 2.17%, 2.19%, and 2.15%, respectively, which are better than those of the conventional method (4.47%, 4.60%, and 4.40%). Finally, the error of SOC estimation based on the GRU using the proposed method was 2.15%, which was the most accurate.
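The idea of "selecting the most suitable model for each temperature" can be sketched as a simple lookup over per-temperature validation errors. This is an assumption about how the selection might be organized, not the authors' procedure; the function name `pick_best_per_temperature` is hypothetical, and the error values below are made-up placeholders rather than results from the study.

```python
def pick_best_per_temperature(validation_mae):
    """Map each temperature to the model name with the lowest validation MAE.

    `validation_mae` has the form {temperature_c: {model_name: mae_percent}}.
    Ties go to the model listed first.
    """
    best = {}
    for temperature, scores in validation_mae.items():
        if not scores:
            raise ValueError(f"no models evaluated at {temperature} degrees C")
        best[temperature] = min(scores, key=scores.get)
    return best


# Placeholder numbers for illustration only; NOT taken from the paper.
example_mae = {
    25: {"MNN": 2.4, "LSTM": 2.1, "GRU": 2.2},
    30: {"MNN": 2.0, "LSTM": 2.3, "GRU": 1.9},
    35: {"MNN": 2.2, "LSTM": 2.2, "GRU": 2.5},
    40: {"MNN": 2.6, "LSTM": 2.0, "GRU": 2.1},
}
chosen = pick_best_per_temperature(example_mae)
```

At inference time, the measured external temperature would then determine which trained model is used to estimate the SOC.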
In the electricity market, real-time prices fluctuate unstably, and changes in short-term load are determined by many factors. By studying the timing of charging and discharging, as well as the economic benefits of energy storage when participating in the power market, this paper treats energy storage scheduling as one of the factors affecting short-term power load, which influences the short-term load time series along with time-of-use price, holidays, and temperature. A deep learning network is used to predict the short-term load: a convolutional neural network (CNN) extracts the features, and a long short-term memory (LSTM) network learns the temporal characteristics of the load value, which can effectively improve prediction accuracy. Taking the load data of a certain region as an example, the CNN-LSTM prediction model is compared with the single LSTM prediction model. The experimental results show that the CNN-LSTM deep learning network, with energy storage participating in dispatching, achieves high prediction accuracy for short-term power load forecasting.
Funding: Supported by the National Major Science and Technology Special Project (No. 2016ZX05026-002).
Funding: The National Key R&D Program of China under contract No. 2016YFC1402103.
文摘A correct and timely fault diagnosis is important for improving the safety and reliability of chemical processes. With the advancement of big data technology, data-driven fault diagnosis methods are being extensively used and still have considerable potential. In recent years, methods based on deep neural networks have made significant breakthroughs, and fault diagnosis methods for industrial processes based on deep learning have attracted considerable research attention. Therefore, we propose a fusion deeplearning algorithm based on a fully convolutional neural network(FCN) to extract features and build models to correctly diagnose all types of faults. We use long short-term memory(LSTM) units to expand our proposed FCN so that our proposed deep learning model can better extract the time-domain features of chemical process data. We also introduce the attention mechanism into the model, aimed at highlighting the importance of features, which is significant for the fault diagnosis of chemical processes with many features. When applied to the benchmark Tennessee Eastman process, our proposed model exhibits impressive performance, demonstrating the effectiveness of the attention-based LSTM FCN in chemical process fault diagnosis.
Abstract: Hand gestures are a natural way for humans to interact with robots. Vision-based dynamic hand gesture recognition has become a hot research topic due to its various applications. This paper presents a novel deep learning network for hand gesture recognition. The network integrates several well-proven modules to learn both short-term and long-term features from video inputs while avoiding intensive computation. To learn short-term features, each video input is segmented into a fixed number of frame groups. A frame is randomly selected from each group and represented as an RGB image as well as an optical flow snapshot. These two entities are fused and fed into a convolutional neural network (ConvNet) for feature extraction. The ConvNets for all groups share parameters. To learn long-term features, the outputs from all ConvNets are fed into a long short-term memory (LSTM) network, which predicts the final classification result. The new model has been tested with two popular hand gesture datasets, namely the Jester dataset and the Nvidia dataset. Compared with other models, our model produced very competitive results. The robustness of the new model has also been demonstrated with an augmented dataset with enhanced diversity of hand gestures.
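The frame-sampling step described in the abstract above (split the video into a fixed number of frame groups, then pick one random frame per group) can be illustrated with a small Python sketch. It is a hedged illustration rather than the paper's implementation: the function name `sample_frame_indices`, the use of contiguous and near-equal groups, and the optional `rng` argument are all assumptions.

```python
import random


def sample_frame_indices(num_frames, num_groups, rng=None):
    """Pick one random frame index from each of num_groups contiguous groups.

    Frames are indexed 0 .. num_frames - 1. Group sizes differ by at most
    one frame (an assumption; the abstract only says "a fixed number of
    frame groups").
    """
    if num_groups <= 0 or num_frames < num_groups:
        raise ValueError("need at least one frame per group")
    rng = rng or random.Random()
    # Integer arithmetic gives group boundaries that cover every frame once.
    bounds = [i * num_frames // num_groups for i in range(num_groups + 1)]
    return [rng.randrange(bounds[i], bounds[i + 1]) for i in range(num_groups)]
```

Each returned index would then be used to fetch the RGB frame and its optical flow snapshot for the shared ConvNet; because the number of groups is fixed, the LSTM always receives a sequence of the same length regardless of the video's duration.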
Funding: Supported by the "Human Resources Program in Energy Technology" of the Korea Institute of Energy Technology Evaluation and Planning (KETEP), with financial resources granted by the Ministry of Trade, Industry, and Energy, Korea (No. 20204010600090).
Abstract: This research aims to enhance Clinical Decision Support Systems (CDSS) within Wireless Body Area Networks (WBANs) by leveraging advanced machine learning techniques. Specifically, we target the challenges of accurate diagnosis in medical imaging and sequential data analysis using Recurrent Neural Networks (RNNs) with Long Short-Term Memory (LSTM) layers and echo state cells. These models are tailored to improve diagnostic precision, particularly for conditions such as rotator cuff tears in osteoporosis patients and gastrointestinal diseases. Traditional diagnostic methods and existing CDSS frameworks often fall short in managing complex, sequential medical data: they struggle with long-term dependencies and data imbalances, resulting in suboptimal accuracy and delayed decisions. Our goal is to develop Artificial Intelligence (AI) models that address these shortcomings and offer robust, real-time diagnostic support. We propose a hybrid RNN model that integrates SimpleRNN, LSTM layers, and echo state cells to manage long-term dependencies effectively. Additionally, we introduce CG-Net, a novel Convolutional Neural Network (CNN) framework for gastrointestinal disease classification, which outperforms traditional CNN models. We further enhance model performance through data augmentation and transfer learning, improving generalization and robustness against data scarcity and imbalance. Comprehensive validation, including 5-fold cross-validation and metrics such as accuracy, precision, recall, F1-score, and Area Under the Curve (AUC), confirms the models' reliability. Moreover, SHapley Additive exPlanations (SHAP) and Local Interpretable Model-agnostic Explanations (LIME) are employed to improve model interpretability. Our findings show that the proposed models significantly enhance diagnostic accuracy and efficiency, offering substantial advancements in WBANs and CDSS.
Abstract: Audiovisual speech recognition is an emerging research topic. Lipreading is the recognition of what someone is saying using visual information, primarily lip movements. In this study, we created a custom dataset for Indian English linguistics and organized the work into three main categories: (1) audio recognition, (2) visual feature extraction, and (3) combined audio and visual recognition. Audio features were extracted using the mel-frequency cepstral coefficient, and classification was performed using a one-dimensional convolutional neural network. Visual features were extracted using Dlib, and visual speech was then classified using a long short-term memory type of recurrent neural network. Finally, integration was performed using a deep convolutional network. The audio speech of Indian English was successfully recognized with accuracies of 93.67% and 91.53%, respectively, using testing data from 200 epochs. The training accuracy for visual speech recognition using the Indian English dataset was 77.48% and the test accuracy was 76.19% using 60 epochs. After integration, the accuracies of audiovisual speech recognition using the Indian English dataset for training and testing were 94.67% and 91.75%, respectively.
Funding: Supported by the Natural Science Foundation of China (Grant Nos. 51979158, 51639008, 51679135, and 51422905) and the Program of Shanghai Academic Research Leader by the Science and Technology Commission of Shanghai Municipality (Project No. 19XD1421900).
Abstract: Knowledge of pore-water pressure (PWP) variation is fundamental for slope stability. A precise prediction of PWP is difficult due to complex physical mechanisms and in situ natural variability. To explore the applicability and advantages of recurrent neural networks (RNNs) for PWP prediction, three variants of RNNs, i.e., the standard RNN, long short-term memory (LSTM) and the gated recurrent unit (GRU), are adopted and compared with a traditional static artificial neural network (ANN), i.e., the multi-layer perceptron (MLP). Measurements of rainfall and PWP from representative piezometers on a fully instrumented natural slope in Hong Kong are used to establish the prediction models. The coefficient of determination (R^2) and root mean square error (RMSE) are used for model evaluation. The influence of input time series length on model performance is investigated. The results reveal that the MLP can provide acceptable performance but is not robust. The uncertainty bounds of the RMSE of the MLP model range from 0.24 kPa to 1.12 kPa for the two selected piezometers. The standard RNN can perform better, but its robustness is slightly affected when there are significant time lags between PWP changes and rainfall. The GRU and LSTM models can provide more precise and robust predictions than the standard RNN. The effects of the hidden layer structure and the dropout technique are investigated. The single-layer GRU is accurate enough for PWP prediction, whereas a double-layer GRU brings extra time cost with little accuracy improvement. The dropout technique is essential for preventing overfitting and improving accuracy.
Funding: Supported by U.K. EPSRC Platform Grant (Grant No. EP/P027121/1).
Abstract: The remaining useful life (RUL) of a system is generally predicted by utilising the data collected from sensors that continuously monitor different indicators. Recently, different deep learning (DL) techniques have been used for RUL prediction and have achieved great success. Because the data is often time-sequential, the recurrent neural network (RNN) has attracted significant interest due to its efficiency in dealing with such data. This paper systematically reviews the RNN and its variants for RUL prediction, with a specific focus on understanding how different components (e.g., types of optimisers and activation functions) or parameters (e.g., sequence length, neuron quantities) affect their performance. After that, a case study using the well-studied NASA C-MAPSS dataset is presented to quantitatively evaluate the influence of various state-of-the-art RNN structures on RUL prediction performance. The results suggest that the variant methods usually perform better than the original RNN and that, among them, the Bi-directional Long Short-Term Memory generally has the best performance in terms of stability, precision and accuracy. Certain model structures may fail to produce valid RUL predictions because of the vanishing or exploding gradient problem if the parameters are not chosen appropriately. It is concluded that parameter tuning is a crucial step in achieving optimal prediction performance.
Funding: Supported by Erciyes University Scientific Research Projects Coordination Unit (Grant No. FDK-2016-6986).
Abstract: Oil leakage between the slipper and swash plate of an axial piston pump has a significant effect on the efficiency of the pump. Therefore, it is extremely important that any leakage can be predicted. This study investigates the leakage, oil film thickness, and pocket pressure values of a slipper with circular dimples under different working conditions. The results reveal that flat slippers suffer less leakage than those with textured surfaces. In addition, a deep learning-based framework is proposed for modeling the slipper behavior. This framework is a long short-term memory-based deep neural network, a type of model that has been extremely successful in predicting time series. The model is compared with four conventional machine learning methods, and statistical analyses and comparisons confirm the superiority of the proposed model.
Funding: Supported by the National Natural Science Foundation of China (U1663208, 51520105005) and the National Science and Technology Major Project of China (2017ZX05009-005, 2016ZX05037-003).
Abstract: To supplement missing logging information without increasing economic cost, a machine learning method for generating synthetic well logs from existing log data is presented, and experimental verification and an analysis of its application effect are carried out. Since the traditional Fully Connected Neural Network (FCNN) is incapable of preserving spatial dependency, the Long Short-Term Memory (LSTM) network, which is a kind of Recurrent Neural Network (RNN), is used to establish a method for log reconstruction. With this method, synthetic logs can be generated from series of input log data while taking into account the variation trend and context information with depth. In addition, a cascaded LSTM is proposed by combining the standard LSTM with a cascade system. Testing on real well log data shows that the results from the LSTM are more accurate than those from the traditional FCNN, that the cascaded LSTM is more suitable for problems with multiple series of data, and that the proposed machine learning method provides an accurate and cost-effective way to generate synthetic well logs.
Funding: The researchers would like to thank the Deanship of Scientific Research, Qassim University, for funding the publication of this project.
Abstract: A tremendous number of vendor invoices are generated in the corporate sector. To automate manual data entry for payable documents, highly accurate Optical Character Recognition (OCR) is required. This paper proposes an end-to-end OCR system that performs both localization and recognition and serves as a single unit to automate the processing of payable documents such as cheques and cash disbursements. For text localization, the maximally stable extremal region is used, which extracts a word or digit chunk from an invoice. This chunk is then passed to the deep learning model, which performs text recognition. The deep learning model uses both convolutional neural networks and long short-term memory (LSTM). The convolutional layers extract features, which are fed to the LSTM. The model integrates feature extraction, sequence modeling, and transcription into a unified network. It handles sequences of unconstrained length, independent of character segmentation or horizontal scale normalization. Furthermore, it applies to both lexicon-free and lexicon-based text recognition, and it produces a comparatively small model that can be deployed in practical applications. The overall superior performance in the experimental evaluation demonstrates the usefulness of the proposed model. The model is thus generic and can be used for other similar recognition scenarios.
Abstract: The identification of individuals through ear images is a prominent area of study in the biometric sector. Facial recognition systems faced challenges during the COVID-19 pandemic due to mask-wearing, prompting the exploration of supplementary biometric measures such as ear biometrics. This research proposes a Deep Learning (DL) framework, termed DeepBio, that uses ear biometrics for human identification. It employs two DL models and five datasets, including IIT Delhi (IITD-I and IITD-II), annotated web images (AWI), mathematical analysis of images (AMI), and EARVN1. Data augmentation techniques such as flipping, translation, and Gaussian noise are applied to enhance model performance and mitigate overfitting. Feature extraction and human identification are conducted using a hybrid approach combining Convolutional Neural Networks (CNN) and Bidirectional Long Short-Term Memory (Bi-LSTM). The DeepBio framework achieves high recognition rates of 97.97%, 99.37%, 98.57%, 94.5%, and 96.87% on the respective datasets. Comparative analysis with existing techniques demonstrates improvements of 0.41%, 0.47%, 12%, and 9.75% on the IITD-II, AMI, AWE, and EARVN1 datasets, respectively.
Funding: Fundamental Research Funds for the Central Universities (Grant No. FRF-TP-19-006A3).
Abstract: As a common and high-risk type of disease, heart disease seriously threatens people's health. At the same time, in the era of the Internet of Things (IoT), smart medical devices have strong practical significance for medical workers and patients because of their ability to assist in the diagnosis of diseases. Therefore, research on real-time diagnosis and classification algorithms for arrhythmia can help improve diagnostic efficiency. In this paper, we design an automatic arrhythmia classification model based on a Convolutional Neural Network (CNN) and an Encoder-Decoder model. The model uses Long Short-Term Memory (LSTM) to account for the influence of time series features on the classification results. It is trained and tested on the MIT-BIH arrhythmia database. In addition, a Generative Adversarial Network (GAN) is adopted as a data equalization method to address the data imbalance problem. The simulation results show that, for inter-patient arrhythmia classification, the hybrid model combining the CNN and the Encoder-Decoder model has the best classification accuracy, reaching 94.05%. It is particularly advantageous for classifying supraventricular ectopic beats (class S) and fusion beats (class F).
Funding: Supported by the BK21 FOUR project funded by the Ministry of Education, Korea (4199990113966).
Abstract: Lithium-ion batteries are commonly used in electric vehicles, mobile phones, and laptops. These batteries offer several advantages, such as environmental friendliness, high energy density, and long life. However, battery overcharging and overdischarging may occur if the batteries are not monitored continuously. Overcharging causes fire and explosion casualties, and overdischarging reduces battery capacity and life. In addition, the internal resistance of such batteries varies depending on their external temperature, electrolyte, cathode material, and other factors; the capacity of the batteries decreases with temperature. In this study, we develop a method for estimating the state of charge (SOC) using the neural network model that is best suited to the external temperature of such batteries based on their characteristics. During our simulation, we acquired data at temperatures of 25°C, 30°C, 35°C, and 40°C. Based on the temperature parameters, the voltage, current, and time parameters were obtained, and six cycles of the parameters based on the temperature were used for the experiment. Experimental data to verify the proposed method were obtained through a discharge experiment conducted using a vehicle driving simulator. The experimental data were provided as inputs to three types of neural network models: multilayer neural network (MNN), long short-term memory (LSTM), and gated recurrent unit (GRU). The neural network models were trained and optimized for the specific temperatures measured during the experiment, and the SOC was estimated by selecting the most suitable model for each temperature. The experimental results revealed that the mean absolute errors of the MNN, LSTM, and GRU using the proposed method were 2.17%, 2.19%, and 2.15%, respectively, which are better than those of the conventional method (4.47%, 4.60%, and 4.40%). Finally, SOC estimation based on the GRU using the proposed method, with a mean absolute error of 2.15%, was found to be the most accurate.
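The idea of keeping one trained model per temperature and picking the most suitable one at estimation time can be sketched as a simple lookup. This is only an illustration under stated assumptions: the abstract does not say how "most suitable" is decided, so nearest trained temperature is assumed here, and `select_model` and the placeholder models are hypothetical stand-ins rather than trained networks.

```python
def select_model(models_by_temp, ambient_temp_c):
    """Return (trained_temp, model) for the trained temperature closest
    to ambient_temp_c. Nearest-temperature matching is an assumption."""
    nearest = min(models_by_temp, key=lambda t: abs(t - ambient_temp_c))
    return nearest, models_by_temp[nearest]


# Placeholder callables standing in for networks trained at each of the
# four temperatures listed in the abstract (25, 30, 35 and 40 degrees C).
placeholder_models = {
    temp: (lambda features, temp=temp: f"SOC estimate from {temp} C model")
    for temp in (25, 30, 35, 40)
}

trained_temp, model = select_model(placeholder_models, 33.0)
```

In a real pipeline each dictionary value would be a trained MNN, LSTM, or GRU that takes the voltage, current, and time features and returns the SOC.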
Funding: Supported by a State Grid Zhejiang Electric Power Co., Ltd. Economic and Technical Research Institute Project (Key Technologies and Empirical Research of Diversified Integrated Operation of User-Side Energy Storage in Power Market Environment, No. 5211JY19000W) and by the National Natural Science Foundation of China (Research on Power Market Management to Promote Large-Scale New Energy Consumption, No. 71804045).
Abstract: In the electricity market, real-time prices are volatile, and changes in short-term load are determined by many factors. By studying the timing of charging and discharging, as well as the economic benefits of energy storage participating in the power market, this paper treats energy storage scheduling as one of the factors affecting the short-term load time series, alongside time-of-use price, holidays, and temperature. A deep learning network is used to predict the short-term load: a convolutional neural network (CNN) extracts the features, and a long short-term memory (LSTM) network learns the temporal characteristics of the load value, which can effectively improve prediction accuracy. Taking the load data of a certain region as an example, the CNN-LSTM prediction model is compared with a single LSTM prediction model. The experimental results show that the CNN-LSTM deep learning network, with energy storage participating in dispatching, achieves high prediction accuracy for short-term power load forecasting.