15 articles found
Landslide displacement prediction based on optimized empirical mode decomposition and deep bidirectional long short-term memory network (Cited by: 1)
1
Authors: ZHANG Ming-yue, HAN Yang, YANG Ping, WANG Cong-ling. Journal of Mountain Science (SCIE, CSCD), 2023, No. 3, pp. 637-656 (20 pages)
There are two technical challenges in predicting slope deformation. The first is the random displacement, which cannot be decomposed and predicted by numerically resolving the observed accumulated displacement and time series of a landslide. The second is the dynamic evolution of a landslide, which cannot be feasibly simulated by traditional prediction models alone. In this paper, a dynamic model of displacement prediction is introduced for composite landslides based on a combination of empirical mode decomposition with soft screening stop criteria (SSSC-EMD) and a deep bidirectional long short-term memory (DBi-LSTM) neural network. In the proposed model, time series analysis and SSSC-EMD are used to decompose the observed accumulated displacements of a slope into three components, viz. trend displacement, periodic displacement, and random displacement. Then, by analyzing the evolution pattern of a landslide and the key factors triggering it, appropriate influencing factors are selected for each displacement component, and the DBi-LSTM neural network carries out multi-data-driven dynamic prediction for each displacement component. The accumulated displacement prediction is obtained by summing the component predictions. To verify the accuracy and engineering practicability of the model, field observations from two known landslides in China, the Xintan landslide and the Bazimen landslide, were collected for comparison and evaluation. The case study verified that the proposed model can better characterize the "stepwise" deformation characteristics of a slope. Compared with the long short-term memory (LSTM) neural network, support vector machine (SVM), and autoregressive integrated moving average (ARIMA) model, the DBi-LSTM neural network has higher accuracy in predicting the periodic displacement of slope deformation, with the mean absolute percentage error reduced by 3.063%, 14.913%, and 13.960%, respectively, and the root mean square error reduced by 1.951 mm, 8.954 mm, and 7.790 mm, respectively. In conclusion, this model not only has high prediction accuracy but is also more stable, which can provide new insight for practical landslide prevention and control engineering.
Keywords: Landslide displacement; Empirical mode decomposition; Soft screening stop criteria; Deep bidirectional long short-term memory neural network; Xintan landslide; Bazimen landslide
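The decompose, predict per component, and sum workflow in this abstract, together with its two error measures, can be illustrated with a minimal sketch. It assumes a plain centered moving average in place of SSSC-EMD, splits the series into only two parts (trend and residual) rather than three, and uses made-up displacement readings; the function names are hypothetical and nothing here is the authors' implementation.

```python
import math


def moving_average(series, window):
    """Centered moving average; the window shrinks at the edges."""
    half = window // 2
    out = []
    for i in range(len(series)):
        lo, hi = max(0, i - half), min(len(series), i + half + 1)
        out.append(sum(series[lo:hi]) / (hi - lo))
    return out


def decompose(series, window=5):
    """Split a series into a smooth trend and a residual (periodic + random)."""
    trend = moving_average(series, window)
    residual = [x - t for x, t in zip(series, trend)]
    return trend, residual


def rmse(actual, predicted):
    """Root mean square error, in the units of the series."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def mape(actual, predicted):
    """Mean absolute percentage error in percent; actual values must be non-zero."""
    n = len(actual)
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / n


# Hypothetical accumulated displacement readings in mm (not from the paper).
displacement = [10.0, 12.0, 15.0, 15.5, 16.0, 20.0, 24.0, 24.5, 25.0, 30.0]
trend, residual = decompose(displacement)

# Summing the components recovers the accumulated displacement.
reconstructed = [t + r for t, r in zip(trend, residual)]
```

In a full pipeline each component would get its own predictor, and `rmse` and `mape` would be computed between the observed series and the sum of the component forecasts.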
Seismic-inversion method for nonlinear mapping multilevel well–seismic matching based on bidirectional long short-term memory networks
2
Authors: Yue You-Xi, Wu Jia-Wei, Chen Yi-Du. Applied Geophysics (SCIE, CSCD), 2022, No. 2, pp. 244-257, 308 (15 pages)
In this paper, the recurrent neural network structure of a bidirectional long short-term memory network (Bi-LSTM) with special memory cells that store information is used to characterize the deep features of the variation pattern between logging and seismic data. A mapping relationship model between high-frequency logging data and low-frequency seismic data is established via nonlinear mapping. The seismic waveform is infinitely approximated using the logging curve in the low-frequency band to obtain a nonlinear mapping model of this scale, which then approaches the logging curve in the high-frequency band stepwise. Finally, a seismic-inversion method of nonlinear mapping multilevel well–seismic matching based on the Bi-LSTM network is developed. The characteristic of this method is that, by applying the multilevel well–seismic matching process, the seismic data are matched stepwise to the scale range that is consistent with the logging curve. Further, the matching operator at each level can be stably obtained to effectively overcome the problems that occur in the well–seismic matching process, such as the inconsistency in the scale of the two types of data, the accuracy in extracting the seismic wavelet of the well-side seismic traces, and the multiplicity of solutions. Model tests and practical application demonstrate that this method improves the vertical resolution of inversion results while maintaining the boundary and lateral characteristics of the sand body, which improves the accuracy of thin-layer sand body prediction and the practical application effect.
Keywords: bidirectional recurrent neural networks; long short-term memory; nonlinear mapping; well–seismic matching; seismic inversion
Continuous Sign Language Recognition Based on Spatial-Temporal Graph Attention Network (Cited by: 2)
3
Authors: Qi Guo, Shujun Zhang, Hui Li. Computer Modeling in Engineering & Sciences (SCIE, EI), 2023, No. 3, pp. 1653-1670 (18 pages)
Continuous sign language recognition (CSLR) is challenging due to the complexity of video background, hand gesture variability, and temporal modeling difficulties. This work proposes a CSLR method based on a spatial-temporal graph attention network to focus on essential features of video series. The method considers local details of sign language movements by taking the information on joints and bones as inputs and constructing a spatial-temporal graph to reflect inter-frame relevance and physical connections between nodes. The graph-based multi-head attention mechanism is utilized with adjacency matrix calculation for better local-feature exploration, and short-term motion correlation modeling is completed via a temporal convolutional network. We adopted BLSTM to learn the long-term dependence and connectionist temporal classification to align the word-level sequences. The proposed method achieves competitive results regarding word error rates (1.59%) on the Chinese Sign Language dataset and the mean Jaccard Index (65.78%) on the ChaLearn LAP Continuous Gesture Dataset.
Keywords: Continuous sign language recognition; graph attention network; bidirectional long short-term memory; connectionist temporal classification
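The word error rate quoted in this abstract is conventionally the word-level edit distance between the recognized sentence and the reference, divided by the reference length. The sketch below shows that standard definition; the example sentences are invented, and the paper's exact evaluation script is not given in the listing, so this is an assumption about the metric rather than a reproduction of it.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences (row-by-row DP)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref, hyp) / len(ref)


# Hypothetical gloss sequences: one deletion ("to") and one substitution.
wer = word_error_rate("i want to drink water", "i want drink tea")
```

Because insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference.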
A Time Series Intrusion Detection Method Based on SSAE,TCN and Bi-LSTM
4
Authors: Zhenxiang He, Xunxi Wang, Chunwei Li. Computers, Materials & Continua (SCIE, EI), 2024, No. 1, pp. 845-871 (27 pages)
In the fast-evolving landscape of digital networks, the incidence of network intrusions has escalated alarmingly. Simultaneously, the crucial role of time series data in intrusion detection remains largely underappreciated, with most systems failing to capture the time-bound nuances of network traffic. This leads to compromised detection accuracy and overlooked temporal patterns. Addressing this gap, we introduce a novel SSAE-TCN-BiLSTM (STL) model that integrates time series analysis, significantly enhancing detection capabilities. Our approach reduces feature dimensionality with a Stacked Sparse Autoencoder (SSAE) and extracts temporally relevant features through a Temporal Convolutional Network (TCN) and Bidirectional Long Short-term Memory Network (Bi-LSTM). By meticulously adjusting time steps, we underscore the significance of temporal data in bolstering detection accuracy. On the UNSW-NB15 dataset, our model achieved an F1-score of 99.49%, Accuracy of 99.43%, Precision of 99.38%, Recall of 99.60%, and an inference time of 4.24 s. For the CICIDS2017 dataset, we recorded an F1-score of 99.53%, Accuracy of 99.62%, Precision of 99.27%, Recall of 99.79%, and an inference time of 5.72 s. These findings confirm not only the STL model's superior performance but also its operational efficiency, underpinning its significance in real-world cybersecurity scenarios where rapid response is paramount. Our contribution represents a significant advance in cybersecurity, proposing a model that excels in accuracy and adaptability to the dynamic nature of network traffic, setting a new benchmark for intrusion detection systems.
Keywords: network intrusion detection; bidirectional long short-term memory network; time series; stacked sparse autoencoder; temporal convolutional network; time steps
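The four scores reported in this abstract are related by standard definitions, which the following sketch spells out. The confusion-matrix counts are hypothetical and the function names are illustrative; only the precision and recall figures passed to `f1_from` come from the abstract.

```python
def f1_from(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def classification_metrics(tp, fp, fn, tn):
    """Binary detection metrics from confusion-matrix counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1_from(precision, recall),
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }


# Hypothetical counts: 90 attacks caught, 10 false alarms, 30 missed, 870 benign.
metrics = classification_metrics(tp=90, fp=10, fn=30, tn=870)

# Consistency check on the UNSW-NB15 figures quoted in the abstract.
reported_f1 = f1_from(0.9938, 0.9960)
```

With the quoted precision of 99.38% and recall of 99.60%, the harmonic mean comes out at about 99.49%, which agrees with the F1-score stated for UNSW-NB15.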
A New Industrial Intrusion Detection Method Based on CNN-BiLSTM
5
Authors: Jun Wang, Changfu Si, Zhen Wang, Qiang Fu. Computers, Materials & Continua (SCIE, EI), 2024, No. 6, pp. 4297-4318 (22 pages)
With the rapid development of industrial Internet technology, advanced industrial control systems (ICS) have improved industrial production efficiency. However, there are more and more cyber-attacks targeting industrial control systems. To ensure the security of industrial networks, intrusion detection systems have been widely used in industrial control systems, and deep neural networks have always been an effective method for identifying cyber attacks. Current intrusion detection methods still suffer from low accuracy and a high false alarm rate. Therefore, it is important to build a more efficient intrusion detection model. This paper proposes a hybrid deep learning intrusion detection method based on convolutional neural networks and bidirectional long short-term memory neural networks (CNN-BiLSTM). To address the issue of imbalanced data within the dataset and improve the model's detection capabilities, the Synthetic Minority Over-sampling Technique-Edited Nearest Neighbors (SMOTE-ENN) algorithm is applied in the preprocessing phase. This algorithm is employed to generate synthetic instances for the minority class while mitigating the impact of noise in the majority class. This approach aims to create a more equitable distribution of classes, thereby enhancing the model's ability to identify patterns in both minority and majority classes. In the experimental phase, the detection performance of the method is verified using two data sets. Experimental results show that the accuracy rate on the CICIDS-2017 data set reaches 97.7%. On the natural gas pipeline dataset collected by Lan Turnipseed from Mississippi State University in the United States, the accuracy rate reaches 85.5%.
Keywords: Intrusion detection; convolutional neural network; bidirectional long short-term memory neural network; multi-head self-attention mechanism
Ultra-short-term Interval Prediction of Wind Power Based on Graph Neural Network and Improved Bootstrap Technique (Cited by: 3)
6
Authors: Wenlong Liao, Shouxiang Wang, Birgitte Bak-Jensen, Jayakrishnan Radhakrishna Pillai, Zhe Yang, Kuangpu Liu. Journal of Modern Power Systems and Clean Energy (SCIE, EI, CSCD), 2023, No. 4, pp. 1100-1114 (15 pages)
Reliable and accurate ultra-short-term prediction of wind power is vital for the operation and optimization of power systems. However, the volatility and intermittence of wind power pose uncertainties to traditional point prediction, resulting in an increased risk of power system operation. To represent the uncertainty of wind power, this paper proposes a new method for ultra-short-term interval prediction of wind power based on a graph neural network (GNN) and an improved Bootstrap technique. Specifically, adjacent wind farms and local meteorological factors are modeled as a graph from the graph-theoretic perspective. Then, the graph convolutional network (GCN) and bi-directional long short-term memory (Bi-LSTM) are proposed to capture spatiotemporal features between nodes in the graph. To obtain high-quality prediction intervals (PIs), an improved Bootstrap technique is designed to increase the coverage percentage and narrow the PIs effectively. Numerical simulations demonstrate that the proposed method can capture the spatiotemporal correlations from the graph, and the prediction results outperform popular baselines on two real-world datasets, which implies a high potential for practical applications in power systems.
Keywords: Wind power; graph neural network (GNN); bidirectional long short-term memory (Bi-LSTM); prediction interval; Bootstrap technique
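To make the idea of a bootstrap prediction interval and its coverage concrete, here is a minimal sketch of the plain residual bootstrap. It is not the improved Bootstrap technique of the paper, whose details the abstract does not give; the point forecast, the residuals, and all function names are hypothetical.

```python
import random


def percentile(sorted_values, q):
    """Linearly interpolated percentile of a sorted list, q in [0, 1]."""
    pos = q * (len(sorted_values) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac


def bootstrap_interval(point_forecast, residuals, confidence=0.9,
                       n_resamples=2000, seed=0):
    """Resample past forecast errors around a point forecast to get a PI."""
    rng = random.Random(seed)
    simulated = sorted(point_forecast + rng.choice(residuals)
                       for _ in range(n_resamples))
    alpha = (1 - confidence) / 2
    return percentile(simulated, alpha), percentile(simulated, 1 - alpha)


def coverage(intervals, actuals):
    """Share of actual values that fall inside their prediction interval."""
    hits = sum(1 for (lo, hi), y in zip(intervals, actuals) if lo <= y <= hi)
    return hits / len(actuals)


# Hypothetical wind power forecast (MW) and past forecast errors (MW).
past_errors = [-3.0, -1.0, 0.0, 1.0, 2.0]
lower, upper = bootstrap_interval(50.0, past_errors)
```

Interval quality is usually judged on two axes at once: `coverage` should reach the nominal confidence level, while the width `upper - lower` should stay as small as possible.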
Construction of Human Digital Twin Model Based on Multimodal Data and Its Application in Locomotion Mode Identification (Cited by: 1)
7
Authors: Ruirui Zhong, Bingtao Hu, Yixiong Feng, Hao Zheng, Zhaoxi Hong, Shanhe Lou, Jianrong Tan. Chinese Journal of Mechanical Engineering (SCIE, EI, CAS, CSCD), 2023, No. 5, pp. 7-19 (13 pages)
With the increasing attention to the state and role of people in intelligent manufacturing, there is a strong demand for human-cyber-physical systems (HCPS) that focus on human-robot interaction. The existing intelligent manufacturing system cannot support efficient human-robot collaborative work. Moreover, unlike machines equipped with sensors, human characteristic information is difficult to perceive and digitize instantly. In view of the high complexity and uncertainty of the human body, this paper proposes a framework for building a human digital twin (HDT) model based on multimodal data and expounds on the key technologies. A data acquisition system is built to dynamically acquire and update the body state data and physiological data of the human body and realize the digital expression of multi-source heterogeneous human body information. A bidirectional long short-term memory and convolutional neural network (BiLSTM-CNN) based network is devised to fuse multimodal human data and extract the spatiotemporal features, and human locomotion mode identification is taken as an application case. A series of optimization experiments are carried out to improve the performance of the proposed BiLSTM-CNN-based network model. The proposed model is compared with traditional locomotion mode identification models. The experimental results demonstrate the superiority of the HDT framework for human locomotion mode identification.
Keywords: Human digital twin; Human-cyber-physical system; bidirectional long short-term memory; Convolutional neural network; Multimodal data
Bi-LSTM-Based Deep Stacked Sequence-to-Sequence Autoencoder for Forecasting Solar Irradiation and Wind Speed (Cited by: 1)
8
Authors: Neelam Mughees, Mujtaba Hussain Jaffery, Abdullah Mughees, Anam Mughees, Krzysztof Ejsmont. Computers, Materials & Continua (SCIE, EI), 2023, No. 6, pp. 6375-6393 (19 pages)
Wind and solar energy are two popular forms of renewable energy used in microgrids, facilitating the transition towards net-zero carbon emissions by 2050. However, they are exceedingly unpredictable since they rely highly on weather and atmospheric conditions. In microgrids, smart energy management systems, such as integrated demand response programs, are permanently established on a step-ahead basis, which means that accurate forecasting of wind speed and solar irradiance intervals is becoming increasingly crucial to the optimal operation and planning of microgrids. With this in mind, a novel "bidirectional long short-term memory network" (Bi-LSTM)-based, deep stacked, sequence-to-sequence autoencoder (S2SAE) forecasting model for predicting short-term solar irradiation and wind speed was developed and evaluated in MATLAB. To create a deep stacked S2SAE prediction model, a deep Bi-LSTM-based encoder and decoder are stacked on top of one another to reduce the dimension of the input sequence, extract its features, and then reconstruct it to produce the forecasts. Hyperparameters of the proposed deep stacked S2SAE forecasting model were optimized using the Bayesian optimization algorithm. Moreover, the forecasting performance of the proposed Bi-LSTM-based deep stacked S2SAE model was compared to three other deep and shallow stacked S2SAEs, i.e., the LSTM-based deep stacked S2SAE model, the gated recurrent unit-based deep stacked S2SAE model, and the Bi-LSTM-based shallow stacked S2SAE model. All these models were also optimized and modeled in MATLAB. The results simulated on actual data confirmed that the proposed model outperformed the alternatives by achieving an accuracy of up to 99.7%, which evidenced the high reliability of the proposed forecasting.
Keywords: Deep stacked autoencoder; sequence to sequence autoencoder; bidirectional long short-term memory network; wind speed forecasting; solar irradiation forecasting
Parallel Reinforcement Learning-Based Energy Efficiency Improvement for a Cyber-Physical System (Cited by: 17)
9
Authors: Teng Liu, Bin Tian, Yunfeng Ai, Fei-Yue Wang. IEEE/CAA Journal of Automatica Sinica (EI, CSCD), 2020, No. 2, pp. 617-626 (10 pages)
As a complex and critical cyber-physical system (CPS), the hybrid electric powertrain is significant for mitigating air pollution and improving fuel economy. The energy management strategy (EMS) plays a key role in improving the energy efficiency of this CPS. This paper presents a novel bidirectional long short-term memory (LSTM) network based parallel reinforcement learning (PRL) approach to construct an EMS for a hybrid tracked vehicle (HTV). This method contains two levels. The high level first establishes a parallel system, which includes a real powertrain system and an artificial system. Then, the synthesized data from this parallel system are used to train a bidirectional LSTM network. The lower level determines the optimal EMS using the trained action state function in the model-free reinforcement learning (RL) framework. PRL is a fully data-driven and learning-enabled approach that does not depend on any prediction or predefined rules. Finally, real vehicle testing is implemented, and relevant experiment data are collected and calibrated. Experimental results validate that the proposed EMS can achieve considerable energy efficiency improvement compared with the conventional RL approach and deep RL.
Keywords: bidirectional long short-term memory (LSTM) network; cyber-physical system (CPS); energy management; parallel system; reinforcement learning (RL)
Traffic flow prediction based on BILSTM model and data denoising scheme (Cited by: 4)
10
Authors: Zhong-Yu Li, Hong-Xia Ge, Rong-Jun Cheng. Chinese Physics B (SCIE, EI, CAS, CSCD), 2022, No. 4, pp. 191-200 (10 pages)
Accurate prediction of road traffic flow is a significant part of intelligent transportation systems. Accurate prediction can alleviate traffic congestion and reduce environmental pollution. For the management department, it enables effective use of road resources. For individuals, it can help people plan their own travel paths, avoid congestion, and save time. Owing to complex factors on the road, such as damage to the detector and disturbances from the environment, the measured traffic volume can contain noise. Reducing the influence of noise on traffic flow prediction is therefore very important work. In this paper we propose a combination algorithm of denoising and BILSTM to effectively improve the performance of traffic flow prediction. At the same time, three denoising algorithms are compared to find the best combination mode. The wavelet (WL) denoising scheme, the empirical mode decomposition (EMD) denoising scheme, and the ensemble empirical mode decomposition (EEMD) denoising scheme are all introduced to suppress outliers in traffic flow data. In addition, we combine the denoising schemes with a bidirectional long short-term memory (BILSTM) network to predict the traffic flow. The data in this paper are taken from the performance measurement system (PeMS). We choose three kinds of road data (mainline, off ramp, on ramp) to predict traffic flow. The results for mainline show that data denoising can improve prediction accuracy. Moreover, the prediction accuracy of the BILSTM+EEMD scheme is the highest among the three methods (BILSTM+WL, BILSTM+EMD, BILSTM+EEMD). The results for off ramp and on ramp show the same performance as the results for mainline. This indicates that the model is suitable for different road sections and long-term prediction.
Keywords: traffic flow prediction; bidirectional long short-term memory network; data denoising
Deep Broad Learning for Emotion Classification in Textual Conversations
11
Authors: Sancheng Peng, Rong Zeng, Hongzhan Liu, Lihong Cao, Guojun Wang, Jianguo Xie. Tsinghua Science and Technology (SCIE, EI, CAS, CSCD), 2024, No. 2, pp. 481-491 (11 pages)
Emotion classification in textual conversations focuses on classifying the emotion of each utterance in a textual conversation. It has become one of the most important tasks in natural language processing in recent years. However, it is a challenging task for machines to conduct emotion classification in textual conversations because emotions rely heavily on textual context. To address the challenge, we propose a method to classify emotion in textual conversations by integrating the advantages of deep learning and broad learning, namely DBL. It aims to provide a more effective solution to capture local contextual information (i.e., utterance-level) in an utterance, as well as global contextual information (i.e., speaker-level) in a conversation, based on Convolutional Neural Network (CNN), Bidirectional Long Short-Term Memory (Bi-LSTM), and broad learning. Extensive experiments have been conducted on three public textual conversation datasets, which show that context at both the utterance level and the speaker level is consistently beneficial to the performance of emotion classification. In addition, the results show that our proposed method outperforms the baseline methods on most of the testing datasets in weighted-average F1.
Keywords: emotion classification; textual conversation; Convolutional Neural network (CNN); bidirectional Long short-term memory (Bi-LSTM); broad learning
Infrasound Event Classification Fusion Model Based on Multiscale SE-CNN and BiLSTM
12
Authors: Hongru Li, Xihai Li, Xiaofeng Tan, Chao Niu, Jihao Liu, Tianyou Liu. Applied Geophysics (SCIE), 2024, No. 3, pp. 579-592, 620 (15 pages)
The classification of infrasound events has considerable importance in improving the capability to identify the types of natural disasters. Traditional infrasound classification mainly relies on machine learning algorithms after manual feature extraction. However, guaranteeing the effectiveness of the extracted features is difficult. The current trend focuses on using a convolution neural network to automatically extract features for classification. This method can extract signal spatial features automatically through a convolution kernel; however, as a time series, an infrasound signal contains not only spatial information but also temporal information. These temporal features are also crucial. If only a convolution neural network is used, the time dependence of the infrasound sequence will be missed. Using long short-term memory networks can compensate for the missing time-series features but induces a loss of the spatial feature information of the infrasound signal. A multiscale squeeze excitation–convolution neural network–bidirectional long short-term memory network infrasound event classification fusion model is proposed in this study to address these problems. This model automatically extracted temporal and spatial features, adaptively selected features, and also realized the fusion of the two types of features. Experimental results showed that the classification accuracy of the model was more than 98%, thus verifying the effectiveness and superiority of the proposed model.
Keywords: infrasound classification; channel attention; convolution neural network; bidirectional long short-term memory network; multiscale feature fusion
DeepBio:A Deep CNN and Bi-LSTM Learning for Person Identification Using Ear Biometrics
13
Authors: Anshul Mahajan, Sunil K. Singla. Computer Modeling in Engineering & Sciences (SCIE, EI), 2024, No. 11, pp. 1623-1649 (27 pages)
The identification of individuals through ear images is a prominent area of study in the biometric sector. Facial recognition systems have faced challenges during the COVID-19 pandemic due to mask-wearing, prompting the exploration of supplementary biometric measures such as ear biometrics. The research proposes a Deep Learning (DL) framework, termed DeepBio, using ear biometrics for human identification. It employs two DL models and five datasets, including IIT Delhi (IITD-I and IITD-II), annotated web images (AWI), mathematical analysis of images (AMI), and EARVN1. Data augmentation techniques such as flipping, translation, and Gaussian noise are applied to enhance model performance and mitigate overfitting. Feature extraction and human identification are conducted using a hybrid approach combining Convolutional Neural Networks (CNN) and Bidirectional Long Short-Term Memory (Bi-LSTM). The DeepBio framework achieves high recognition rates of 97.97%, 99.37%, 98.57%, 94.5%, and 96.87% on the respective datasets. Comparative analysis with existing techniques demonstrates improvements of 0.41%, 0.47%, 12%, and 9.75% on the IITD-II, AMI, AWE, and EARVN1 datasets, respectively.
Keywords: Data augmentation; convolutional neural network; bidirectional long short-term memory; deep learning; ear biometrics
Research of Clinical Named Entity Recognition Based on Bi-LSTM-CRF (Cited by: 15)
14
Authors: QIN Ying, ZENG Yingfei. Journal of Shanghai Jiaotong University (Science) (EI), 2018, No. 3, pp. 392-397 (6 pages)
Electronic Medical Records (EMR) with unstructured sentences and various conceptual expressions provide rich information for medical information extraction. However, common Named Entity Recognition (NER) methods in Natural Language Processing (NLP) are not well suited to clinical NER in EMR. This study aims at applying neural networks to clinical concept extraction. We integrate Bidirectional Long Short-Term Memory Networks (Bi-LSTM) with a Conditional Random Fields (CRF) layer to detect three types of clinical named entities. Word representations fed into the neural networks are formed by concatenating character-based word embeddings and Continuous Bag of Words (CBOW) embeddings trained on both domain and non-domain corpora. We test our NER system on the i2b2/VA open datasets and compare the performance with six related works, achieving the best NER result with an F1 value of 0.8537. We also point out a few specific problems in clinical concept extraction, which may give some hints for deeper studies.
Keywords: clinical named entity recognition; bidirectional long short-term memory networks; conditional random fields
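In a Bi-LSTM-CRF tagger of the kind this abstract describes, the recurrent network supplies a score per tag for every token and the CRF layer adds tag-to-tag transition scores; the best tag sequence is then usually found with Viterbi decoding. The sketch below shows that decoding step only. The three-tag BIO scheme, all scores, and the function name are hypothetical and far simpler than the three entity types used in the paper.

```python
def viterbi(emissions, transitions, tags):
    """Return the highest-scoring tag path.

    emissions: one dict per token mapping tag -> score (e.g. Bi-LSTM output).
    transitions: dict mapping (previous_tag, current_tag) -> score;
                 missing pairs count as 0.0.
    """
    best = [{t: emissions[0][t] for t in tags}]
    back = []
    for em in emissions[1:]:
        scores, pointers = {}, {}
        for curr in tags:
            prev_tag = max(
                tags,
                key=lambda p: best[-1][p] + transitions.get((p, curr), 0.0),
            )
            scores[curr] = (best[-1][prev_tag]
                            + transitions.get((prev_tag, curr), 0.0)
                            + em[curr])
            pointers[curr] = prev_tag
        best.append(scores)
        back.append(pointers)

    # Trace the best final tag back to the first token.
    path = [max(tags, key=lambda t: best[-1][t])]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


TAGS = ["O", "B", "I"]
# An "I" tag may not follow "O" in BIO tagging, so that move is penalized.
TRANSITIONS = {("O", "I"): -10.0}

# Hypothetical per-token scores for a two-token phrase.
emissions = [
    {"O": 1.0, "B": 0.8, "I": 0.0},
    {"O": 0.0, "B": 0.0, "I": 1.0},
]
decoded = viterbi(emissions, TRANSITIONS, TAGS)
```

Picking the best tag per token independently would give the invalid sequence O, I here; the transition penalty makes the decoder prefer B, I instead, which is the practical benefit of the CRF layer.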
A Character Flow Framework for Multi-Oriented Scene Text Detection (Cited by: 1)
15
Authors: Wen-Jun Yang, Bei-Ji Zou, Kai-Wen Li, Shu Liu. Journal of Computer Science & Technology (SCIE, EI, CSCD), 2021, No. 3, pp. 465-477 (13 pages)
Scene text detection plays a significant role in various applications, such as object recognition, document management, and visual navigation. The instance segmentation based method has been widely used in existing research due to its advantages in dealing with multi-oriented texts. However, a large number of non-text pixels exist in the labels during model training, leading to text mis-segmentation. In this paper, we propose a novel multi-oriented scene text detection framework, which includes two main modules: character instance segmentation (one instance corresponds to one character) and character flow construction (one character flow corresponds to one word). We use a feature pyramid network (FPN) to predict character and non-character instances with arbitrary directions. A joint network of FPN and bidirectional long short-term memory (BLSTM) is developed to explore the context information among isolated characters, which are finally grouped into character flows. Extensive experiments are conducted on the ICDAR2013, ICDAR2015, MSRA-TD500, and MLT datasets to demonstrate the effectiveness of our approach. The F-measures are 92.62%, 88.02%, 83.69%, and 77.81%, respectively.
Keywords: multi-oriented scene text detection; character instance segmentation; character flow; feature pyramid network (FPN); bidirectional long short-term memory (BLSTM)