Journal Articles
19 articles found
1. Enhanced Attention-Based Encoder-Decoder Framework for Text Recognition (Cited by: 2)
Authors: S. Prabu, K. Joseph Abraham Sundar. Intelligent Automation & Soft Computing (SCIE), 2023, Issue 2, pp. 2071-2086 (16 pages)
Recognizing irregular text in natural images is a challenging task in computer vision. The existing approaches still face difficulties in recognizing irregular text because of its diverse shapes. In this paper, we propose a simple yet powerful irregular text recognition framework based on an encoder-decoder architecture. The proposed framework is divided into four main modules. Firstly, in the image transformation module, a Thin Plate Spline (TPS) transformation is employed to transform the irregular text image into a readable text image. Secondly, we propose a novel Spatial Attention Module (SAM) to compel the model to concentrate on text regions and obtain enriched feature maps. Thirdly, a deep bi-directional long short-term memory (Bi-LSTM) network is used to make a contextual feature map out of a visual feature map generated from a Convolutional Neural Network (CNN). Finally, we propose a Dual Step Attention Mechanism (DSAM) integrated with the Connectionist Temporal Classification (CTC)-Attention decoder to re-weight visual features and focus on intra-sequence relationships to generate a more accurate character sequence. The effectiveness of our proposed framework is verified through extensive experiments on various benchmark datasets, such as SVT, ICDAR, CUTE80, and IIIT5k. The performance of the proposed text recognition framework is analyzed with the accuracy metric. The results demonstrate that our proposed method outperforms the existing approaches on both regular and irregular text. Additionally, the robustness of our approach is evaluated on grocery datasets, such as GroZi-120, WebMarket, SKU-110K, and Freiburg Groceries, which contain complex text images. Even on these grocery datasets, our framework produces superior performance.
Keywords: deep learning, text recognition, text normalization, attention mechanism, convolutional neural network (CNN)
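As an illustration of the four-stage pipeline described in this abstract, a minimal PyTorch skeleton is sketched below. The SAM and DSAM internals are not specified in the abstract, so a simple sigmoid-gated convolution stands in for the attention module, TPS rectification is assumed to have been applied upstream, and all layer sizes are illustrative assumptions rather than values from the paper.

```python
# Minimal sketch: CNN backbone + spatial-attention placeholder + Bi-LSTM encoder,
# producing per-timestep logits for a CTC/attention decoder.
import torch
import torch.nn as nn

class SimpleSpatialAttention(nn.Module):
    """Placeholder SAM: a 1x1 conv + sigmoid gate over the feature map."""
    def __init__(self, channels):
        super().__init__()
        self.gate = nn.Conv2d(channels, 1, kernel_size=1)

    def forward(self, feats):
        attn = torch.sigmoid(self.gate(feats))      # (N, 1, H, W)
        return feats * attn                         # re-weight text regions

class RecognizerSketch(nn.Module):
    def __init__(self, num_classes, hidden=256):
        super().__init__()
        # Visual feature extractor (stand-in for the paper's CNN backbone).
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.sam = SimpleSpatialAttention(128)
        # Bi-LSTM over the width dimension for contextual features.
        self.rnn = nn.LSTM(128, hidden, num_layers=2,
                           bidirectional=True, batch_first=True)
        # Per-timestep classifier feeding a CTC/attention decoder.
        self.fc = nn.Linear(2 * hidden, num_classes)

    def forward(self, x):                           # x: (N, 1, 32, W) rectified image
        feats = self.sam(self.cnn(x))               # (N, C, H', W')
        feats = feats.mean(dim=2).permute(0, 2, 1)  # collapse height -> (N, W', C)
        seq, _ = self.rnn(feats)
        return self.fc(seq)                         # (N, W', num_classes) logits

logits = RecognizerSketch(num_classes=37)(torch.randn(2, 1, 32, 100))
print(logits.shape)                                 # torch.Size([2, 25, 37])
```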
2. Improving CNN-BGRU Hybrid Network for Arabic Handwritten Text Recognition (Cited by: 1)
Authors: Sofiene Haboubi, Tawfik Guesmi, Badr M Alshammari, Khalid Alqunun, Ahmed S Alshammari, Haitham Alsaif, Hamid Amiri. Computers, Materials & Continua (SCIE, EI), 2022, Issue 12, pp. 5385-5397 (13 pages)
Handwriting recognition is a challenge that interests many researchers around the world. Handwritten Arabic script, in particular, still poses many challenges that remain to be overcome, given its complex form, its more than 100 letter shapes, and its cursive nature. Over the past few years, good results have been obtained, but at a high cost in memory and execution time. In this paper we propose to improve the capacity of the bidirectional gated recurrent unit (BGRU) to recognize Arabic text. The advantage of using BGRUs is their execution time compared to other methods that can achieve a high success rate but are expensive in terms of time and memory. To test the recognition capacity of the BGRU, the proposed architecture is composed of 6 convolutional neural network (CNN) blocks for feature extraction and 1 BGRU plus 2 dense layers for learning and testing. The experiment is carried out on the entire Institut für Nachrichtentechnik / École Nationale d'Ingénieurs de Tunis (IFN/ENIT) database without any preprocessing or data selection. The obtained results show the ability of BGRUs to recognize handwritten Arabic script.
Keywords: Arabic handwritten script, handwritten text recognition, deep learning, IFN/ENIT, bidirectional GRU neural network
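A minimal PyTorch sketch of the architecture outlined above (six convolutional blocks for feature extraction, one BGRU layer, two dense layers) follows. Channel widths, pooling sizes, and the input resolution are illustrative assumptions, not values taken from the paper.

```python
# Sketch of a CNN + bidirectional GRU (BGRU) recognizer: 6 conv blocks,
# 1 BGRU, 2 dense layers producing per-timestep class scores.
import torch
import torch.nn as nn

def conv_block(c_in, c_out, pool):
    layers = [nn.Conv2d(c_in, c_out, 3, padding=1),
              nn.BatchNorm2d(c_out), nn.ReLU()]
    if pool:
        layers.append(nn.MaxPool2d(2))              # halve height and width
    return nn.Sequential(*layers)

class CnnBgruSketch(nn.Module):
    def __init__(self, num_classes, hidden=128):
        super().__init__()
        chans = [1, 32, 64, 64, 128, 128, 256]
        self.blocks = nn.Sequential(*[
            conv_block(chans[i], chans[i + 1], pool=(i < 3)) for i in range(6)
        ])                                           # 6 CNN blocks
        self.bgru = nn.GRU(256 * 8, hidden, bidirectional=True, batch_first=True)
        self.dense = nn.Sequential(                  # 2 dense layers
            nn.Linear(2 * hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, num_classes),
        )

    def forward(self, x):                            # x: (N, 1, 64, W) line image
        f = self.blocks(x)                           # (N, 256, 8, W/8)
        f = f.flatten(1, 2).permute(0, 2, 1)         # (N, W/8, 256*8) time-major
        seq, _ = self.bgru(f)
        return self.dense(seq)                       # per-timestep class scores

out = CnnBgruSketch(num_classes=120)(torch.randn(2, 1, 64, 256))
print(out.shape)                                     # torch.Size([2, 32, 120])
```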
3. An Efficient Text Recognition System from Complex Color Image for Helping the Visually Impaired Persons
Authors: Ahmed Ben Atitallah, Mohamed Amin Ben Atitallah, Yahia Said, Mohammed Albekairi, Anis Boudabous, Turki M. Alanazi, Khaled Kaaniche, Mohamed Atri. Computer Systems Science & Engineering (SCIE, EI), 2023, Issue 7, pp. 701-717 (17 pages)
The challenge faced by visually impaired persons in their day-to-day lives is to interpret text from documents. In this context, to help these people, the objective of this work is to develop an efficient text recognition system that allows the isolation, extraction, and recognition of text in documents with a textured background, degraded colors, and poor quality, and to synthesize it into speech. This system consists of three main algorithms: a text localization and detection algorithm based on the mathematical morphology method (MMM); a text extraction algorithm based on the gamma correction method (GCM); and an optical character recognition (OCR) algorithm for text recognition. A detailed complexity study of the different blocks of this text recognition system has been carried out. Following this study, an accelerated version of the GCM algorithm (AGCM) is proposed. The AGCM algorithm reduces the complexity of the text recognition system by 70% while keeping the same text recognition quality as the original method. To assist visually impaired persons, a graphical interface for the entire text recognition chain has been developed, allowing the capture of images from a camera, rapid and intuitive visualization of the recognized text from this image, and text-to-speech synthesis. Our text recognition system provides an improvement of 6.8% in recognition rate and 7.6% in F-measure relative to the GCM and AGCM algorithms.
Keywords: text recognition system, GCM, AGCM, OCR, color images, graphical interface
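For readers unfamiliar with the gamma correction operation at the heart of the GCM extraction step, a small OpenCV sketch is given below. The gamma value, the lookup-table implementation, and the input file name are generic illustrations; the paper's accelerated variant (AGCM) is not reproduced here.

```python
# Gamma correction via a lookup table, followed by Otsu thresholding to pull
# out dark text from a brightened background.
import cv2
import numpy as np

def gamma_correct(gray, gamma=0.6):
    """Apply out = 255 * (in/255)^gamma using a LUT (fast on 8-bit images)."""
    table = np.array([255.0 * (i / 255.0) ** gamma for i in range(256)],
                     dtype=np.uint8)
    return cv2.LUT(gray, table)

img = cv2.imread("document.png", cv2.IMREAD_GRAYSCALE)   # hypothetical input file
corrected = gamma_correct(img, gamma=0.6)                 # boost mid-tones
_, text_mask = cv2.threshold(corrected, 0, 255,
                             cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
cv2.imwrite("text_mask.png", text_mask)
```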
4. An Efficient Hybrid Model for Arabic Text Recognition
Authors: Hicham Lamtougui, Hicham El Moubtahij, Hassan Fouadi, Khalid Satori. Computers, Materials & Continua (SCIE, EI), 2023, Issue 2, pp. 2871-2888 (18 pages)
In recent years, deep learning models have become indispensable in several fields such as computer vision, automatic object recognition, and natural language processing. The implementation of a robust and efficient handwritten text recognition system remains a challenge for the research community in this field, especially for the Arabic language, which, compared to other languages, has a dearth of published works. In this work, we present an efficient new system for offline Arabic handwritten text recognition. Our approach is based on the combination of a Convolutional Neural Network (CNN) and a Bidirectional Long Short-Term Memory (BLSTM) network followed by a Connectionist Temporal Classification (CTC) layer. Moreover, during the training phase of the model, we introduce a data augmentation algorithm to increase the quality of the data. Our proposed approach can recognize Arabic handwritten texts without the need to segment the characters, thus overcoming several problems related to this step. To train and evaluate our approach, we used two Arabic handwritten text recognition databases, IFN/ENIT and KHATT. The experimental results show that, compared to other methods in the literature, our new approach gives better results.
Keywords: deep learning, Arabic handwritten text recognition, convolutional neural network (CNN), bidirectional long short-term memory (BLSTM), connectionist temporal classification (CTC)
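The CTC layer is what lets the CNN-BLSTM model above be trained without character-level segmentation. A brief PyTorch sketch of how such a layer is attached to the recurrent outputs is shown below; the shapes, alphabet size, and label length are illustrative assumptions.

```python
# CTC loss on top of per-timestep log-probabilities. nn.CTCLoss expects
# log-probs shaped (T, N, C), with class 0 reserved for the blank symbol.
import torch
import torch.nn as nn

T, N, C = 40, 4, 61                    # timesteps, batch size, alphabet size + blank
log_probs = torch.randn(T, N, C, requires_grad=True).log_softmax(dim=2)  # stand-in for BLSTM output
targets = torch.randint(1, C, (N, 12), dtype=torch.long)                 # label sequences
input_lengths = torch.full((N,), T, dtype=torch.long)
target_lengths = torch.full((N,), 12, dtype=torch.long)

ctc = nn.CTCLoss(blank=0, zero_infinity=True)
loss = ctc(log_probs, targets, input_lengths, target_lengths)
loss.backward()                        # trains the CNN-BLSTM without character segmentation
print(loss.item())
```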
5. Menu Text Recognition of Few-shot Learning
Authors: Xiaoyu Tian, Zhenzhen, Xin Zihao, Liu Suolan, Chen Fuhua, Wang Hongyuan. Journal of New Media, 2022, Issue 3, pp. 137-143 (7 pages)
Recent advances in OCR show that end-to-end (E2E) training pipelines that include detection and identification can achieve the best results. However, many existing methods usually focus on case-insensitive English characters. In this paper, we apply an E2E approach, the multiplex multilingual mask TextSpotter, which performs script recognition at the word level and uses different recognition heads to process different scripts while maintaining a uniform loss, thus optimizing script recognition and the multiple recognition heads simultaneously. Experiments show that this method is superior to a single-head model with a similar number of parameters in end-to-end identification tasks.
Keywords: text recognition, script identification, few-shot learning, multiple languages
6. Text Recognition of Barcode Images under Harsh Lighting Conditions (Cited by: 1)
Authors: WU Xing, GE Yuxi, ZHANG Qingfeng, CHEN Liming. Wuhan University Journal of Natural Sciences (CAS, CSCD), 2020, Issue 6, pp. 531-537 (7 pages)
The inventory counting of silver ingots plays a key role in silver futures. However, manual inventory counting is time-consuming and labor-intensive. Furthermore, the silver ingots are stored in warehouses with harsh lighting conditions, which makes automatic inventory counting difficult. To meet this challenge, we propose an automatic inventory counting method integrating object detection and text recognition under harsh lighting conditions. With the help of our own dataset, the barcode on each silver ingot is detected and cropped by a feature pyramid network (FPN). The cropped image is normalized and corrected for text recognition. We use PSENet + CRNN (Progressive Scale Expansion Network, Convolutional Recurrent Neural Network) for text detection and recognition to obtain the serial number of the silver ingot image. Experimental results show that the proposed automatic inventory counting method achieves good results, since the accuracy of the proposed object detection and text recognition under harsh lighting conditions is near 99%.
Keywords: barcode, object detection, text recognition, deep learning
7. CNN and Fuzzy Rules Based Text Detection and Recognition from Natural Scenes
Authors: T. Mithila, R. Arunprakash, A. Ramachandran. Computer Systems Science & Engineering (SCIE, EI), 2022, Issue 9, pp. 1165-1179 (15 pages)
In today's real world, an important research area in image processing is scene text detection and recognition. Scene text can appear in different languages, fonts, sizes, colours, orientations and structures. Moreover, the aspect ratios and layouts of scene text may differ significantly. All these variations pose significant challenges for the detection and recognition algorithms designed for text in natural scenes. In this paper, a new intelligent text detection and recognition method is proposed for detecting text in natural scenes and recognizing it by applying the newly proposed Conditional Random Field-based fuzzy-rules-incorporated Convolutional Neural Network (CR-CNN). Moreover, we recommend a new text detection method for detecting the exact text from the input natural scene images. To enhance the edge detection process, image pre-processing steps such as edge detection and color modeling have been applied in this work. In addition, we generate new fuzzy rules for making effective decisions in the text detection and recognition processes. The experiments were conducted using standard benchmark datasets such as ICDAR 2003, ICDAR 2011, ICDAR 2005 and SVT, and achieved better detection accuracy in text detection and recognition. Using these datasets, five different experiments were conducted to evaluate the proposed model. We also compared the proposed system with other classifiers such as the SVM, the MLP and the CNN. In these comparisons, the proposed model achieved better classification accuracy than the other existing works.
Keywords: CRF rules, text detection, text recognition, natural scene images, CR-CNN
8. Embedded System Based Raspberry Pi 4 for Text Detection and Recognition
Authors: Turki M. Alanazi. Intelligent Automation & Soft Computing (SCIE), 2023, Issue 6, pp. 3343-3354 (12 pages)
Detecting and recognizing text from natural scene images presents a challenge because the image quality depends on the conditions in which the image is captured, such as viewing angle, blurring, sensor noise, etc. In this paper, however, a prototype for text detection and recognition from natural scene images is proposed. This prototype is based on the Raspberry Pi 4 and a Universal Serial Bus (USB) camera and embeds our text detection and recognition model, which was developed using the Python language. Our model is based on the deep learning Efficient and Accurate Scene Text Detector (EAST) model for text localization and detection and on Tesseract-OCR, which is used as an Optical Character Recognition (OCR) engine for text recognition. Our prototype is controlled by the Virtual Network Computing (VNC) tool through a computer via a wireless connection. The experimental results show that the recognition rate for images captured through the camera by our prototype can reach 99.75% with low computational complexity. Furthermore, our prototype outperforms the Tesseract software in terms of recognition rate. Besides, it provides the same recognition rate as the EasyOCR software on the Raspberry Pi 4 board while decreasing the execution time by an average of 89%.
Keywords: text detection, text recognition, OCR engine, natural scene images, Raspberry Pi, USB camera
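To make the capture-and-recognize glue of such a prototype concrete, a minimal Python sketch is given below. It only illustrates USB-camera capture plus Tesseract-OCR; the EAST detection stage described in the abstract is omitted for brevity, and the preprocessing shown is a generic assumption rather than the paper's exact pipeline.

```python
# Grab one frame from a USB camera and run Tesseract-OCR on it.
import cv2
import pytesseract

cap = cv2.VideoCapture(0)              # USB camera (device 0) on the Raspberry Pi
ok, frame = cap.read()
cap.release()

if ok:
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    gray = cv2.threshold(gray, 0, 255,
                         cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]  # simple binarization
    text = pytesseract.image_to_string(gray)   # Tesseract-OCR engine
    print(text)
```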
9. An improved CRNN for Vietnamese Identity Card Information Recognition (Cited by: 2)
Authors: Trinh Tan Dat, Le Tran Anh Dang, Nguyen Nhat Truong Pham, Cung Le Thien Vu, Vu Ngoc Thanh Sang, Pham Thi Vuong, Pham The Bao. Computer Systems Science & Engineering (SCIE, EI), 2022, Issue 2, pp. 539-555 (17 pages)
This paper proposes an enhancement of an automatic text recognition system for extracting information from the front side of the Vietnamese citizen identity (CID) card. First, we apply Mask-RCNN to segment and align the CID card from the background. Next, we present two approaches to detecting the CID card's text lines, using traditional image processing techniques compared to the EAST detector. Finally, we introduce a new end-to-end Convolutional Recurrent Neural Network (CRNN) model based on a combination of Connectionist Temporal Classification (CTC) and an attention mechanism for Vietnamese text recognition, jointly training the CTC and attention objective functions together. The length of the CTC's output label sequence is applied to the attention-based decoder prediction to make the final label sequence. This process helps to decrease irregular alignments and speeds up label sequence estimation during training and inference, instead of relying only on a data-driven attention-based encoder-decoder to estimate the label sequence in long sentences. The proposed model may be learned directly from a sequence of words without detailed annotations. We evaluate the proposed system using a real collected Vietnamese CID card dataset and find that our method achieves a 4.28% WER and outperforms common techniques.
Keywords: Vietnamese text recognition, OCR, CRNN, BLSTM, attention mechanism, joint CTC-Attention
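A hedged sketch of the joint CTC-attention training objective described above: the total loss is a weighted sum of the CTC loss on the shared encoder outputs and the cross-entropy loss of the attention decoder. The weight lambda and all shapes are illustrative assumptions; the paper's exact formulation may differ.

```python
# Joint CTC + attention objective: loss = lam * L_ctc + (1 - lam) * L_attention.
import torch
import torch.nn as nn

T, N, C, S = 50, 4, 96, 20             # encoder steps, batch, classes, label length
lam = 0.2                              # CTC weight (assumed hyper-parameter)

enc_logits = torch.randn(T, N, C, requires_grad=True)   # shared encoder output
dec_logits = torch.randn(N, S, C, requires_grad=True)   # attention decoder output
labels = torch.randint(1, C, (N, S), dtype=torch.long)

ctc_loss = nn.CTCLoss(blank=0, zero_infinity=True)(
    enc_logits.log_softmax(2), labels,
    torch.full((N,), T, dtype=torch.long),
    torch.full((N,), S, dtype=torch.long))
att_loss = nn.CrossEntropyLoss()(dec_logits.reshape(N * S, C),
                                 labels.reshape(N * S))

loss = lam * ctc_loss + (1 - lam) * att_loss   # joint objective
loss.backward()
print(loss.item())
```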
10. An Attention-Based Recognizer for Scene Text (Cited by: 1)
Authors: Yugang Li, Haibo Sun. Journal on Artificial Intelligence, 2020, Issue 2, pp. 103-112 (10 pages)
Scene text recognition (STR) is the task of recognizing character sequences in natural scenes. Although STR methods have been greatly developed, existing methods still cannot recognize text of arbitrary shape, such as the highly curved or rotated text common in daily life. Irregular scene text has a complex layout in two-dimensional space. Recently, some recognizers rectify irregular text into regular text images with an approximately 1D layout, or convert 2D image feature maps into one-dimensional feature sequences. Although these methods have achieved good performance, their robustness and accuracy are limited due to the loss of spatial information in the two-dimensional to one-dimensional transformation. In this paper, we propose a framework that directly converts irregular text with a two-dimensional layout into a character sequence, using a relationship attention module to capture the correlations of feature maps. Through extensive experiments on multiple common benchmarks, our method effectively recognizes both regular and irregular scene text and is superior to previous methods in accuracy.
Keywords: scene text recognition, irregular text, attention
11. Recognition of Urdu Handwritten Alphabet Using Convolutional Neural Network (CNN)
Authors: Gulzar Ahmed, Tahir Alyas, Muhammad Waseem Iqbal, Muhammad Usman Ashraf, Ahmed Mohammed Alghamdi, Adel A. Bahaddad, Khalid Ali Almarhabi. Computers, Materials & Continua (SCIE, EI), 2022, Issue 11, pp. 2967-2984 (18 pages)
Handwritten character recognition systems are used in every field of life nowadays, including shopping malls, banks, educational institutes, etc. Urdu is the national language of Pakistan and the fourth most spoken language in the world. However, it is still challenging to recognize Urdu handwritten characters owing to their cursive nature. Our paper presents a Convolutional Neural Network (CNN) model for Urdu handwritten alphabet recognition (UHAR) of both offline and online characters. Our research contributes an Urdu handwritten dataset (UHDS) to empower future work in this field. For offline systems, optical readers are used for extracting the alphabets, while diagonal-based extraction methods are implemented in online systems. Moreover, our research tackles the lack of comprehensive and standard Urdu alphabet datasets that hampers research activities in the area of Urdu text recognition. To this end, we collected 1000 handwritten samples for each alphabet, and a total of 38000 samples from participants aged 12 to 25, to train our CNN model using online and offline mediums. Subsequently, we carried out detailed character recognition experiments, as detailed in the results. The proposed CNN model outperformed previously published approaches.
Keywords: Urdu handwritten text recognition, handwritten dataset, convolutional neural network, artificial intelligence, machine learning, deep learning
12. An Auto-Grading Oriented Approach for Off-Line Handwritten Organic Cyclic Compound Structure Formulas Recognition
Authors: Ting Zhang, Yifei Wang, Xinxin Jin, Zhiwen Gu, Xiaoliang Zhang, Bin He. Computer Modeling in Engineering & Sciences (SCIE, EI), 2023, Issue 6, pp. 2267-2285 (19 pages)
Auto-grading, as an instruction tool, could reduce teachers' workload, provide students with instant feedback, and support highly personalized learning. This topic has therefore attracted considerable attention from researchers recently. To realize the automatic grading of handwritten chemistry assignments, the problem of chemical notation recognition should be solved first. Recent handwritten chemical notation recognition solutions belonging to the end-to-end trainable category suffer from the lack of accurate alignment information between the input and output. They serve the aim of reading notations into electronic devices to better prepare relevant e-documents rather than auto-grading handwritten assignments. To tackle this limitation and enable the auto-grading of handwritten chemistry assignments at a fine-grained level, we propose a component-detection-based approach for recognizing off-line handwritten Organic Cyclic Compound Structure Formulas (OCCSFs). Specifically, we define different components of OCCSFs as objects (including graphical objects and text objects) and adopt a deep learning detector to detect them. Then, for the detected text objects, we introduce an improved attention-based encoder-decoder model for text recognition. Finally, with these detection results and the geometric relationships of the detected objects, this article designs a holistic algorithm for interpreting the spatial structure of handwritten OCCSFs. The proposed method is evaluated on a self-collected dataset consisting of 3000 samples and achieves promising results.
Keywords: handwritten chemical structure formulas, structure interpretation, components detection, text recognition
13. Recognition of Handwritten Words from Digital Writing Pad Using MMU-SNet
Authors: V. Jayanthi, S. Thenmalar. Intelligent Automation & Soft Computing (SCIE), 2023, Issue 6, pp. 3551-3564 (14 pages)
In this paper, a Modified Multi-scale Segmentation Network (MMU-SNet) method is proposed for Tamil text recognition. Handwritten texts from digital writing pad notes are used for text recognition. Recognizing handwritten words written on a digital writing pad and converting them to text files is challenging due to stylus pressure, writing on frictionless glass surfaces, limited skill in short writing, and variations in alphabet size, style, carved symbols, and orientation angle. Stylus pressure on the pad changes the words of the Tamil alphabet because Tamil letters have a small number of lines, angles, curves, and bends. A small change in dots, curves, and bends in the Tamil alphabet leads to recognition errors and changes the meaning of the words because of wrong alphabet conversion. Handwritten English word recognition and text file conversion from a digital writing pad have been performed through various algorithms such as the Support Vector Machine (SVM), the Kohonen Neural Network (KNN), and the Convolutional Neural Network (CNN) for offline and online alphabet recognition. The proposed algorithm is compared with the above algorithms for Tamil word recognition. The proposed MMU-SNet method achieves good accuracy in predicting text, about 96.8%, compared to other traditional CNN algorithms.
Keywords: digital handwritten writing pad, Tamil text recognition, syllable, dialect
14. Study on the de-watermark algorithm based on grayscale text
Authors: Huang Guoquan, Chen Zhipeng, Sun Xiaocui. High Technology Letters (EI, CAS), 2021, Issue 1, pp. 95-102 (8 pages)
When applying currently popular text recognition algorithms such as optical character recognition (OCR) to text images, the presence of watermarks interferes with recognition by blurring the fonts, which is not conducive to improving the recognition rate. To achieve fast and accurate recognition, watermark removal has become a critical problem to be solved. This work studies a de-watermarking algorithm based on a set of morphological operations and classic image-processing algorithms. It can not only remove the watermark in a short time, but also preserve the shape and clarity of the text in the image. The algorithm also satisfies the expectation that the higher the clarity of the image and text, the better the processing effect. It handles Chinese characters with complex structures, complicated radicals, and other characters well. In addition, the algorithm can process ordinary-sized images in about 1 s, so its efficiency is relatively high.
Keywords: de-watermark, text recognition, character recognition, optical character recognition (OCR) application
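The abstract does not spell out the exact operator set, so the OpenCV sketch below illustrates one classic morphology-based de-watermarking idea in the same spirit: estimate the smooth watermark/background with a large morphological closing, divide it out, and re-binarize so only the dark text strokes remain. The kernel size, threshold choice, and file names are illustrative assumptions.

```python
# Morphology-based background flattening followed by Otsu binarization.
import cv2

gray = cv2.imread("watermarked_text.png", cv2.IMREAD_GRAYSCALE)  # hypothetical input

kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (25, 25))
background = cv2.morphologyEx(gray, cv2.MORPH_CLOSE, kernel)   # closing removes thin dark text,
                                                               # keeping the light watermark/background
normalized = cv2.divide(gray, background, scale=255)           # flatten watermark shading
_, clean = cv2.threshold(normalized, 0, 255,
                         cv2.THRESH_BINARY + cv2.THRESH_OTSU)  # crisp black-on-white text
cv2.imwrite("dewatermarked.png", clean)
```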
15. A Method for Detecting and Recognizing Yi Character Based on Deep Learning
Authors: Haipeng Sun, Xueyan Ding, Jian Sun, Hua Yu, Jianxin Zhang. Computers, Materials & Continua (SCIE, EI), 2024, Issue 2, pp. 2721-2739 (19 pages)
Aiming at the challenges associated with the absence of a labeled dataset for Yi characters and the complexity of Yi character detection and recognition, we present a deep learning-based approach for Yi character detection and recognition. In the detection stage, an improved Differentiable Binarization Network (DBNet) framework is introduced to detect Yi characters, in which Omni-dimensional Dynamic Convolution (ODConv) is combined with the ResNet-18 feature extraction module to obtain multi-dimensional complementary features, thereby improving the accuracy of Yi character detection. Then, the feature pyramid network fusion module is used to further extract Yi character image features, improving target recognition at different scales. Further, the previously generated feature map is passed through a head network to produce two maps: a probability map and an adaptive threshold map of the same size as the original map. These maps are then subjected to a differentiable binarization process, resulting in an approximate binarization map. This map helps to identify the boundaries of the text boxes. Finally, the text detection box is generated after the post-processing stage. In the recognition stage, an improved lightweight MobileNetV3 framework is used to recognize the detected character regions, where the original Squeeze-and-Excitation (SE) block is replaced by the efficient Shuffle Attention (SA) block that integrates spatial and channel attention, improving the accuracy of Yi character recognition. Meanwhile, the use of depthwise separable convolution and the inverted residual structure reduces the number of parameters and the computation of the model, so that the model can better understand contextual information and improve text recognition accuracy. The experimental results illustrate that the proposed method achieves good results in detecting and recognizing Yi characters, with detection and recognition accuracy rates of 97.5% and 96.8%, respectively. We have also compared the detection and recognition algorithms proposed in this paper with other typical algorithms; in these comparisons, the proposed model achieves better detection and recognition results with a certain degree of reliability.
Keywords: Yi characters, text detection, text recognition, attention mechanism, deep neural network
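The differentiable binarization step mentioned in the abstract, as formulated in the original DBNet work, combines the probability map P and the learned threshold map T pixel-wise as B = 1 / (1 + exp(-k(P - T))), where the steepness factor k (typically 50) makes the step function differentiable during training. A short sketch with illustrative map sizes:

```python
# Differentiable binarization: B = sigmoid(k * (P - T)); gradients flow into both maps.
import torch

def differentiable_binarization(prob_map: torch.Tensor,
                                thresh_map: torch.Tensor,
                                k: float = 50.0) -> torch.Tensor:
    """Approximate binary map used during training; near-binary values at test time."""
    return torch.sigmoid(k * (prob_map - thresh_map))

P = torch.rand(1, 1, 160, 160, requires_grad=True)   # probability map from the head network
T = torch.rand(1, 1, 160, 160, requires_grad=True)   # adaptive threshold map
B_hat = differentiable_binarization(P, T)
print(B_hat.min().item(), B_hat.max().item())         # values squashed toward 0 or 1
```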
16. Detection and Recognition of Spray Code Numbers on Can Surfaces Based on OCR
Authors: Hailong Wang, Junchao Shi. Computers, Materials & Continua (SCIE, EI), 2025, Issue 1, pp. 1109-1128 (20 pages)
A two-stage deep learning algorithm for the detection and recognition of can-bottom spray code numbers is proposed to address the problems of small character areas and fast production line speeds in can-bottom spray code number recognition. In the code number detection stage, a Differentiable Binarization Network is used as the backbone network, combined with an Attention and Dilation Convolutions Path Aggregation Network feature fusion structure to enhance detection performance. For text recognition, the Scene Visual Text Recognition code number recognition network is trained end to end, which alleviates code recognition errors caused by image color distortion due to variations in lighting and background noise. In addition, model pruning and quantization are used to reduce the number of model parameters to meet deployment requirements in resource-constrained environments. A comparative experiment was conducted using a dataset of can-bottom spray code numbers collected on-site, and a transfer experiment was conducted using a dataset of packaging box production dates. The experimental results show that the proposed algorithm can effectively locate the codes of cans at different positions on the roller conveyor and can accurately identify the code numbers at high production line speeds. The Hmean value of code number detection is 97.32%, and the accuracy of code number recognition is 98.21%. This verifies that the algorithm proposed in this paper has high accuracy in code number detection and recognition.
Keywords: can coding recognition, differentiable binarization network, scene visual text recognition, model pruning and quantification, transport model
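As a small illustration of the kind of post-training quantization used to shrink a recognition model for resource-constrained deployment, the sketch below applies dynamic INT8 quantization to the Linear layers of a stand-in head. This is one common option; the paper's specific pruning and quantization scheme is not detailed in the abstract.

```python
# Post-training dynamic quantization of Linear layers to int8.
import torch
import torch.nn as nn

model = nn.Sequential(                  # stand-in recognition head
    nn.Linear(512, 256), nn.ReLU(),
    nn.Linear(256, 96),
)
quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8)

x = torch.randn(1, 512)
print(quantized(x).shape)               # same interface, smaller int8 weights
```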
17. Visual News Ticker Surveillance Approach from Arabic Broadcast Streams
Authors: Moeen Tayyab, Ayyaz Hussain, Usama Mir, M. Aqeel Iqbal, Muhammad Haneef. Computers, Materials & Continua (SCIE, EI), 2023, Issue 3, pp. 6177-6193 (17 pages)
The news ticker is a common feature of many different news networks, displaying headlines and other information. News ticker recognition applications are highly valuable in e-business and in news surveillance for media regulatory authorities. In this paper, we focus on an automatic Arabic ticker recognition system for the Al-Ekhbariya news channel. The primary emphasis of this research is on ticker recognition methods and storage schemes. To that end, the research is aimed at character-wise explicit segmentation using a semantic segmentation technique and a word identification method. The proposed learning architecture considers the grouping of homogeneous-shaped classes. This incorporates linguistic taxonomy in a unified manner to address the imbalance in data distribution that leads to individual biases. Furthermore, experiments are conducted with a novel Arabic News Ticker (Al-ENT) dataset that provides accurate character-level and character-component-level labeling to evaluate the effectiveness of the suggested approach. The proposed method attains 96.5% accuracy, outperforming the current state-of-the-art technique by 8.5%. The study reveals that our strategy improves the performance of low-representation correlated character classes.
Keywords: Arabic text recognition, optical character recognition, deep convolutional network, SegNet, LeNet
18. Cyclic Autoencoder for Multimodal Data Alignment Using Custom Datasets
Authors: Zhenyu Tang, Jin Liu, Chao Yu, Y. Ken Wang. Computer Systems Science & Engineering (SCIE, EI), 2021, Issue 10, pp. 37-54 (18 pages)
The subtitle recognition under multimodal data fusion addressed in this paper aims to recognize text lines from image and audio data. Most existing multimodal fusion methods rely on pre-fusion or post-fusion, which is not reasonable and is difficult to interpret. We believe that fusing images and audio before the decision layer, i.e., intermediate fusion, to take advantage of the complementary multimodal data, will benefit text line recognition. To this end, we propose: (i) a novel cyclic autoencoder based on a convolutional neural network, in which the feature dimensions of the two modalities are aligned under the premise of stabilizing the compressed image features, so that the high-dimensional features of different modalities are fused at a shallow level of the model; (ii) a residual attention mechanism that helps improve recognition performance: regions of interest in the image are enhanced and regions of disinterest are weakened, so we can extract the features of the text regions without further increasing the depth of the model; (iii) a fully convolutional network for video subtitle recognition, with DenseNet-121 chosen as the backbone network for feature extraction, effectively enabling the recognition of video subtitles against complex backgrounds. The experiments are performed on our custom datasets, and the automatic and manual evaluation results show that our method reaches the state of the art.
Keywords: deep learning, convolutional neural network, multimodal, text recognition
19. Scene word recognition from pieces to whole (Cited by: 1)
Authors: Anna ZHU, Seiichi UCHIDA. Frontiers of Computer Science (SCIE, EI, CSCD), 2019, Issue 2, pp. 292-301 (10 pages)
Convolutional neural networks (CNNs) have had great success with regard to the object classification problem. For character classification, we found that training and testing CNNs with accurately segmented character regions resulted in higher accuracy than when roughly segmented regions were used. Therefore, we aim to extract complete character regions from scene images. Text in natural scene images has an obvious contrast with its surroundings. Many methods attempt to extract characters through different segmentation techniques. However, for blurred, occluded, and complex-background cases, those methods may result in adjoined or over-segmented characters. In this paper, we propose a scene word recognition model that integrates words from small pieces to the whole after cluster-based segmentation. The segmented connected components are classified into four types: background, individual character proposals, adjoined characters, and stroke proposals. Individual character proposals are directly input to a CNN that is trained using accurately segmented character images. A sliding window strategy is applied to adjoined character regions. Stroke proposals are considered fragments of entire characters whose locations are estimated by a stroke spatial distribution system. Then, the estimated characters from adjoined characters and stroke proposals are classified by a CNN that is trained on roughly segmented character images. Finally, a lexicon-driven integration method is performed to obtain the final word recognition results. Compared to other word recognition methods, our method achieves comparable performance on Street View Text and the ICDAR 2003 and ICDAR 2013 benchmark databases. Moreover, our method can handle occluded text images and improperly segmented text images.
Keywords: text recognition, convolutional neural networks, cluster-based segmentation, character integration
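A brief sketch of the sliding-window strategy applied to adjoined character regions, as described above: a fixed-width window is slid across the region and each crop is scored by a character classifier. The window width, stride, class count, and the toy classifier are illustrative placeholders, not details from the paper.

```python
# Slide a fixed-width window across an adjoined-character strip and classify each crop.
import torch
import torch.nn as nn

classifier = nn.Sequential(             # stand-in for the trained CNN character classifier
    nn.Flatten(), nn.Linear(32 * 32, 62))

def slide_and_classify(region: torch.Tensor, win: int = 32, stride: int = 8):
    """region: (1, 32, W) grayscale strip of adjoined characters."""
    scores = []
    _, h, w = region.shape
    for x in range(0, w - win + 1, stride):
        crop = region[:, :, x:x + win].unsqueeze(0)      # (1, 1, 32, 32) window
        scores.append(classifier(crop).softmax(dim=1))   # per-class probability
    return torch.cat(scores)                             # (num_windows, 62)

probs = slide_and_classify(torch.rand(1, 32, 128))
print(probs.shape)                                        # torch.Size([13, 62])
```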