Funding: Supported in part by the Postgraduate Research & Practice Innovation Program of Jiangsu Province under Grant KYCX 20_0758, in part by the Science and Technology Research Project of Jiangsu Public Security Department under Grant 2020KX005, in part by the General Project of Philosophy and Social Science Research in Colleges and Universities in Jiangsu Province under Grant 2022SJYB0473, and in part by the "Cyberspace Security" Construction Project of Jiangsu Provincial Key Discipline during the "14th Five Year Plan".
Abstract: As an indispensable part of identity authentication, offline writer identification plays a notable role in biology, forensics, and historical document analysis. However, identifying handwriting efficiently, stably, and quickly is still challenging because of the way handwriting features are extracted and processed. In this paper, we propose an efficient system to identify writers through handwritten images, which integrates local and global features from similar handwritten images. The local features are modeled by effective aggregate processing, and global features are extracted through transfer learning. Specifically, the proposed system employs a pre-trained Residual Network to mine the relationship between large image sets and specific handwritten images, while the vector of locally aggregated descriptors (VLAD) with double power normalization is employed in aggregating local and global features. Moreover, handwritten image segmentation, preprocessing, enhancement, optimization of neural network architecture, and normalization for local and global features are exploited, significantly improving system performance. The proposed system is evaluated on the Computer Vision Lab (CVL) datasets and the International Conference on Document Analysis and Recognition (ICDAR) 2013 datasets. The results show that it generalizes well and achieves state-of-the-art performance. Furthermore, the system performs better when trained on complete handwriting patches with the normalization method. The experimental results indicate that it is important to segment handwriting reasonably when dealing with handwriting overlap, which reduces visual burstiness.
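The aggregation step named in this abstract, the vector of locally aggregated descriptors, can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function name `vlad`, the exponent `alpha = 0.5`, and the toy data are assumptions. It applies a single signed power normalization followed by L2 normalization; the abstract does not define its "double" power normalization, so that variant is not reproduced here.

```python
import math


def vlad(descriptors, centers, alpha=0.5):
    """Aggregate local descriptors into one VLAD vector.

    descriptors: list of d-dimensional local feature vectors.
    centers: list of k d-dimensional codebook centers (assumed precomputed).
    alpha: exponent of the signed power normalization (assumed value).
    """
    k, d = len(centers), len(centers[0])
    acc = [[0.0] * d for _ in range(k)]
    for x in descriptors:
        # Assign the descriptor to its nearest center (squared Euclidean distance).
        j = min(
            range(k),
            key=lambda c: sum((a - b) ** 2 for a, b in zip(x, centers[c])),
        )
        # Accumulate the residual between the descriptor and that center.
        for i in range(d):
            acc[j][i] += x[i] - centers[j][i]
    flat = [v for row in acc for v in row]
    # Signed power normalization dampens large ("bursty") components.
    flat = [math.copysign(abs(v) ** alpha, v) for v in flat]
    # L2 normalization makes vectors from different images comparable.
    norm = math.sqrt(sum(v * v for v in flat))
    return [v / norm for v in flat] if norm > 0 else flat
```

The power step is what reduces the influence of descriptors that repeat many times in one image, which is the usual motivation for pairing it with VLAD.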
Abstract: Writer identification, which identifies individuals based on their handwriting, is a frequent topic in biometric authentication and verification systems. Due to its importance, numerous studies have been conducted in various languages. Researchers have established several learning methods for writer identification, including supervised and unsupervised learning. However, supervised methods require a large amount of annotated data, which is unavailable in most scenarios. On the other hand, unsupervised writer identification methods may be limited by their dependence on feature extraction, which cannot provide proper objectives to the architecture and may be misinterpreted. This paper introduces an unsupervised writer identification system that analyzes the data and recognizes the writer based on the inter-feature relations of the data to resolve the uncertainty of the features. A pairwise architecture-based Autoembedder is applied to generate clusterable embeddings for handwritten text images. The trained baseline architecture then generates the embedding of each image, and the K-means algorithm is used to separate the embeddings of individual writers. The proposed model uses the IAM dataset for the experiment because, although the contributions from its authors are inconsistent, it is easily accessible for writer identification tasks. In addition, traditional evaluation metrics are used to assess the proposed model. Finally, the proposed model is compared with a few unsupervised models, and it outperforms state-of-the-art deep convolutional architectures in recognizing writers from unlabeled data.
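The clustering stage described in this abstract can be sketched with a plain K-means loop over embedding vectors. This is only an illustrative sketch: the embeddings would come from the trained Autoembedder, which is not shown, and the function names, the deterministic initialization from the first `k` points, and the fixed iteration count are assumptions made for brevity.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iters=20):
    """Cluster embedding vectors into k groups (one group per writer).

    points: list of embedding vectors (hypothetical Autoembedder outputs).
    Returns (labels, centers).
    """
    # Assumed initialization: use the first k points as starting centers.
    centers = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point joins its nearest center.
        labels = [
            min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points
        ]
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers
```

In practice a library implementation with randomized restarts would be preferable, since K-means is sensitive to its starting centers.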
Abstract: Writer identification (WI) based on handwritten text structures is typically focused on digital characteristics, with letters/strokes representing the information acquired from current research on the integration of individual writing habits/styles. Previous studies have indicated that a word's attributes contribute more to recognition than the attributes of a character or stroke. Because of the complexity of Arabic handwriting, segmenting and separating letters and strokes from a script poses a challenge in addition to WI schemes. In this work, we propose new texture features for WI based on text. The histogram of oriented gradients (HOG) features are modified to extract good features on the basis of the histogram of the orientation for different angles of text. The fusion of these features with the features of convolutional neural networks (CNNs) results in a powerful feature vector. Then, we reduce the features by selecting the best ones using a genetic algorithm. A normalization method is used to normalize the features, which are then fed to an artificial neural network classifier. Experimental results show that the proposed augmenter improves the results for HOG features and ResNet50, as well as for the proposed model, because the amount of data is increased. Such a large data volume helps the system to retrieve extensive information about the nature of writing patterns. The effective result of the proposed model for whole paragraphs, lines, and sub words is obtained using different models and then compared with those of the CNN and ResNet50. Whole paragraphs produce the best results in all models because they contain rich information and the model can utilize numerous features for different words. The HOG and CNN features achieve 94.2% accuracy for whole paragraphs with augmentation, 83.2% accuracy for lines, and 78% accuracy for sub words. Thus, this work provides a system that can identify writers on the basis of their handwriting and builds a powerful model that can help identify writers on the basis of their sentences, words, and sub words.
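The fusion, selection, and normalization steps in this abstract can be illustrated with a short sketch. Everything here is an assumption for illustration: the HOG and CNN vectors are taken as given, the binary `mask` stands in for one chromosome produced by the genetic algorithm (the search itself is not shown), and min-max scaling is used because the abstract does not name its normalization method.

```python
def fuse_and_select(hog_vec, cnn_vec, mask):
    """Concatenate two feature vectors, keep masked features, and rescale.

    hog_vec, cnn_vec: feature vectors from the two extractors (assumed given).
    mask: one 0/1 flag per fused feature, e.g. a genetic-algorithm chromosome.
    Returns the selected features scaled to [0, 1] (assumed min-max scaling).
    """
    fused = list(hog_vec) + list(cnn_vec)
    if len(mask) != len(fused):
        raise ValueError("mask length must match the fused feature length")
    selected = [v for v, keep in zip(fused, mask) if keep]
    if not selected:
        raise ValueError("mask selects no features")
    lo, hi = min(selected), max(selected)
    if hi == lo:
        # A constant vector carries no scale information; map it to zeros.
        return [0.0] * len(selected)
    return [(v - lo) / (hi - lo) for v in selected]
```

A genetic algorithm would evaluate many such masks, scoring each by classifier accuracy on the selected features, and keep the best-performing one.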
Abstract: The writer identification (WI) of handwritten Arabic text is now of great concern to intelligence agencies following the recent attacks perpetrated by known Middle East terrorist organizations. It is also a useful instrument for the digitalization and attribution of old text to other authors of historic studies, including old national and religious archives. In this study, we propose a new effective segmentation model by modifying an artificial neural network model and making it suitable for the block-based binarization stage. This modified method is combined with a new effective rotation model to achieve accurate segmentation through the analysis of the histogram of binary images. We also propose a new framework for correct text rotation that helps us to establish a segmentation method that can facilitate the extraction of text from its background. Image projections and the Radon transform are used and improved using machine learning based on a co-occurrence matrix to produce binary images. The training stage involves taking a number of images for model training. These images are selected randomly with different angles to generate four classes (0–90, 90–180, 180–270, and 270–360). The proposed segmentation approach achieves a high accuracy of 98.18%. The study ultimately provides two major contributions that are ranked from top to bottom according to the degree of importance. The proposed method can be further developed as a new application and used in the recognition of handwritten Arabic text from small documents regardless of logical combinations and sentence construction.
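The four rotation classes and the image projections mentioned in this abstract can be sketched as follows. This is a hedged illustration, not the authors' method: the function names are hypothetical, the class boundaries are assumed to be half-open intervals (so 90 degrees falls in the second class), and the projection is a plain row-wise count of foreground pixels in a binary image.

```python
def angle_class(angle_deg):
    """Map a rotation angle in degrees to one of four quadrant classes.

    Class 0 covers [0, 90), 1 covers [90, 180), 2 covers [180, 270),
    and 3 covers [270, 360). The half-open boundaries are an assumption.
    """
    return int((angle_deg % 360) // 90)


def horizontal_projection(binary_image):
    """Count foreground pixels (value 1) in each row of a binary image.

    binary_image: list of rows, each a list of 0/1 pixel values.
    Rows with a count of zero typically mark gaps between text lines.
    """
    return [sum(row) for row in binary_image]
```

For example, a projection profile with clear zero-valued rows suggests upright text lines, whereas a rotated page tends to smear the profile, which is one reason rotation is corrected before segmentation.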