Abstract: Skew in input document images can be a serious problem for optical character recognition systems. A method is proposed for skew detection in binary document images using mathematical morphology. The approach consists of three steps: first, a dilation operation is applied to the binary image; second, the dilated image is thinned; finally, the skew angle is detected using the Hough transform. The proposed approach detects skew with high precision over a large angle range (−90° to 90°). Experimental results show that the method is applicable and efficient.
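The three steps above can be sketched in a few lines of standard-library Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the image is represented as a set of (x, y) foreground pixels, the structuring element is assumed to be a short horizontal line, and `thin_columns` is a crude stand-in for a real thinning algorithm that only suits small to moderate skew. All function names and the `synthetic_page` test image are hypothetical.

```python
import math
from collections import Counter

def dilate_horizontal(pixels, half_width):
    """Dilate a set of (x, y) foreground pixels with a horizontal line element."""
    return {(x + dx, y) for x, y in pixels
            for dx in range(-half_width, half_width + 1)}

def thin_columns(pixels):
    """Crude thinning stand-in (assumption): keep one middle pixel per vertical run."""
    columns = {}
    for x, y in pixels:
        columns.setdefault(x, []).append(y)
    thinned = set()
    for x, ys in columns.items():
        ys.sort()
        run = [ys[0]]
        for y in ys[1:]:
            if y == run[-1] + 1:
                run.append(y)          # still inside the same vertical run
            else:
                thinned.add((x, run[len(run) // 2]))
                run = [y]              # start a new run
        thinned.add((x, run[len(run) // 2]))
    return thinned

def hough_skew(pixels, step_deg=1.0):
    """Return the line angle in [-90, 90) degrees whose best Hough bin has most votes."""
    best_angle, best_votes = 0.0, -1
    for i in range(int(round(180.0 / step_deg))):
        angle = -90.0 + i * step_deg
        t = math.radians(angle)
        # Signed distance from the origin to the line through (x, y) at this angle,
        # quantised to one-pixel bins.
        votes = Counter(round(y * math.cos(t) - x * math.sin(t)) for x, y in pixels)
        peak = max(votes.values())
        if peak > best_votes:
            best_angle, best_votes = angle, peak
    return best_angle

def synthetic_page(skew_deg, n_lines=3, width=200):
    """Hypothetical test image: dotted 'text lines' tilted by skew_deg."""
    slope = math.tan(math.radians(skew_deg))
    return {(x, round(20 + 30 * k + slope * x))
            for k in range(n_lines) for x in range(0, width, 3)}

def estimate_skew(pixels):
    """Dilate, thin, then read the skew off the strongest Hough peak."""
    return hough_skew(thin_columns(dilate_horizontal(pixels, 2)))

if __name__ == "__main__":
    print(estimate_skew(synthetic_page(10.0)))
```

Dilation merges the isolated dots into connected strokes, and thinning reduces each stroke to a one-pixel curve so that every text line casts roughly one vote per column in the Hough accumulator. Covering the full −90° to 90° range described in the abstract would need a structuring element and a thinning step that do not favour the horizontal direction.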
Abstract: In the digital world, a wide range of handwritten and printed documents must be converted to digital format using a variety of tools, including mobile phones and scanners. This conversion is imperfect, and the entire document image may be degraded. Noise, motion blur, and skew distortion can significantly reduce the accuracy and effectiveness of document image segmentation and analysis in Optical Character Recognition (OCR) systems. In Document Image Analysis Systems (DIAS), skew estimation is a crucial step. This paper proposes a novel, fast, and reliable skew detection algorithm based on the Radon Transform and a Curve Length Fitness Function (CLF), called Radon CLF. The Radon CLF model takes advantage of the properties of Radon space: it finds the dominant angle more effectively on a 1D signal than on the 2D input image, owing to a fitness function formulated for the projected signal in Radon space. Several performance indicators, including Mean Square Error (MSE), Mean Absolute Error (MAE), Peak Signal-to-Noise Ratio (PSNR), Structural Similarity Measure (SSIM), accuracy, and run-time, were used to assess the model. In addition, a new dataset named DSI5000 was constructed to assess the accuracy of the CLF model. Both the two-dimensional image signal and the Radon space were used in the simulations to compare the effect of noise. The results show that the proposed method is more effective than existing approaches, with an accuracy of roughly 99.87% and a run-time of 0.048 s. The introduced model is more accurate and more time-efficient than current approaches to image skew detection.
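The idea of scoring a 1D projection rather than the 2D image can be illustrated with a small sketch. The abstract does not give the CLF formula, so the code below assumes the fitness is simply the arc length of the projection profile treated as a polyline; that is a reading of the name "curve length", not the paper's stated definition. The projection is a discrete, Radon-style profile over a set of (x, y) foreground pixels, and all function names and the `synthetic_page` test image are hypothetical.

```python
import math

def projection_profile(pixels, angle_deg):
    """Count foreground pixels along parallel lines tilted by angle_deg."""
    t = math.radians(angle_deg)
    # Signed distance of each pixel from the origin, measured across the lines.
    bins = [round(y * math.cos(t) - x * math.sin(t)) for x, y in pixels]
    low = min(bins)
    profile = [0] * (max(bins) - low + 1)
    for b in bins:
        profile[b - low] += 1
    return profile

def curve_length(profile):
    """Assumed fitness: arc length of the profile with unit spacing between bins."""
    return sum(math.hypot(1.0, b - a) for a, b in zip(profile, profile[1:]))

def estimate_skew(pixels, angles):
    """Pick the candidate angle whose projection profile has the longest curve."""
    return max(angles, key=lambda a: curve_length(projection_profile(pixels, a)))

def synthetic_page(skew_deg, n_lines=4, width=300, thickness=3):
    """Hypothetical test image: solid 'text lines' tilted by skew_deg."""
    slope = math.tan(math.radians(skew_deg))
    return {(x, round(20 + 30 * k + slope * x) + dy)
            for k in range(n_lines)
            for x in range(width)
            for dy in range(thickness)}

if __name__ == "__main__":
    page = synthetic_page(-7.0)
    print(estimate_skew(page, range(-45, 46)))
```

When the projection direction matches the text lines, the profile alternates between tall peaks and empty gaps, so its curve is long; at other angles the peaks smear out and the curve shortens. The search runs over one 1D signal per candidate angle, which is the property the abstract attributes to working in Radon space.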
Funding: Supported by the National Natural Science Foundation of China (10377007)
Abstract: Skewed symmetry detection plays an important role in three-dimensional (3-D) reconstruction. A skewed symmetry depicts a real symmetry viewed from an unknown viewing direction, and detecting it reduces the geometric constraints and the complexity of 3-D reconstruction. The detection technique proposed by Sugimoto for the ellipse is extended to cover further quadric curves, including the hyperbola and the parabola. With this parametric detection, the projection matching of 3-D quadric curves is accomplished automatically. Finally, the skewed symmetry surface of the quadric surface solid is obtained. Several examples verify the feasibility of the algorithm and give satisfactory results.
Abstract: Writer identification (WI) of handwritten Arabic text is now of great concern to intelligence agencies following recent attacks perpetrated by known Middle East terrorist organizations. It is also a useful instrument for the digitization of old texts and their attribution to authors in historical studies, including old national and religious archives. In this study, we propose a new effective segmentation model by modifying an artificial neural network model and adapting it to a block-based binarization stage. This modified method is combined with a new effective rotation model to achieve accurate segmentation through analysis of the histogram of binary images. We also propose a new framework for correcting text rotation, which helps establish a segmentation method that facilitates the extraction of text from its background. Image projections and the Radon transform are used and improved with machine learning based on a co-occurrence matrix to produce binary images. In the training stage, a number of images are selected randomly at different angles to generate four classes (0–90, 90–180, 180–270, and 270–360 degrees). The proposed segmentation approach achieves a high accuracy of 98.18%. The study provides two major contributions, ranked in order of importance. The proposed method can be further developed as a new application for recognizing handwritten Arabic text in small documents, regardless of logical combinations and sentence construction.
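The four-class labelling described for the training stage can be illustrated as follows. This is a minimal sketch that assumes half-open intervals, so that a boundary angle such as 90 degrees falls in the second class; the abstract does not say how boundaries are handled. The names `rotation_class` and `random_labelled_angles` are hypothetical.

```python
import random

def rotation_class(angle_deg):
    """Map an angle in degrees to class 0..3 for [0,90), [90,180), [180,270), [270,360)."""
    return int((angle_deg % 360.0) // 90.0)

def random_labelled_angles(n, seed=0):
    """Draw n random rotation angles and pair each with its class label."""
    rng = random.Random(seed)
    angles = [rng.uniform(0.0, 360.0) for _ in range(n)]
    return [(a, rotation_class(a)) for a in angles]

if __name__ == "__main__":
    for angle, label in random_labelled_angles(5):
        print(f"{angle:7.2f} degrees -> class {label}")
```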
Abstract: This paper introduces a new feature for document analysis called crosscount. The crosscount feature is a function of a white line segment that starts on the edge of a document image. It reflects not only the contour of the image but also the periodicity of white lines (background) and text lines in the document image. Complex printed-page layouts contain different kinds of blocks, such as textual, graphical, and tabular blocks. Of these, textual blocks have the most obvious periodicity, with homogeneous white lines arranged regularly. This important property of textual blocks can be extracted by crosscount functions. Document layouts are first classified into three classes on the basis of their physical structures. The definition and properties of the crosscount function are then described. Finally, following this classification of document layouts, the application of the new feature to the analysis and understanding of different types of document images is discussed.