10 articles found
Pre-training transformer with dual-branch context content module for table detection in document images
1
Authors: Yongzhi LI, Pengle ZHANG, Meng SUN, Jin HUANG, Ruhan HE. Virtual Reality & Intelligent Hardware (EI), 2024, Issue 5, pp. 408-420, 13 pages.
Background: Document images such as statistical reports and scientific journals are widely used in information technology. Accurate detection of table areas in document images is an essential prerequisite for tasks such as information extraction. However, because of the diversity in the shapes and sizes of tables, existing table detection methods adapted from general object detection algorithms have not yet achieved satisfactory results. Incorrect detection results might lead to the loss of critical information. Methods: We therefore propose a novel end-to-end trainable deep network combined with a self-supervised pretraining transformer for feature extraction to minimize incorrect detections. To better handle table areas of different shapes and sizes, we added a dual-branch context content attention module (DCCAM) to high-dimensional features to extract context content information, thereby enhancing the network's ability to learn shape features. For feature fusion at different scales, we replaced the original 3×3 convolution with a multilayer residual module, which contains enhanced gradient flow information to improve the feature representation and extraction capability. Results: We evaluated our method on public document datasets and compared it with previous methods; it achieved state-of-the-art results in terms of evaluation metrics such as recall and F1-score. https://github.com/Yong Z-Lee/TD-DCCAM.
Keywords: table detection; document image analysis; Transformer; dilated convolution; deformable convolution; feature fusion
End-to-end dilated convolution network for document image semantic segmentation (Cited by: 8)
2
Authors: XU Can-hui, SHI Cao, CHEN Yi-nong. Journal of Central South University (SCIE, EI, CAS, CSCD), 2021, Issue 6, pp. 1765-1774, 10 pages.
Semantic segmentation is a crucial step for document understanding. In this paper, an NVIDIA Jetson Nano-based platform is used to implement semantic segmentation for teaching artificial intelligence concepts and programming. To extract semantic structures from document images, we present an end-to-end dilated convolution network architecture. Dilated convolutions have well-known advantages for extracting multi-scale context information without losing spatial resolution. Our model uses dilated convolutions with a residual network to represent the image features and predict pixel labels. The convolution part works as a feature extractor to obtain multidimensional and hierarchical image features. The subsequent deconvolution produces a full-resolution segmentation prediction. The probability of each pixel decides its predefined semantic class label. To understand segmentation granularity, we compare performance at three different levels. From fine-grained to coarse class levels, the proposed dilated convolution network architecture is evaluated on three document datasets. The experimental results show that both semantic data distribution imbalance and network depth are important factors that influence semantic segmentation performance on documents. The research aims to offer an educational resource for teaching artificial intelligence concepts and techniques.
Keywords: semantic segmentation; document images; deep learning; NVIDIA Jetson Nano
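The abstract's claim that dilated convolutions gather multi-scale context without losing spatial resolution can be illustrated with a small one-dimensional sketch. This is not the authors' network: the function names `dilated_conv1d` and `receptive_field` are hypothetical, and zero padding with a "same"-size output is an assumption made for the illustration.

```python
def dilated_conv1d(signal, kernel, dilation=1):
    """'Same'-size 1D dilated convolution (cross-correlation form), zero padded."""
    k = len(kernel)
    # Distance covered by the kernel once its taps are spread `dilation` apart.
    span = (k - 1) * dilation
    pad = span // 2
    padded = [0.0] * pad + list(signal) + [0.0] * (span - pad)
    # Output has exactly one value per input sample, so resolution is preserved.
    return [
        sum(kernel[j] * padded[i + j * dilation] for j in range(k))
        for i in range(len(signal))
    ]


def receptive_field(kernel_size, dilations):
    """Receptive field of a stack of stride-1 dilated convolution layers."""
    rf = 1
    for d in dilations:
        rf += (kernel_size - 1) * d
    return rf


signal = [1, 2, 3, 4, 5]
dense = dilated_conv1d(signal, [1, 1, 1], dilation=1)
sparse = dilated_conv1d(signal, [1, 1, 1], dilation=2)
```

With the same three-tap kernel, the dilated version looks at samples two positions apart, so each output sees a wider context while the output length stays equal to the input length. Stacking layers with dilations 1, 2, and 4 widens the receptive field to 15 samples with only three 3-tap layers.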
Effect of Direct Statistical Contrast Enhancement Technique on Document Image Binarization (Cited by: 2)
3
Authors: Wan Azani Mustafa, Haniza Yazid, Ahmed Alkhayyat, Mohd Aminudin Jamlos, Hasliza A. Rahim. Computers, Materials & Continua (SCIE, EI), 2022, Issue 2, pp. 3549-3564, 16 pages.
Background: Contrast enhancement plays an important role in image processing. Contrast correction adjusts the darkness or brightness of the input image and increases its quality. Objective: This paper proposes a novel method based on statistical data from the local mean and local standard deviation. Method: The proposed method modifies the mean and standard deviation of a neighbourhood at each pixel and divides it into three categories: background, foreground, and problematic (contrast and luminosity) region. Experimental results from both visual and objective aspects show that the proposed method normalizes contrast variation effectively compared to Histogram Equalization (HE), Difference of Gaussian (DoG), and Butterworth Homomorphic Filtering (BHF). Seven binarization methods were tested on the corrected image and produced positive results. Result: Finally, a comparison in terms of Signal Noise Ratio (SNR), Misclassification Error (ME), F-measure, Peak Signal Noise Ratio (PSNR), Misclassification Penalty Metric (MPM), and Accuracy was calculated. Each binarization method shows an improved result when applied to the corrected image rather than the original image. The SNR result of our proposed image is 9.350 higher than the three other methods. The average increments after five types of evaluation are: Otsu = 41.64%, Local Adaptive = 7.05%, Niblack = 30.28%, Bernsen = 25%, Bradley = 3.54%, Nick = 1.59%, Gradient-Based = 14.6%. Conclusion: The results presented in this paper effectively solve the contrast problem and produce better quality images.
Keywords: binarization; contrast; luminosity; illumination; document image
Novel Adaptive Binarization Method for Degraded Document Images (Cited by: 1)
4
Authors: Siti Norul Huda Sheikh Abdullah, Saad M. Ismail, Mohammad Kamrul Hasan, Palaiahnakote Shivakumara. Computers, Materials & Continua (SCIE, EI), 2021, Issue 6, pp. 3815-3832, 18 pages.
Achieving a good recognition rate for degraded document images is difficult because such images suffer from low contrast, bleed-through, and nonuniform illumination effects. Unlike existing baseline thresholding techniques that use fixed thresholds and windows, the proposed method introduces a concept for obtaining dynamic windows according to the image content to achieve better binarization. To enhance a low-contrast image, we propose a new mean histogram stretching method that suppresses noisy pixels in the background and simultaneously increases pixel contrast at or near edges, resulting in an enhanced image. For the enhanced image, we propose a new method for deriving adaptive local thresholds for dynamic windows. The dynamic window is derived by exploiting the advantage of Otsu thresholding. To assess the performance of the proposed method, we used the standard document image binarization contest (DIBCO) databases for experimentation. A comparative study against well-known existing methods indicates that the proposed method outperforms them in terms of quality and recognition rate.
Keywords: global and local thresholding; adaptive binarization; degraded document image; image histogram; document image binarization contest
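The abstract builds its dynamic windows on Otsu thresholding. As background only, the following is a minimal sketch of plain global Otsu thresholding on a flat list of gray levels. It is not the paper's adaptive method; the function names `otsu_threshold` and `binarize` are hypothetical, and 8-bit integer pixel values are assumed.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t that maximizes between-class variance.

    Pixels with value <= t form one class, the rest form the other.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0      # number of pixels in the lower class
    sum0 = 0    # sum of gray levels in the lower class
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue            # lower class still empty
        w1 = total - w0
        if w1 == 0:
            break               # upper class empty from here on
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        # Between-class variance, up to a constant factor of 1 / total**2.
        between = w0 * w1 * (mean0 - mean1) ** 2
        if between > best_var:
            best_t, best_var = t, between
    return best_t


def binarize(pixels, threshold):
    """Map each pixel to 0 (dark) or 1 (bright) using the threshold."""
    return [1 if p > threshold else 0 for p in pixels]


page = [10] * 50 + [200] * 50       # toy bimodal "image": ink and paper
t = otsu_threshold(page)
mask = binarize(page, t)
```

An adaptive scheme in the spirit of the abstract would apply such a threshold per window rather than once per image; how the window size is chosen is specific to the paper and is not reproduced here.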
A New Wavelet-Based Document Image Segmentation Scheme
5
Authors: 赵健, 李道京, 俞卞章, 耿军平. Journal of Systems Engineering and Electronics (SCIE, EI, CSCD), 2002, Issue 3, pp. 86-90, 5 pages.
Document image segmentation is very useful for printing, faxing, and data processing. An algorithm is developed for segmenting and classifying document images. The feature used for classification is based on the histogram distribution pattern of different image classes. The important attribute of the algorithm is its use of a wavelet correlation image to enhance the raw image's pattern, which improves classification accuracy. In this paper, a document image is divided into four types: background, photo, text, and graph. First, the document image background is separated easily using conventional methods. Second, the three remaining image types are distinguished by their typical histograms; to make the histogram features clearer, the HH wavelet subimage at each resolution is added to the raw image at that resolution. Finally, the photo, text, and graph regions are separated according to how well the features fit the Laplacian distribution by 2 and L. Simulations show that classification accuracy is significantly improved. Comparison with related work shows that our algorithm provides both lower classification error rates and better visual results.
Keywords: document image; segmentation; classification; wavelet; histogram
A Knife-edge Input Point Spread Function Estimation Method for Document Images
6
Author: Jianqiang Zhong. International Journal of Technology Management, 2016, Issue 3, pp. 50-52, 3 pages.
This paper presents progress on Point Spread Function (PSF) estimation for document images. It begins with an overview of PSF estimation methods and explains why the knife-edge input PSF estimation method was chosen. The next section details the knife-edge input PSF estimation method. A simulation experiment is then performed to verify the implemented PSF estimation method. Based on the simulation experiment, the following section proposes a procedure that makes automatic PSF estimation possible. A real document image is first taken as an example to illustrate the procedure and is then restored with the estimated PSF and the Lucy-Richardson deconvolution method; its OCR accuracy before and after deconvolution is compared. Finally, the paper concludes with an outlook on future work.
Keywords: Point Spread Function; document image; knife-edge input
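The core idea behind knife-edge PSF estimation is that the profile measured across a sharp edge (the edge spread function, ESF) differentiates to the line spread function (LSF), a one-dimensional cross-section of the blur. The sketch below simulates that step on a synthetic edge. It is an illustration under assumptions, not the paper's procedure: the helper names `blur` and `line_spread_function` are hypothetical, the blur kernel is chosen arbitrarily, and noise handling and edge localization are omitted.

```python
def blur(signal, kernel):
    """Blur a 1D signal with an odd-length kernel, replicating border samples."""
    r = len(kernel) // 2
    n = len(signal)
    return [
        sum(kernel[j] * signal[min(max(i + j - r, 0), n - 1)]
            for j in range(len(kernel)))
        for i in range(n)
    ]


def line_spread_function(esf):
    """Differentiate an edge profile and normalize the result to unit sum."""
    lsf = [esf[i + 1] - esf[i] for i in range(len(esf) - 1)]
    total = sum(lsf)
    return [v / total for v in lsf]


# Ideal knife edge: dark (0) on the left, bright (1) on the right.
ideal_edge = [0] * 5 + [1] * 5
true_kernel = [0.25, 0.5, 0.25]          # assumed blur, unknown in practice
observed_edge = blur(ideal_edge, true_kernel)
estimated = line_spread_function(observed_edge)
```

In this noise-free simulation, the non-zero part of `estimated` reproduces `true_kernel`, which is the kind of check a simulation experiment can use before applying the method to real scans.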
Radon CLF: A Novel Approach for Skew Detection Using Radon Transform
7
Authors: Yuhang Chen, Mahdi Bahaghighat, Aghil Esmaeili Kelishomi, Jingyi Du. Computer Systems Science & Engineering (SCIE, EI), 2023, Issue 10, pp. 675-697, 23 pages.
In the digital world, a wide range of handwritten and printed documents must be converted to digital format using a variety of tools, including mobile phones and scanners. Unfortunately, this is not an optimal procedure, and the entire document image might be degraded. Imperfect conversion effects due to noise, motion blur, and skew distortion can significantly affect the accuracy and effectiveness of document image segmentation and analysis in Optical Character Recognition (OCR) systems. In Document Image Analysis Systems (DIAS), skew estimation of images is a crucial step. In this paper, a novel, fast, and reliable skew detection algorithm based on the Radon Transform and a Curve Length Fitness Function (CLF), called Radon CLF, is proposed. The Radon CLF model aims to take advantage of the properties of Radon spaces. Radon CLF explores the dominant angle more effectively for a 1D signal than for a 2D input image, owing to an innovative fitness function formulation for a projected signal of the Radon space. Several performance indicators, including Mean Square Error (MSE), Mean Absolute Error (MAE), Peak Signal-to-Noise Ratio (PSNR), Structural Similarity Measure (SSIM), accuracy, and run-time, were considered when assessing the performance of the model. In addition, a new dataset named DSI5000 was constructed to assess the accuracy of the CLF model. Both the two-dimensional image signal and the Radon space were used in the simulations to compare the noise effect. The results show that the proposed method is more effective than existing approaches, with an accuracy of roughly 99.87% and a run-time of 0.048 s. The introduced model is far more accurate and time-efficient than current approaches in detecting image skew.
Keywords: document image analysis; skew detection; Radon transform; pattern recognition
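The general idea of scoring one-dimensional projections to find the dominant angle can be sketched as follows. This is a simplified stand-in, not the published Radon CLF algorithm: the curve-length score used here (arc length of the projection profile), the integer-degree search grid, and all function names are assumptions made for illustration. When the projection direction matches the text lines, the profile alternates between tall peaks and empty gaps, so its curve is longest.

```python
import math
from collections import Counter


def projection_profile(points, angle_deg):
    """Histogram of foreground points projected along direction `angle_deg`."""
    t = math.radians(angle_deg)
    bins = Counter(round(-x * math.sin(t) + y * math.cos(t)) for x, y in points)
    lo, hi = min(bins), max(bins)
    return [bins.get(s, 0) for s in range(lo, hi + 1)]


def curve_length(profile):
    """Arc length of the profile treated as a polyline with unit x-spacing."""
    return sum(math.hypot(1.0, b - a) for a, b in zip(profile, profile[1:]))


def estimate_skew(points, candidates):
    """Return the candidate angle whose projection profile has the longest curve."""
    return max(candidates,
               key=lambda a: curve_length(projection_profile(points, a)))


def synthetic_page(skew_deg, n_lines=4, spacing=10, width=100):
    """Foreground points of evenly spaced text lines rotated by `skew_deg`."""
    a = math.radians(skew_deg)
    return [
        (x * math.cos(a) - y * math.sin(a), x * math.sin(a) + y * math.cos(a))
        for y in range(0, n_lines * spacing, spacing)
        for x in range(width)
    ]


page = synthetic_page(5)
angle = estimate_skew(page, range(-15, 16))
```

On this synthetic page the search recovers the 5-degree rotation that was applied. Real scans would need binarization first and a finer angle grid; the accuracy and run-time figures in the abstract refer to the authors' implementation, not to this sketch.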
Document image retrieval based on multi-density features
8
Authors: HU Zhilan, LIN Xinggang, YAN Hong. Frontiers of Electrical and Electronic Engineering in China (CSCD), 2007, Issue 2, pp. 172-175, 4 pages.
The development of document image databases is becoming a challenge for document image retrieval techniques. Traditional layout-reconstruction-based methods rely on high-quality document images as well as optical character recognition (OCR) precision, and can only deal with several widely used languages. The complexity of document layouts greatly hinders layout analysis-based approaches. This paper describes a multi-density feature based algorithm for binary document images, which is independent of OCR and layout analysis. The text area is extracted after preprocessing such as skew correction and marginal noise removal. The aspect ratio and multi-density features are then extracted from the text area to select the best candidates from the document image database. Experimental results show that this approach is simple, with loss rates of less than 3%, and can efficiently analyze images with different resolutions and from different input systems. The system is also robust to noise arising from notes, complex layouts, etc.
Keywords: document image; image retrieval; skew correction; multi-density features
Guidelines for Creating a Rule-Based Knowledge Learning System and Their Application to a Chinese Business Card Layout Analysis
9
Authors: 潘武模, 王庆人. Journal of Computer Science & Technology (SCIE, EI, CSCD), 2001, Issue 1, pp. 47-56, 10 pages.
Rule selection has long been a challenging problem that has to be solved when developing a rule-based knowledge learning system. Many methods have been proposed to evaluate the eligibility of a single rule based on some criteria. However, a knowledge learning system usually contains a set of rules. These rules are not independent but interactive: they tend to affect each other and form a rule system. In such a case, it is no longer reasonable to isolate each rule from the others for evaluation. The best rule according to a certain criterion is not always the best one for the whole system. Furthermore, the real-world data from which people want to create their learning system are often ill-defined and inconsistent. In this case, the completeness and consistency criteria for rule selection are no longer essential. In this paper, some ideas about how to solve the rule-selection problem in a systematic way are proposed. These ideas have been applied in the design of a Chinese business card layout analysis system and achieved a good result on a training data set of 425 images. The implementation of the system and the result are presented in this paper.
Keywords: rule-based system; knowledge learning; layout analysis; document image understanding; business card
Extended Approach to Water Flow Algorithm for Text Line Segmentation
10
Author: Darko Brodić. Journal of Computer Science & Technology (SCIE, EI, CSCD), 2012, Issue 1, pp. 187-194, 8 pages.
This paper proposes a new approach to the water flow algorithm for text line segmentation. In the basic method, hypothetical water flows under a few specified angles, which are defined by the water flow angle parameter. It is applied to the document image frame from left to right and vice versa. As a result, unwetted and wetted areas are established. These areas separate text from non-text elements in each text line. Hence, they represent the control areas that are of major importance for text line segmentation. The extended approach primarily means extracting the connected components using bounding boxes over the text. In this way, the connected components are separated from one another, and the water flow angle, which defines the unwetted areas, is determined adaptively. Choosing an appropriate water flow angle lengthens the unwetted areas, which leads to better text line segmentation. The results of this approach are encouraging because they improve text line segmentation, which is the most challenging step in document image processing.
Keywords: document image analysis; text segmentation; region growing; smearing method; water flow algorithm