Funding: This research was supported and funded by the KAU Scientific Endowment, King Abdulaziz University, Jeddah, Saudi Arabia.
Abstract: A document layout can be more informative than merely a document's visual and structural appearance. Thus, document layout analysis (DLA) is considered a necessary prerequisite for advanced processing and detailed document image analysis across several applications and objectives. This research extends the traditional approaches to DLA and introduces the concept of semantic document layout analysis (SDLA) by proposing a novel framework for semantic layout analysis and characterization of handwritten manuscripts. The proposed SDLA approach enables the derivation of implicit information and semantic characteristics, which can be used in dozens of practical applications, in a way that bridges the semantic gap and provides more understandable high-level document image analysis and more invariant characterization via absolute and relative labeling. The approach is validated and evaluated on a large dataset of Arabic handwritten manuscripts comprising complex layouts. The experimental work shows promising results in terms of accurate and effective semantic characteristic-based clustering and retrieval of handwritten manuscripts. It also indicates the expected efficacy of the proposed approach in automating and facilitating many functional, real-life tasks, such as effort estimation and pricing for the transcription or typing of such complex manuscripts.
Abstract: Document images often contain various page components and complex logical structures, which make the document layout analysis task challenging. Most deep learning-based document layout analysis methods adopt convolutional neural networks (CNNs) as the feature extraction networks. In this paper, a hybrid spatial-channel attention network (HSCA-Net) is proposed to improve feature extraction capability by introducing an attention mechanism that explores more salient properties within document pages. HSCA-Net consists of a spatial attention module (SAM), a channel attention module (CAM), and a designed lateral attention connection. CAM adaptively adjusts channel feature responses by emphasizing selective information, depending on the contribution of the features of each channel. SAM guides CNNs to focus on informative contents and capture global context information among page objects. The lateral attention connection incorporates SAM and CAM into a multiscale feature pyramid network and thus retains the original feature information. The effectiveness and adaptability of HSCA-Net are evaluated through multiple experiments on publicly available datasets such as PubLayNet, ICDAR-POD, and Article Regions. Experimental results demonstrate that HSCA-Net achieves state-of-the-art performance on the document layout analysis task.
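The abstract does not spell out how CAM computes its channel weights. As a rough illustration of the general idea of channel attention only, the sketch below uses a squeeze-and-excitation style gate, which is an assumption and not necessarily the HSCA-Net design. The function name `channel_attention` and the weight matrices `w1` and `w2` are hypothetical.

```python
import math

def channel_attention(feat, w1, w2):
    """Reweight channels with a squeeze-and-excitation style gate (assumed design).

    feat: list of C channels, each a list of H rows of W floats.
    w1:   bottleneck weights, shape (C_reduced x C), hypothetical.
    w2:   expansion weights, shape (C x C_reduced), hypothetical.
    """
    # Squeeze: global average pooling gives one descriptor per channel.
    squeezed = [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in feat]
    # Excite: small bottleneck layer with ReLU ...
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    # ... followed by a sigmoid gate in (0, 1) for every channel.
    gates = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
             for row in w2]
    # Scale each channel by its gate, so informative channels are emphasized.
    scaled = [[[g * v for v in row] for row in ch] for g, ch in zip(gates, feat)]
    return scaled, gates
```

In a trained network the two weight matrices would be learned; here they are plain inputs so that the gating step can be inspected in isolation.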
Abstract: Rule selection has long been a challenging problem that must be solved when developing a rule-based knowledge learning system. Many methods have been proposed to evaluate the eligibility of a single rule based on some criteria. However, a knowledge learning system usually contains a set of rules. These rules are not independent but interactive: they tend to affect each other and form a rule system. In such a case, it is no longer reasonable to isolate each rule from the others for evaluation. The best rule according to a certain criterion is not always the best one for the whole system. Furthermore, the real-world data from which people want to create their learning system are often ill-defined and inconsistent. In this case, the completeness and consistency criteria for rule selection are no longer essential. In this paper, some ideas about how to solve the rule-selection problem in a systematic way are proposed. These ideas have been applied in the design of a Chinese business card layout analysis system and achieved a good result on a training data set of 425 images. The implementation of the system and the result are presented in this paper.
Funding: Project partially supported by the Ministry of Knowledge Economy (MKE) of Korea under the Information Technology Research Center (ITRC) Support Program, and the Basic Research Program of the Korea Science (No. R01-2006-000-11214-0).
Abstract: Previously we designed and implemented new image browsing facilities to support effective offline image contents on mobile devices with limited capabilities: low bandwidth, small display, and slow processing. In this letter, we automate the production of cartoon contents that fit a small-screen display, and introduce a clustering method useful for various types of cartoon images as a prerequisite stage for preserving semantic meaning. Neural networks are used to properly cut the various forms of pages. Texture information, which is useful for grayscale image segmentation, gives us a good clue for page layout analysis using the multilayer perceptron (MLP) based x-y recursive algorithm. We also automatically frame the segment MLP using agglomerative segmentation. Our experimental results show that the combined approaches yield good segmentation results for several cartoons.
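The x-y recursive algorithm mentioned above builds on the classic recursive X-Y cut. The sketch below shows only that classic form, splitting a binary page at blank gaps in its projection profiles; it is a simplification and does not include the MLP-based texture cues the letter describes. The names `xy_cut`, `_blank_runs`, and `min_gap` are illustrative.

```python
def _blank_runs(profile):
    """Return (start, length) for each run of zeros that is followed by ink."""
    runs, start = [], None
    for i, value in enumerate(profile):
        if value == 0 and start is None:
            start = i
        elif value != 0 and start is not None:
            runs.append((start, i - start))
            start = None
    return runs

def xy_cut(img, box=None, min_gap=1):
    """Split a binary image (rows of 0/1) into boxes (top, left, bottom, right).

    bottom and right are exclusive. A region is cut at its widest blank
    band of at least min_gap rows, otherwise columns, and each half is
    processed recursively until no such band remains.
    """
    if box is None:
        box = (0, 0, len(img), len(img[0]))
    top, left, bottom, right = box
    rows = [sum(img[y][left:right]) for y in range(top, bottom)]
    cols = [sum(img[y][x] for y in range(top, bottom)) for x in range(left, right)]
    if not any(rows):
        return []  # nothing but background in this region
    # Trim the region to the bounding box of its ink.
    ink_rows = [i for i, v in enumerate(rows) if v]
    ink_cols = [i for i, v in enumerate(cols) if v]
    t, b = top + ink_rows[0], top + ink_rows[-1] + 1
    l, r = left + ink_cols[0], left + ink_cols[-1] + 1
    rows = rows[ink_rows[0]:ink_rows[-1] + 1]
    cols = cols[ink_cols[0]:ink_cols[-1] + 1]
    for profile, horizontal in ((rows, True), (cols, False)):
        runs = [run for run in _blank_runs(profile) if run[1] >= min_gap]
        if runs:
            start, length = max(runs, key=lambda run: run[1])
            if horizontal:
                first = (t, l, t + start, r)
                second = (t + start + length, l, b, r)
            else:
                first = (t, l, b, l + start)
                second = (t, l + start + length, b, r)
            return xy_cut(img, first, min_gap) + xy_cut(img, second, min_gap)
    return [(t, l, b, r)]
```

On a toy page with two blocks side by side above a full-width block, the function returns three boxes: the two upper blocks and the lower one.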
Funding: This work is supported by the National Natural Science Foundation of China under Grant No. 60472002.
Abstract: In this paper, a visual similarity based document layout analysis (DLA) scheme is proposed, which uses a clustering strategy to deal adaptively with documents in different languages and with different layout structures and skew angles. Aiming at a robust and adaptive DLA approach, the authors first find a set of representative filters and statistics that characterize typical texture patterns in document images, through a visual similarity testing process. Texture features are then extracted with these filters and passed into a dynamic clustering procedure, called visual similarity clustering. Finally, text contents are located from the clustered results. Benefiting from this scheme, the algorithm demonstrates strong robustness and adaptability across a wide variety of documents, which previous traditional DLA approaches do not possess.
Abstract: To address the layout analysis problem for complex or irregular document images, a unified HMM-based layout analysis framework is presented in this paper. Based on the multi-resolution wavelet analysis results of the document image, we use the HMM method in both the inner-scale image model and the trans-scale context model to classify pixel region properties, such as text, picture, or background. In each scale, an HMM direct segmentation method is used to obtain a better inner-scale classification result. Another HMM method is then used to fuse the inner-scale results across scales and obtain a better final segmentation result. The optimized algorithm uses a stop rule in the coarse-to-fine multi-scale segmentation process, so the speed is improved remarkably. Experiments demonstrate the efficiency of the proposed algorithm.
Funding: Supported by the Korea Institute of Science and Technology Information (KISTI) through the Construction on Science & Technology Content Curation Program (K-20-L01-C01) and the National Research Foundation of Korea (NRF) under a grant funded by the Korean Government (MSIT) (No. NRF-2018R1C1B5031408).
Abstract: The volume of academic literature, such as academic conference papers and journals, has increased rapidly worldwide, and research on metadata extraction is ongoing. However, high-performing metadata extraction is still challenging because layout formats differ from one journal publisher to another. To accommodate the diversity of academic journal layouts, we propose a novel LAyout-aware Metadata Extraction (LAME) framework with three characteristics: the design of an automatic layout analysis, the construction of a large metadata training set, and the implementation of a metadata extractor. In the framework, we designed an automatic layout analysis using PDFMiner. Based on the layout analysis, a large volume of metadata-separated training data, including the title, abstract, author name, author affiliated organization, and keywords, was automatically extracted. Moreover, we constructed a pre-trained model, Layout-MetaBERT, to extract the metadata from academic journals with varying layout formats. In experiments, our metadata extractor exhibited robust performance (Macro-F1, 93.27%) in metadata extraction for unseen journals with different layout formats.
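For readers unfamiliar with the reported metric, Macro-F1 is the unweighted mean of the per-class F1 scores, so rare metadata classes count as much as frequent ones. The following is a minimal sketch of that definition; the function name `macro_f1` and the sample labels are illustrative and not taken from the LAME paper.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every label seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); guard against an empty class.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)

# Illustrative labels only: one "title" line is misread as "author".
truth = ["title", "title", "author", "abstract"]
guess = ["title", "author", "author", "abstract"]
print(round(macro_f1(truth, guess), 4))  # 0.7778
```

Here "title" and "author" each score 2/3 and "abstract" scores 1, giving a mean of 7/9.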
Funding: Supported by the National Natural Science Foundation of China (50405001).
Abstract: We study the effects of nozzle layout on the droplet ejection of a micro atomizer, which was fabricated with arrayed nozzles using MEMS technology and actuated by a piezoelectric disc. A theoretical model was first built for this piezoelectric-liquid-structure coupling system to characterize the acoustic wave propagation in the liquid chamber, which determines the droplet formation at the nozzles. Modal analysis was carried out numerically to predict resonant frequencies and simulate the corresponding pressure wave field. By comparing the amplitude contours of the pressure wave on the liquid-solid interface at the nozzle inlets with the designed nozzle layout, the behavior of the device under different vibration modes can be predicted. Experimentally, an impedance analyzer was used to measure the resonant frequencies of the system. Three types of atomizers with different nozzle layouts were fabricated to measure the effect of nozzle distribution on ejection performance. A visualization experiment of droplet generation was carried out, and the volume flow rates of these devices were measured. The good agreement between experiment and prediction showed that merely increasing the number of nozzles may not enhance droplet generation, and that designing the nozzle distribution from the viewpoint of frequency is necessary for a resonance-related atomizer.
Abstract: Inductive fault analysis is a technique for enumerating likely bridges that is limited by the weighted critical area computation. Based on the rectangle model of a real defect and mathematical morphology, an efficient algorithm is presented to compute the weighted critical area of a layout. The algorithm avoids the need to determine which rectangles belong to a net and to merge the critical areas corresponding to a net pair. Experimental results showing the algorithm's performance are presented.
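The morphological idea behind a bridging critical area can be sketched for the simplest case. Assuming a square defect of side `defect_size` and two axis-aligned rectangles on different nets (both assumptions for illustration), the defect centers that touch a rectangle form its dilation by half the defect size, and the critical area is the overlap of the two dilations. This is not the paper's algorithm, which handles whole layouts and weighting; the function names are hypothetical.

```python
def dilate(rect, defect_size):
    """Grow a rectangle (x0, y0, x1, y1) by half the defect size on every side."""
    x0, y0, x1, y1 = rect
    half = defect_size / 2.0
    return (x0 - half, y0 - half, x1 + half, y1 + half)

def intersection_area(a, b):
    """Area of overlap of two axis-aligned rectangles, 0.0 if they are disjoint."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    return width * height if width > 0 and height > 0 else 0.0

def bridge_critical_area(rect_a, rect_b, defect_size):
    """Area of defect centers where a square defect touches both rectangles."""
    return intersection_area(dilate(rect_a, defect_size),
                             dilate(rect_b, defect_size))

# Two vertical wires, 1 wide and 10 long, separated by a gap of 2.
wire_a = (0.0, 0.0, 1.0, 10.0)
wire_b = (3.0, 0.0, 4.0, 10.0)
print(bridge_critical_area(wire_a, wire_b, 3.0))  # 13.0
print(bridge_critical_area(wire_a, wire_b, 2.0))  # 0.0
```

A defect no larger than the gap cannot bridge the wires, so the critical area stays zero until the defect size exceeds the spacing.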
Abstract: As die size and complexity increase, accurate and efficient extraction of the critical area is essential for yield prediction. To eliminate the potential integration errors of the traditional shape shifting method, an improved shape shifting method is proposed for Manhattan layouts. Through mathematical analysis of how critical area depends on defect size, the critical area for all defect sizes is modeled as a piecewise quadratic polynomial function of defect size, which can be obtained by extracting the critical area for certain defect sizes. Because the improved method calculates critical areas for all defect sizes, rather than for several discrete values as in the traditional shape shifting method, it eliminates the integration error of the average critical area. Experiments on industrial layouts show that, compared with the traditional method, the improved shape shifting method can improve the accuracy of the average critical area calculation by 24.3% or reduce computational expense by about 59.7%.
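To see why a piecewise quadratic model removes the integration error, consider one segment on which the critical area is exactly quadratic in the defect size. Three samples determine the parabola, and its integral against a defect size density can then be evaluated in closed form instead of by numerical quadrature. The sketch below assumes a density of the form k / x^3, a common choice in yield modeling that the abstract does not state; `fit_quadratic` and `weighted_integral` are hypothetical names.

```python
import math

def fit_quadratic(p0, p1, p2):
    """Return (a, b, c) of the parabola a*x^2 + b*x + c through three (x, y) samples."""
    (x0, y0), (x1, y1), (x2, y2) = p0, p1, p2
    denom = (x0 - x1) * (x0 - x2) * (x1 - x2)
    a = (x2 * (y1 - y0) + x1 * (y0 - y2) + x0 * (y2 - y1)) / denom
    b = (x2**2 * (y0 - y1) + x1**2 * (y2 - y0) + x0**2 * (y1 - y2)) / denom
    c = (x1 * x2 * (x1 - x2) * y0 + x2 * x0 * (x2 - x0) * y1
         + x0 * x1 * (x0 - x1) * y2) / denom
    return a, b, c

def weighted_integral(coeffs, lo, hi, k):
    """Exact integral of (a*x^2 + b*x + c) * k / x^3 over [lo, hi], with 0 < lo < hi.

    The density k / x^3 is an assumed defect size distribution.
    """
    a, b, c = coeffs

    def antiderivative(x):
        # Integral of a/x + b/x^2 + c/x^3 with respect to x.
        return a * math.log(x) - b / x - c / (2.0 * x * x)

    return k * (antiderivative(hi) - antiderivative(lo))

# Critical area of two parallel wires (gap 2, length 10) sampled at three
# defect sizes; on this segment it equals (x - 2) * (10 + x).
coeffs = fit_quadratic((2.0, 0.0), (3.0, 13.0), (4.0, 28.0))
print(coeffs)  # (1.0, 8.0, -20.0)
print(weighted_integral(coeffs, 2.0, 4.0, k=2.0))
```

Summing such closed-form contributions over all segments would give the average critical area under the assumed density, with no error from sampling only a few discrete defect sizes.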