Pre-training transformer with dual-branch context content module for table detection in document images

Abstract Background: Document images such as statistical reports and scientific journals are widely used in information technology. Accurate detection of table areas in document images is an essential prerequisite for tasks such as information extraction. However, because tables vary widely in shape and size, existing table detection methods adapted from general object detection algorithms have not yet achieved satisfactory results, and incorrect detections can lead to the loss of critical information. Methods: We therefore propose a novel end-to-end trainable deep network that uses a self-supervised pre-trained transformer for feature extraction to minimize incorrect detections. To better handle table areas of different shapes and sizes, we add a dual-branch context content attention module (DCCAM) to the high-dimensional features to extract context content information, thereby enhancing the network's ability to learn shape features. For feature fusion at different scales, we replace the original 3×3 convolution with a multilayer residual module, which carries enhanced gradient flow information to improve feature representation and extraction capability. Results: We evaluated our method on public document datasets and compared it with previous methods; it achieved state-of-the-art results on evaluation metrics such as recall and F1-score. https://github.com/Yong Z-Lee/TD-DCCAM.
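The abstract reports recall and F1-score for table detection but does not state the matching protocol. As a rough illustration only, the sketch below computes precision, recall, and F1 for one page by greedily matching predicted boxes to ground-truth boxes. The IoU threshold of 0.5, the greedy one-to-one matching, and the function names `iou` and `detection_scores` are assumptions made for this example and are not taken from the paper.

```python
from typing import List, Tuple

# A box is (x_min, y_min, x_max, y_max) in pixel coordinates.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def detection_scores(
    preds: List[Box], truths: List[Box], thr: float = 0.5
) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) for one page.

    Assumption: each prediction is matched to the unmatched ground-truth
    table with the highest IoU, and counts as a true positive when that
    IoU is at least `thr`.
    """
    matched = set()
    true_pos = 0
    for p in preds:
        best_j, best_iou = -1, 0.0
        for j, t in enumerate(truths):
            if j in matched:
                continue
            overlap = iou(p, t)
            if overlap > best_iou:
                best_j, best_iou = j, overlap
        if best_j >= 0 and best_iou >= thr:
            matched.add(best_j)
            true_pos += 1
    precision = true_pos / len(preds) if preds else 0.0
    recall = true_pos / len(truths) if truths else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom > 0 else 0.0
    return precision, recall, f1
```

For example, with two predicted tables of which one coincides with a ground-truth table and two ground-truth tables in total, `detection_scores` gives precision, recall, and F1 of 0.5 each. Benchmarks for table detection often report these metrics at several IoU thresholds, so `thr` is left as a parameter.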

Authors: Yongzhi LI, Pengle ZHANG, Meng SUN, Jin HUANG, Ruhan HE

Affiliations: School of Computer Science and Artificial Intelligence; School of Computer Science; Hubei Provincial Engineering Research Center for Intelligent Textile and Fashion

Source: Virtual Reality & Intelligent Hardware (虚拟现实与智能硬件（中英文）), EI-indexed, 2024, Issue 5, pp. 408-420 (13 pages)

Keywords: Table detection; Document image analysis; Transformer; Dilated convolution; Deformable convolution; Feature fusion

Classification code: TP391.41 [Automation and Computer Technology: Computer Application Technology]
