
Automatic Reconstruction of Fragments (碎纸片自动拼接复原)

Cited by: 2
Abstract: To improve the efficiency of reassembling shredded paper and to protect information security, this paper proposes a fragment reconstruction algorithm that consists of two main stages: ink feature extraction and image matching. For fragments of English documents, an automatic reconstruction algorithm is proposed based on text baselines, an improved genetic algorithm (GA), and optical character recognition (OCR). The algorithm first groups fragments into rows, using the criterion that letters on the same line of text share nearly the same lower baseline. Reassembling the fragments within each group is then formulated as a traveling salesman problem (TSP) and solved with the improved GA together with OCR. Finally, the rows are assembled into a page according to the positions of their lower baselines, with a greedy algorithm as an aid. In addition, the algorithm is adapted to the characteristics of Chinese characters to obtain an automatic reconstruction algorithm for Chinese documents. MATLAB programs implementing these algorithms were tested on Chinese and English documents cut both horizontally and vertically; the results show that reconstruction is fully automatic and requires no manual intervention.
Source: Journal of Shantou University: Natural Science Edition (《汕头大学学报(自然科学版)》), 2018, No. 1, pp. 31-39, 48 (10 pages)
Keywords: fragments; automatic reconstruction; traveling salesman problem (TSP); improved genetic algorithm (GA); optical character recognition (OCR)
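
The first two steps named in the abstract (grouping fragments by lower baseline, then ordering the fragments of one row as a path problem) can be illustrated with a minimal Python sketch. This is not the authors' MATLAB implementation. It assumes each fragment is a small binary image (lists of rows, 1 = ink, 0 = paper), and it substitutes a plain greedy nearest-neighbour search for the paper's improved GA and OCR. All function names and the toy data are hypothetical.

```python
from typing import List, Sequence

Strip = List[List[int]]  # assumed representation: rows of 0 (paper) / 1 (ink)


def lower_baseline(strip: Strip) -> int:
    """Index of the lowest row that contains ink, or -1 for a blank strip."""
    for r in range(len(strip) - 1, -1, -1):
        if any(strip[r]):
            return r
    return -1


def group_by_baseline(strips: Sequence[Strip], tol: int = 0) -> List[List[int]]:
    """Group strip indices whose baselines differ by at most `tol`.

    Each strip is compared with the previous one in baseline order,
    so a group can drift by more than `tol` overall when tol > 0.
    """
    order = sorted(range(len(strips)), key=lambda i: lower_baseline(strips[i]))
    groups: List[List[int]] = []
    for i in order:
        b = lower_baseline(strips[i])
        if groups and b - lower_baseline(strips[groups[-1][-1]]) <= tol:
            groups[-1].append(i)
        else:
            groups.append([i])
    return groups


def edge_cost(left: Strip, right: Strip) -> int:
    """Number of rows where left's last column differs from right's first."""
    return sum(1 for a, b in zip(left, right) if a[-1] != b[0])


def order_row(strips: Sequence[Strip], indices: Sequence[int]) -> List[int]:
    """Greedy stand-in for the TSP step: start at the strip with the
    blankest left edge (assumed page margin), then repeatedly append
    the unused strip with the lowest edge cost."""
    remaining = list(indices)
    start = min(remaining, key=lambda i: sum(row[0] for row in strips[i]))
    path = [start]
    remaining.remove(start)
    while remaining:
        nxt = min(remaining, key=lambda j: edge_cost(strips[path[-1]], strips[j]))
        path.append(nxt)
        remaining.remove(nxt)
    return path


# Toy data: A, B, C are consecutive 2-column cuts of one text row; D lies lower.
A = [[0, 1], [0, 0], [0, 1], [0, 0]]
B = [[1, 0], [0, 1], [1, 1], [0, 0]]
C = [[0, 0], [1, 0], [1, 0], [0, 0]]
D = [[0, 0], [0, 0], [0, 0], [1, 1]]
strips = [C, A, B, D]                 # shuffled input

rows = group_by_baseline(strips)      # [[0, 1, 2], [3]]
print(order_row(strips, rows[0]))     # [1, 2, 0], i.e. A, B, C
```

A greedy chain is used here only because it is short. It can lock in an early wrong match, which is one reason a global search such as a genetic algorithm, optionally checked against OCR output, is a natural choice for the full problem.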
