Abstract
This paper proposes an automatic matching algorithm for regularly cut paper fragments, based on the text features of printed Chinese documents with a standard layout. The data in Attachment 3 of Problem B of the 2013 Contemporary Undergraduate Mathematical Contest in Modeling serve as the sample. The algorithm uses features such as text line height and line spacing to build a set of classification criteria that group the fragments by row. Ordering the fragments within each row is then formulated as a traveling salesman problem, and assembling the ordered rows into a page is formulated as a second traveling salesman problem. MATLAB and LINGO programs written according to the algorithm were used in matching experiments on regularly cut fragments, and the results indicate that the algorithm performs well.
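The row-ordering step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' MATLAB/LINGO code. It assumes each fragment is a small grid of grey levels, and it assumes a simple un-matching degree (the sum of absolute differences between adjacent edge columns), which the abstract does not define. The names `mismatch` and `order_row` are hypothetical, and the exhaustive search stands in for a proper traveling salesman solver, so it is only practical for a handful of fragments.

```python
from itertools import permutations


def mismatch(left_frag, right_frag):
    """Assumed un-matching degree: sum of absolute grey-level differences
    between the right edge of left_frag and the left edge of right_frag."""
    return sum(abs(row_l[-1] - row_r[0])
               for row_l, row_r in zip(left_frag, right_frag))


def order_row(fragments):
    """Return the left-to-right order (as indices) that minimizes the total
    un-matching degree, i.e. an open-path traveling salesman tour found by
    brute force over all permutations."""
    n = len(fragments)
    cost = [[mismatch(fragments[i], fragments[j]) for j in range(n)]
            for i in range(n)]
    best = min(permutations(range(n)),
               key=lambda p: sum(cost[a][b] for a, b in zip(p, p[1:])))
    return list(best)


# Toy data: a 3x6 grey-level "image" cut into three 3x2 strips, then shuffled.
image = [
    [0, 10, 20, 30, 40, 50],
    [5, 15, 25, 35, 45, 55],
    [9, 19, 29, 39, 49, 59],
]
strips = [[row[c:c + 2] for row in image] for c in (0, 2, 4)]
shuffled = [strips[2], strips[0], strips[1]]

order = order_row(shuffled)
print(order)  # indices into `shuffled`, read left to right
```

In this toy case the smooth gradient makes the correct adjacency the cheapest one, so the recovered order restores the original strips. Real scanned text has blank margins and near-identical edges, which is presumably why the paper first classifies fragments by row before solving the ordering problem.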
Source
Journal of Shantou University: Natural Science Edition (《汕头大学学报(自然科学版)》)
2014, No. 2, pp. 4-10, 59 (8 pages)
Funding
Supported by the Shantou University Youth Research Fund (YR13001)
Keywords
matching algorithm of regular fragments
un-matching degree
traveling salesman problem