Abstract
This paper describes a technique for automatically reassembling cross-cut shredded documents printed in Chinese characters. Each shred is first converted into a numeric matrix. The shreds are then sorted into groups using a "zebra stripe" pattern derived from the font height and line spacing. Next, a "double-edge splicing" method lets the computer join as many shreds as possible automatically. When adjacent shreds are spliced one by one, the left, right, top, and bottom edge vectors of each shred's matrix are treated as coordinate points in a multi-dimensional space. The Euclidean distances between the point of an already placed shred and the points of the shreds still to be placed are compared among the three kinds of distance, and the shred with the minimum distance is chosen as the splicing partner, giving the best fit between shreds.
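The edge-matching idea in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function names (`edge_vectors`, `stripe_pattern`, `best_right_neighbor`), the pixel convention (0 for ink, 255 for blank paper), and the restriction to matching a single pair of edges are all assumptions made for illustration. The abstract does not spell out how the double-edge method combines two edges or what the three kinds of distance are, so those parts are left out.

```python
import math


def edge_vectors(shred):
    """Return the left, right, top and bottom edge vectors of a shred.

    `shred` is a list of pixel rows; 0 = ink, 255 = paper (assumed convention).
    """
    return {
        "left": [row[0] for row in shred],
        "right": [row[-1] for row in shred],
        "top": list(shred[0]),
        "bottom": list(shred[-1]),
    }


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def stripe_pattern(shred, paper=255):
    """Mark each pixel row as text (1) or blank (0).

    A rough stand-in for the "zebra stripe" signature: shreds from the same
    line of text should show text bands and gaps at the same heights.
    """
    return [1 if any(p != paper for p in row) else 0 for row in shred]


def best_right_neighbor(placed, candidates):
    """Pick the candidate whose left edge is closest to `placed`'s right edge."""
    target = edge_vectors(placed)["right"]
    distances = [euclidean(target, edge_vectors(c)["left"]) for c in candidates]
    best = min(range(len(candidates)), key=distances.__getitem__)
    return best, distances[best]


# Tiny hand-made shreds (3 rows x 2 columns), purely illustrative.
placed = [[255, 0], [255, 255], [255, 0]]
blank = [[255, 255], [255, 255], [255, 255]]
match = [[0, 255], [255, 255], [0, 0]]

print(stripe_pattern(placed))                      # [1, 0, 1]
print(best_right_neighbor(placed, [blank, match]))  # (1, 0.0)
```

In this toy case the second candidate wins because its left edge is pixel-for-pixel identical to the right edge of the placed shred, so the distance is zero. On real scans the minimum distance is rarely zero, and grouping shreds by stripe pattern first keeps the candidate set small.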
Source
Automation & Instrumentation (《自动化与仪器仪表》)
2015, No. 10, pp. 51-53 (3 pages)
Keywords
Font height
Line spacing
Minimum Euclidean distance
Double-edge splicing