Abstract
Touching character strings exhibit complex patterns that are difficult to segment accurately with methods based on conventional image processing. To address this problem, a machine-learning-based method for segmenting touching strings is proposed. The method consists of two parts, training and segmentation: the cut positions between characters are learned from an example database, and for an input touching string a Markov random field network yields, for each point, the probability that it can serve as a cut point. A traditional image segmentation algorithm is then applied to the resulting probability map to determine the cut positions. Experimental results show that the algorithm segments simulated touching strings, overlapping strings, and real handwritten strings well.
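The last step of the pipeline, turning a probability map into cut positions, can be illustrated with a minimal sketch. The abstract does not say which image segmentation algorithm is used, so the column-projection and peak-picking rule below is a stand-in assumption, not the authors' method. The function names, the `threshold` and `min_gap` parameters, and the toy probability map are all hypothetical.

```python
def column_scores(prob_map):
    """Average the per-pixel cut probabilities in each column."""
    rows = len(prob_map)
    cols = len(prob_map[0])
    return [sum(prob_map[r][c] for r in range(rows)) / rows for c in range(cols)]


def pick_cut_columns(prob_map, threshold=0.5, min_gap=2):
    """Return columns that are local maxima of the column score, reach
    `threshold`, and lie at least `min_gap` columns after the previous cut.

    Assumption: a vertical cut per column; the paper may use another rule.
    """
    scores = column_scores(prob_map)
    cuts = []
    for c, s in enumerate(scores):
        left = scores[c - 1] if c > 0 else float("-inf")
        right = scores[c + 1] if c + 1 < len(scores) else float("-inf")
        is_peak = s >= left and s > right
        far_enough = not cuts or c - cuts[-1] >= min_gap
        if s >= threshold and is_peak and far_enough:
            cuts.append(c)
    return cuts


# Hypothetical 3x8 probability map: columns 2 and 6 look like cut positions.
toy_map = [
    [0.1, 0.1, 0.8, 0.1, 0.1, 0.2, 0.9, 0.1],
    [0.0, 0.2, 0.9, 0.2, 0.1, 0.1, 0.7, 0.0],
    [0.1, 0.1, 0.7, 0.1, 0.0, 0.3, 0.8, 0.1],
]
cuts = pick_cut_columns(toy_map)
print(cuts)
```

In a real system the probability map would come from inference on the trained Markov random field. A straight vertical cut is also a simplification that would likely be too coarse for overlapping strings.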
Source
《计算机工程》
CAS
CSCD
2013, Issue 4, pp. 258-262 (5 pages)
Computer Engineering
Keywords
string segmentation
touching string
machine learning
Markov random field
belief propagation
probability map