Abstract
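Extracting text from images is a key step in processing textual information. This paper proposes a segmentation method based on conditional random fields (CRFs). To handle complex backgrounds, features are carefully selected and a classifier is designed to compute the conditional probability distribution of pixel labels given the observed data. This avoids the problems that come with computing a joint distribution, as Markov random fields require. Compared with traditional classifiers, the method pays more attention to the strength and plausibility of the mutual influence among pixels and labels. Experimental results show that CRFs give markedly better text segmentation than the other methods compared.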
A key problem in text image analysis is extracting text accurately from image data. This paper handles text segmentation efficiently with conditional random fields (CRFs). The underlying idea is to define a conditional probability distribution over the label sequence given a particular observation sequence, which avoids modeling a joint distribution over both the label and observation sequences. Features are carefully selected and the classifier is designed according to Bayes' rule and the Markov property. After parameter training, test results show that CRFs outperform the other methods compared.
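The core idea of the abstract, normalizing over label configurations only while the observation stays fixed, can be illustrated with a toy example. The sketch below is not the paper's model: the one-dimensional pixel chain, the function name `crf_conditional`, the intensity threshold, and the weights `unary_w` and `pair_w` are all assumptions chosen for illustration, and the brute-force enumeration is only feasible for a handful of pixels.

```python
import itertools
import math


def crf_conditional(obs, unary_w=2.0, pair_w=0.8, threshold=0.5):
    """Return {labeling: p(labeling | obs)} for a short 1-D chain of pixels.

    obs holds pixel intensities in [0, 1]; label 1 = text, 0 = background.
    All weights and the threshold are illustrative, not taken from the paper.
    """
    n = len(obs)
    scores = {}
    for labels in itertools.product((0, 1), repeat=n):
        score = 0.0
        # Unary term: dark pixels favor the text label, bright ones background.
        for x, y in zip(obs, labels):
            score += unary_w * (threshold - x) if y == 1 else unary_w * (x - threshold)
        # Pairwise term: neighboring pixels are rewarded for sharing a label.
        for a, b in zip(labels, labels[1:]):
            if a == b:
                score += pair_w
        scores[labels] = score

    # The partition function sums over labelings only; obs is never modeled,
    # which is what distinguishes the conditional model from a joint one.
    z = sum(math.exp(s) for s in scores.values())
    return {labels: math.exp(s) / z for labels, s in scores.items()}


if __name__ == "__main__":
    dist = crf_conditional([0.1, 0.2, 0.9])
    best = max(dist, key=dist.get)
    print("most likely labeling:", best)
```

In a real segmenter the weights would be learned during parameter training and inference would use an approximate or dynamic-programming method instead of enumeration.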
Source
Modern Electronics Technique (《现代电子技术》)
2011, No. 7, pp. 150-154 (5 pages)
Funding
National Natural Science Foundation of China (60736007)
Natural Science Foundation of Shaanxi Province (2007F22)
Keywords
text image segmentation
conditional random fields
classifier design
parameter training