Abstract
Many types of forms contain handwritten numerals written inside printed frames. The frame lines provide useful reference information for locating and segmenting the numeral image, but they also raise the problem of removing the frame with as little damage as possible to the characters to be recognized. This paper studies the processing and recognition of framed numeral strings for monetary amounts and presents a general method. The frame lines are removed with mathematical morphology operations, preserving as many stroke pixels as possible where numerals touch the lines. The string is then segmented into individual numerals using the frame-line information and connected-component features. Each numeral is recognized with a support vector machine (SVM), and the best recognition result is selected by combining probabilistic statistics with prior knowledge. The method has been applied in a practical system with satisfactory results and has proved effective in banks.
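The first two stages described in the abstract (frame-line removal and segmentation) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a binary image stored as nested lists of 0/1, uses a plain opening with a straight line element to find the frame, and does not try to restore strokes that touch the lines, which the paper does address. The names `open_runs`, `remove_frame`, `components`, and the parameter `min_len` are hypothetical, and the SVM recognition stage is left out.

```python
from collections import deque

def open_runs(img, min_len, horizontal=True):
    """Opening with a straight line element of length min_len.

    Keeps only foreground pixels that lie in a horizontal (or vertical)
    run of at least min_len consecutive foreground pixels.
    """
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    outer, inner = (h, w) if horizontal else (w, h)
    for a in range(outer):
        start = None
        for b in range(inner + 1):  # one extra step closes a run at the border
            on = b < inner and (img[a][b] if horizontal else img[b][a])
            if on and start is None:
                start = b
            elif not on and start is not None:
                if b - start >= min_len:
                    for c in range(start, b):
                        if horizontal:
                            out[a][c] = 1
                        else:
                            out[c][a] = 1
                start = None
    return out

def remove_frame(img, min_len):
    """Delete every pixel that belongs to a long horizontal or vertical line."""
    hor = open_runs(img, min_len, True)
    ver = open_runs(img, min_len, False)
    return [[1 if p and not (a or b) else 0 for p, a, b in zip(r, rh, rv)]
            for r, rh, rv in zip(img, hor, ver)]

def components(img):
    """Bounding boxes (left, top, right, bottom) of 8-connected components,
    sorted left to right."""
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if not img[y][x] or seen[y][x]:
                continue
            seen[y][x] = True
            queue = deque([(y, x)])
            left, top, right, bottom = x, y, x, y
            while queue:
                cy, cx = queue.popleft()
                left, right = min(left, cx), max(right, cx)
                top, bottom = min(top, cy), max(bottom, cy)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and img[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((left, top, right, bottom))
    return sorted(boxes)
```

In this sketch `min_len` must be longer than any straight stroke inside a digit and no longer than the frame lines. Each resulting bounding box would then be cropped and passed to a classifier such as an SVM.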
Source
《清华大学学报(自然科学版)》
EI
CAS
CSCD
Peking University Core Journals (北大核心)
2002, Issue 3, pp. 387-390 (4 pages)
Journal of Tsinghua University (Science and Technology)
Funding
Supported by the National Natural Science Foundation of China (69775001)
Keywords
frame lines
handwritten numeral string processing system
pattern recognition
character segmentation
mathematical morphology
support vector machine