Abstract
Continuous handwriting recognition is central to Chinese handwriting input, and entering Chinese text naturally and quickly has long been a goal of pattern recognition and, more broadly, of artificial intelligence. This paper proposes a recognition method for continuous overlaid Chinese handwriting, in which characters are written one on top of another, that effectively overcomes the limits of small screens. The method follows an integrated segmentation-recognition decoding framework. First, an over-segmentation algorithm partitions the input handwriting trajectory. Then a novel perceptron algorithm locates the candidate character boundaries. Finally, multiple contexts drawn from the character classification model, the geometric model, and the language model are combined to decode the optimal recognition path. To suit different types of mobile terminals, a method for compressing the character classifier is also proposed, which reduces both the disk storage and the memory footprint of character recognition. The method has been deployed on the Android platform, enabling overlaid Chinese handwriting input on smartphones, and has been tested on a large set of overlaid Chinese handwriting samples. The experimental results confirm its effectiveness and efficiency. It also runs smoothly on smartphones, where overlaid writing makes handwriting input notably efficient.
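The path-decoding step described in the abstract can be illustrated with a minimal beam-search sketch. It is not the authors' implementation: the function `beam_decode`, the callbacks `classify`, `geometry`, and `language`, the parameters `max_merge`, `beam_width`, and `weights`, and the toy scores are all hypothetical, and the sketch assumes that the three model scores are log-scores combined as a weighted sum.

```python
import heapq
from typing import Callable, Dict, List, Tuple

# Hypothetical types and names, introduced only for illustration.
Candidate = Tuple[str, float]  # (character label, classifier log-score)


def beam_decode(
    n_segments: int,
    classify: Callable[[int, int], List[Candidate]],
    geometry: Callable[[int, int], float],
    language: Callable[[str, str], float],
    max_merge: int = 3,
    beam_width: int = 5,
    weights: Tuple[float, float, float] = (1.0, 1.0, 1.0),
) -> Tuple[str, float]:
    """Group consecutive primitive segments into characters and return
    the highest-scoring (text, score) found by beam search."""
    w_cls, w_geo, w_lm = weights
    # beams[j] keeps the best partial paths covering segments [0, j).
    beams: Dict[int, List[Tuple[float, str]]] = {0: [(0.0, "")]}
    for end in range(1, n_segments + 1):
        expanded: List[Tuple[float, str]] = []
        # A candidate character spans segments [start, end).
        for start in range(max(0, end - max_merge), end):
            geo = geometry(start, end)
            for score, text in beams.get(start, []):
                prev = text[-1] if text else "<s>"  # "<s>" marks the start
                for char, cls in classify(start, end):
                    total = (score + w_cls * cls + w_geo * geo
                             + w_lm * language(prev, char))
                    expanded.append((total, text + char))
        # Prune to the beam width; nlargest returns best-first order.
        beams[end] = heapq.nlargest(beam_width, expanded)
    best_score, best_text = beams[n_segments][0]
    return best_text, best_score


# Toy lattice: four primitive segments, two plausible two-segment characters.
toy_scores = {(0, 2): [("十", -0.5)], (2, 4): [("人", -0.4)]}


def toy_classify(start: int, end: int) -> List[Candidate]:
    # Unknown spans fall back to a low-scoring placeholder label.
    return toy_scores.get((start, end), [("?", -3.0)])


text, score = beam_decode(
    4, toy_classify,
    geometry=lambda s, e: 0.0,   # flat geometric score in this toy
    language=lambda p, c: 0.0,   # flat language score in this toy
    max_merge=2,
)
print(text, score)
```

In this toy lattice the decoder prefers grouping the four segments into two characters, because every alternative path passes through a low-scoring placeholder span. With real models, the weights would trade off the classifier, geometric, and language scores, and the beam width would bound the number of partial paths kept at each segment boundary.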
Source
Computer Science (《计算机科学》)
CSCD
Peking University Core Journal
2015, Issue 7, pp. 300-304 (5 pages)
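Funding
National Natural Science Foundation of China (61203260)
Heilongjiang Postdoctoral Fund (LBH-Q13066)
Harbin Institute of Technology Research Innovation Fund (HIT.NSRIF.2015083)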
Keywords
Pattern recognition
Overlaid Chinese handwriting
Stroke classification
Classifier compression
Beam search