Abstract
This paper presents a fault-tolerant coding method for Chinese characters, intended for machine character recognition. The coding combines coarse statistical classification with fine structural classification, defines a set of Chinese character structural elements that are easy for a machine to recognize, and gives rules for determining the order of stroke elements. Codes for 37 classes of substructures are constructed, together with redundant fault-tolerant codes. Characters prone to duplicate or erroneous codes are specifically distinguished, and coding rules and a dictionary are established that imitate the way people compose characters. Simulation experiments show that the coding method is easy for a machine to recognize, is fault tolerant, and has low rejection and misrecognition rates.
Source
《微型电脑应用》
2006, No. 7, pp. 7-10, 4 (4 pages in total)
Microcomputer Applications
Keywords
Chinese character coding
Character elements
Chinese character features
Fault tolerance