Abstract
This paper proposes a modified Adaboost algorithm for handwritten Chinese character recognition, a classification problem with a very large number of classes. The algorithm uses a multi-class classifier based on a descriptive model, the modified quadratic discriminant function (MQDF), as its element classifier. Because MQDF performs multi-class classification directly, the multi-class problem need not be converted into multiple binary problems, and the training complexity is much lower than that of existing multi-class Adaboost algorithms. Sample weights are updated according to the generalized confidence, which is simple and effective; experiments show that this approach suits large-scale multi-class classification. To reduce recognition complexity, a pruning step selects the single best element classifier from all boosted element classifiers and uses it as the final classifier. In experiments on the HCL2000 and THOCR-HCD datasets, the proposed algorithm reduced the relative error rate by 14.3%, 8.1%, and 19.5%, respectively, compared with the best existing algorithm.
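The three steps named in the abstract (a multi-class element classifier, confidence-driven reweighting, and pruning to a single classifier) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: a weighted nearest-class-mean classifier stands in for MQDF, the `margin` formula is a hypothetical confidence measure rather than the paper's generalized confidence, and the names `train_means`, `margin`, `boost`, and `prune` do not come from the source.

```python
import math

def train_means(X, y, w):
    """Weighted class means: a simple stand-in for an MQDF element classifier."""
    sums, totals = {}, {}
    for x, label, wi in zip(X, y, w):
        s = sums.setdefault(label, [0.0] * len(x))
        for j, v in enumerate(x):
            s[j] += wi * v
        totals[label] = totals.get(label, 0.0) + wi
    return {c: [v / totals[c] for v in s] for c, s in sums.items()}

def distances(means, x):
    """Euclidean distance from x to every class mean."""
    return {c: math.dist(m, x) for c, m in means.items()}

def predict(means, x):
    """Multi-class decision made directly: pick the nearest class mean."""
    d = distances(means, x)
    return min(d, key=d.get)

def margin(means, x, label):
    """Assumed confidence in [-1, 1]; positive when the true class is nearest."""
    d = distances(means, x)
    d_true = d[label]
    d_rival = min(v for c, v in d.items() if c != label)
    return (d_rival - d_true) / (d_rival + d_true + 1e-12)

def boost(X, y, rounds=5):
    """Train one element classifier per round, reweighting samples in between."""
    n = len(X)
    w = [1.0 / n] * n
    models = []
    for _ in range(rounds):
        means = train_means(X, y, w)
        models.append(means)
        # Low-confidence samples gain weight (assumed update rule).
        w = [wi * math.exp(-margin(means, x, label))
             for x, label, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return models

def error_rate(means, X, y):
    return sum(predict(means, x) != label for x, label in zip(X, y)) / len(X)

def prune(models, X_val, y_val):
    """Keep only the single element classifier with the lowest error."""
    return min(models, key=lambda m: error_rate(m, X_val, y_val))

# Toy data: three well-separated 2-D classes (illustrative only).
X = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5), (0, 5), (1, 6), (0, 6)]
y = ["a"] * 3 + ["b"] * 3 + ["c"] * 3
models = boost(X, y, rounds=5)
best = prune(models, X, y)
```

Because pruning keeps one element classifier, recognition costs the same as running a single base classifier, which is the motivation the abstract gives for that step. A real evaluation would select the classifier on held-out data rather than on the training set as this toy example does.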
Source
《中国工程科学》
2009, No. 10, pp. 19-24, 31 (7 pages)
Strategic Study of CAE
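Funding
National Natural Science Foundation of China (60472002)
National High Technology Research and Development Program of China ("863" Program) (2006AA01Z115)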
Keywords
multiclass Adaboost algorithm
Chinese handwritten character recognition
generalized confidence
modified quadratic discriminant function