Abstract
Most multiclass Adaboost algorithms have training complexity too high for classification problems with very large numbers of classes, such as handwritten Chinese character recognition. To address this, the paper proposes a new modified multiclass Adaboost algorithm. It uses a descriptive-model-based multiclass classifier, the modified quadratic discriminant function (MQDF) classifier, as the element classifier. Because an MQDF classifier discriminates among all classes directly, the multiclass problem does not have to be converted into multiple binary problems, which greatly reduces training complexity. The algorithm also updates sample weights according to the generalized confidence, an approach that the experiments show to be simple and effective. To reduce recognition complexity, the boosted set of element classifiers is pruned so that only the single best element classifier is kept as the final classifier. In experiments on the HCL2000 and THOCR-HCD databases, the algorithm reduced the relative error rate by 14.3%, 8.1%, and 19.5%, respectively, compared with existing methods.
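The training loop described in the abstract (boost a directly multiclass element classifier, reweight samples by a confidence measure, then keep only the best element classifier) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's method: a weighted nearest-class-mean classifier stands in for MQDF, the `signed_confidence` formula and the exponential weight update are hypothetical because the abstract does not give the paper's definitions, and pruning here simply picks the lowest training error. All function names are invented for this sketch.

```python
import math


def fit_weighted_means(X, y, w, n_classes):
    """Stand-in element classifier (NOT MQDF): one weighted mean per class."""
    dim = len(X[0])
    sums = [[0.0] * dim for _ in range(n_classes)]
    totals = [0.0] * n_classes
    for xi, yi, wi in zip(X, y, w):
        totals[yi] += wi
        for j in range(dim):
            sums[yi][j] += wi * xi[j]
    return [[s / max(t, 1e-12) for s in row] for row, t in zip(sums, totals)]


def distances(means, x):
    """Euclidean distance from x to every class mean."""
    return [math.dist(m, x) for m in means]


def predict(means, x):
    """Classify among all classes directly; no binary decomposition."""
    d = distances(means, x)
    return d.index(min(d))


def signed_confidence(means, x, label):
    """Assumed confidence in [-1, 1]: positive when the true class is nearest.

    Hypothetical definition; the paper's generalized confidence may differ.
    """
    d = distances(means, x)
    d_true = d[label]
    d_rival = min(v for k, v in enumerate(d) if k != label)
    return (d_rival - d_true) / (d_rival + d_true + 1e-12)


def error_rate(means, X, y):
    """Unweighted fraction of misclassified samples."""
    return sum(predict(means, xi) != yi for xi, yi in zip(X, y)) / len(X)


def boost_and_prune(X, y, n_classes, rounds=5):
    """Train `rounds` element classifiers, then keep the single best one."""
    n = len(X)
    w = [1.0 / n] * n
    pool = []
    for _ in range(rounds):
        means = fit_weighted_means(X, y, w, n_classes)
        pool.append(means)
        # Assumed update: low-confidence samples gain weight, then renormalize.
        w = [wi * math.exp(-signed_confidence(means, xi, yi))
             for xi, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    # Assumed pruning criterion: lowest training error among the pool.
    best = min(pool, key=lambda m: error_rate(m, X, y))
    return best, pool


# Toy three-class data in 2-D, purely illustrative.
X = [(0.0, 0.0), (0.2, 0.1), (1.0, 1.0), (1.1, 0.9), (2.0, 0.0), (2.1, 0.2)]
y = [0, 0, 1, 1, 2, 2]
best, pool = boost_and_prune(X, y, n_classes=3, rounds=4)
```

Because only `best` is used at recognition time, the cost of classifying a character is that of a single element classifier, which is the motivation the abstract gives for pruning.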
Source
《高技术通讯》
EI
CAS
CSCD
PKU Core Journal
2009, No. 4, pp. 331-336 (6 pages)
Chinese High Technology Letters
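Funding
National Natural Science Foundation of China (60472002)
National High Technology Research and Development Program of China (863 Program) (2006AA01Z115)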
Keywords
multiclass Adaboost algorithm
handwritten Chinese character recognition
generalized confidence
modified quadratic discriminant function (MQDF)