Abstract
The ability to learn incrementally from batches of data is an important factor in whether a learning algorithm is suitable for real-world problems. Incremental learning keeps the time and memory consumption of a learning algorithm at a manageable level, and it has been widely used on large-scale datasets. For text classification, this paper sets out the general issues that an incremental learning algorithm must address. Building on the DragPush strategy, it then proposes an incremental learning algorithm for text classification, named ICCDP, and uses ICCDP to analyze those issues. Experiments show that ICCDP trains quickly and classifies accurately, giving it high practical value.
Source
《中文信息学报》
CSCD
Peking University Core Journals
2008, No. 1, pp. 37-43 (7 pages)
Journal of Chinese Information Processing
Funding
National 973 Program of China (2004CB318109, 2007CB311100)
Keywords
computer application
Chinese information processing
incremental learning
DragPush
text classification
centroid method