Abstract
Automatic text categorization is an important research direction in information processing. To address two kinds of error, namely texts assigned to the wrong category and texts whose similarity to every category is too low for a suitable category to be found, this paper proposes an iterative learning algorithm. The algorithm uses the vectors of misclassified texts to raise or lower the weights of the corresponding category vectors, which corrects the classification errors and improves classification accuracy. The final result is a more precise category description vector and a better classifier.
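The idea in the abstract can be illustrated with a minimal sketch. The abstract does not give the update rule, so everything concrete here is an assumption made for illustration: cosine similarity as the similarity measure, a rejection `threshold` for the "similarity too low" case, a fixed learning `rate`, and the hypothetical function names `cosine`, `classify`, and `train`. It is not the paper's actual algorithm.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def classify(class_vecs, x, threshold):
    """Return the most similar category, or None if similarity is too low."""
    best_label, best_sim = None, -1.0
    for label, c in class_vecs.items():
        s = cosine(c, x)
        if s > best_sim:
            best_label, best_sim = label, s
    return best_label if best_sim >= threshold else None

def train(class_vecs, samples, threshold=0.3, rate=0.5, max_iters=50):
    """Iteratively adjust category vectors using misclassified text vectors.

    Assumed rule: on an error, raise the weights of the true category by
    rate * x; if a wrong category was chosen, lower its weights by rate * x
    (clamped at zero). Stop once a full pass produces no errors.
    """
    vecs = {k: list(v) for k, v in class_vecs.items()}
    for _ in range(max_iters):
        errors = 0
        for x, label in samples:
            pred = classify(vecs, x, threshold)
            if pred == label:
                continue
            errors += 1
            # Wrong category or rejected text: pull the true category toward x.
            vecs[label] = [c + rate * a for c, a in zip(vecs[label], x)]
            if pred is not None:
                # Wrong category: push the mistaken category away from x.
                vecs[pred] = [max(0.0, c - rate * a)
                              for c, a in zip(vecs[pred], x)]
        if errors == 0:
            break
    return vecs
```

For example, starting from deliberately swapped category vectors such as `{"a": [0.0, 1.0], "b": [1.0, 0.0]}` and the samples `([1.0, 0.0], "a")` and `([0.0, 1.0], "b")`, repeated passes move each category vector toward its own texts until both samples are classified correctly.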
Source
《电脑开发与应用》
2004, No. 2, pp. 5-6 (2 pages)
Computer Development & Applications
Funding
Supported by the Shanxi Province Fund for Returned Overseas Scholars (2002004)
Keywords
iterative learning
text classifier
information processing
metric function
feature extraction
text categorization, categorization system, automatic categorization, iterative learning algorithm, classifier