Abstract
Text classification algorithms are commonly based on vector comparison. Because a text contains many keywords, the resulting feature vectors have very high dimensionality, which makes the comparison step computationally expensive; lowering the dimensionality, however, inevitably discards useful information. Applying rough set theory to the classification process can effectively resolve this problem.
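The abstract does not give the paper's actual procedure, so the following is only a minimal sketch of how rough set attribute reduction can shrink a keyword feature space without losing the ability to separate classes. It assumes binary keyword-presence features and uses a greedy, QuickReduct-style search; the function names, the keyword columns, and the toy documents are all hypothetical illustrations, not taken from the paper.

```python
from collections import defaultdict


def dependency(rows, labels, attrs):
    """Rough set dependency degree: the fraction of rows whose class label
    is fully determined by their values on the attributes in `attrs`."""
    classes_seen = defaultdict(set)   # attribute values -> set of labels
    block_size = defaultdict(int)     # attribute values -> number of rows
    for row, label in zip(rows, labels):
        key = tuple(row[a] for a in attrs)
        classes_seen[key].add(label)
        block_size[key] += 1
    # Rows in blocks with a single label form the positive region.
    positive = sum(block_size[k] for k, ls in classes_seen.items() if len(ls) == 1)
    return positive / len(rows)


def quick_reduct(rows, labels):
    """Greedily add attributes until the subset discriminates the classes
    as well as the full attribute set does (a heuristic, not guaranteed
    to find the smallest possible reduct)."""
    all_attrs = list(range(len(rows[0])))
    target = dependency(rows, labels, all_attrs)
    chosen = []
    while dependency(rows, labels, chosen) < target:
        best = max(
            (a for a in all_attrs if a not in chosen),
            key=lambda a: dependency(rows, labels, chosen + [a]),
        )
        chosen.append(best)
    return sorted(chosen)


# Hypothetical toy data: columns are keyword indicators
# (goal, match, stock, market); each row is one document.
ROWS = [
    (1, 1, 0, 0),
    (1, 0, 0, 1),
    (0, 1, 1, 1),
    (0, 0, 0, 0),
    (1, 1, 0, 1),
    (0, 1, 1, 0),
]
LABELS = ["sports", "sports", "finance", "finance", "sports", "finance"]

reduct = quick_reduct(ROWS, LABELS)
print("attributes kept:", reduct)
```

In this toy table a single keyword column is enough to determine every label, so the other three columns can be dropped before any vector comparison takes place. Real corpora would first need their term weights discretized, which this sketch leaves out.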
Source
《重庆科技学院学报(自然科学版)》
CAS
2009, Issue 4, pp. 166-168 (3 pages)
Journal of Chongqing University of Science and Technology:Natural Sciences Edition
Keywords
text classification
rough set
attribute reduction
text classification algorithm