Abstract
This paper uses a term context vector to represent the contextual relations between a term and the other terms in a text collection. From these term context vectors it generates a class feature vector for every class in the classifier, as well as a feature vector for the document to be classified; the classifier then assigns the document to a class. Experiments show that incorporating term context relations into the class feature vectors and document vectors helps improve text classification performance.
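The pipeline outlined in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the context vectors are built or which classifier is used, so everything here is an assumption made for illustration: windowed co-occurrence counts as the term context vector, summation to form document and class vectors, cosine similarity as the decision rule, and the toy data and all function names (`term_context_vectors`, `text_vector`, `classify`).

```python
from collections import Counter, defaultdict
import math


def term_context_vectors(docs, window=2):
    """Assumed definition: count the terms co-occurring within `window` positions."""
    ctx = defaultdict(Counter)
    for tokens in docs:
        for i, term in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    ctx[term][tokens[j]] += 1
    return ctx


def text_vector(tokens, ctx):
    """Combine each token's own count with its context vector (assumed weighting)."""
    vec = Counter()
    for term in tokens:
        vec[term] += 1
        vec.update(ctx.get(term, {}))  # adds the context counts, if any
    return vec


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(v * b.get(k, 0) for k, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def classify(tokens, class_vectors, ctx):
    """Return the class whose feature vector is most similar to the document."""
    vec = text_vector(tokens, ctx)
    return max(class_vectors, key=lambda c: cosine(vec, class_vectors[c]))


# Hypothetical toy training set, already tokenized.
train = {
    "sports": [["football", "match", "goal"], ["team", "match", "score"]],
    "finance": [["stock", "market", "price"], ["bank", "market", "loan"]],
}
all_docs = [doc for docs in train.values() for doc in docs]
ctx = term_context_vectors(all_docs)

# Class feature vector: sum of the context-enriched vectors of its documents.
class_vectors = {
    label: sum((text_vector(doc, ctx) for doc in docs), Counter())
    for label, docs in train.items()
}

print(classify(["goal", "score"], class_vectors, ctx))
```

In this toy setup the query terms "goal" and "score" never appear in the same training document, yet their context vectors share "match", which is the kind of contextual evidence the abstract argues is useful.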
Source
《现代图书情报技术》
CSSCI
PKU Core Journal (北大核心)
2008, No. 5, pp. 44-49 (6 pages)
New Technology of Library and Information Service
Keywords
Automatic text classification (文本自动分类)
Context (上下文)
Term context vector (词上下文向量)