Abstract
With the rapid growth of information on the Web, information processing has become an indispensable tool for obtaining useful information. Automatic text categorization, an important research direction in information processing, is the process of automatically assigning natural-language texts to categories in a predefined classification scheme based on their content. This paper examines several key techniques in text categorization, including the vector space model (VSM), feature extraction, and machine learning methods. It also proposes the structure of a VSM-based text categorization system and reports its evaluation and results.
Source
《计算机应用研究》
CSCD
Peking University Core Journal (北大核心)
2001, Issue 9, pp. 23-26 (4 pages)
Application Research of Computers