Abstract
The key to text categorization is reducing the dimensionality of the high-dimensional feature set. The main approaches to dimensionality reduction are feature selection and feature extraction. This paper surveys existing feature selection and feature extraction methods and evaluates their advantages, disadvantages, and scope of application.
Source
《情报学报》
CSSCI
Peking University Core Journals (北大核心)
2005, No. 6, pp. 690-695 (6 pages)
Journal of the China Society for Scientific and Technical Information
Funding
Project funded by the Education Department of Zhejiang Province
Keywords
text categorization
feature reduction
feature selection
feature extraction