Abstract
Feature extraction is a key technique in text categorization. This paper introduces three feature selection methods, document frequency (DF), information gain (IG), and the χ² statistic (CHI), and studies and compares them experimentally. The conclusions lay a theoretical and practical foundation for future research on how feature selection methods affect automated English essay scoring based on text categorization.
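As a rough illustration of the three measures named in the abstract, the following minimal sketch computes DF, IG, and CHI for a single term using the standard textbook definitions. The abstract does not give the formulas or data the paper used, so the function names, the toy corpus, and the "high"/"low" class labels are all hypothetical.

```python
import math
from collections import Counter

# Hypothetical toy corpus: (set of words in the document, class label).
DOCS = [
    ({"good", "essay"}, "high"),
    ({"good", "clear"}, "high"),
    ({"weak", "essay"}, "low"),
    ({"weak", "short"}, "low"),
]

def doc_frequency(docs, term):
    """DF: number of documents that contain the term."""
    return sum(1 for words, _ in docs if term in words)

def _entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((k / n) * math.log2(k / n) for k in Counter(labels).values())

def information_gain(docs, term):
    """IG: class entropy minus the entropy left after splitting on term presence."""
    labels = [c for _, c in docs]
    with_t = [c for words, c in docs if term in words]
    without_t = [c for words, c in docs if term not in words]
    n = len(docs)
    return (_entropy(labels)
            - (len(with_t) / n) * _entropy(with_t)
            - (len(without_t) / n) * _entropy(without_t))

def chi_square(docs, term, cls):
    """CHI: chi-square statistic from the 2x2 term/class contingency table."""
    a = sum(1 for w, c in docs if term in w and c == cls)      # term present, in class
    b = sum(1 for w, c in docs if term in w and c != cls)      # term present, other class
    c_ = sum(1 for w, c in docs if term not in w and c == cls)  # term absent, in class
    d = sum(1 for w, c in docs if term not in w and c != cls)  # term absent, other class
    n = a + b + c_ + d
    denom = (a + c_) * (b + d) * (a + b) * (c_ + d)
    return 0.0 if denom == 0 else n * (a * d - c_ * b) ** 2 / denom

if __name__ == "__main__":
    for term in ("good", "essay"):
        print(term,
              doc_frequency(DOCS, term),
              information_gain(DOCS, term),
              chi_square(DOCS, term, "high"))
```

In this toy corpus "good" and "essay" have the same document frequency, but only "good" separates the two classes, so IG and CHI rank it higher while DF cannot tell the two terms apart.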
Source
Computer Knowledge and Technology (《电脑知识与技术(过刊)》, back issues)
2009, Issue 7X, pp. 5425-5426 (2 pages)
Keywords
text categorization
feature extraction
SVM